Hussey, Michael A; Koch, Gary G; Preisser, John S; Saville, Benjamin R
2016-01-01
Time-to-event or dichotomous outcomes in randomized clinical trials are often analyzed with the Cox proportional hazards model or conditional logistic regression, respectively, to obtain covariate-adjusted log hazard (or odds) ratios. Nonparametric Randomization-Based Analysis of Covariance (NPANCOVA) can be applied to unadjusted log hazard (or odds) ratios estimated from a model containing treatment as the only explanatory variable. The resulting covariate-adjusted estimates are stratified population-averaged treatment effects; they require only a valid randomization to the two treatment groups and avoid key modeling assumptions (e.g., proportional hazards in the case of a Cox model) for the adjustment variables. The methodology therefore has application in the regulatory environment, where such assumptions cannot be verified a priori. Application of the methodology is illustrated through three examples on real data from two randomized trials.
A Nonparametric Prior for Simultaneous Covariance Estimation
Gaskins, Jeremy T.; Daniels, Michael J.
2013-01-01
In the modeling of longitudinal data from several groups, appropriate handling of the dependence structure is of central importance. Standard methods include specifying a single covariance matrix for all groups or independently estimating the covariance matrix for each group without regard to the others, but when these model assumptions are incorrect, these techniques can lead to biased mean effects or loss of efficiency, respectively. Thus, it is desirable to develop methods that simultaneously estimate the covariance matrix for each group while borrowing strength across groups in a way that is ultimately informed by the data. In addition, for several groups with covariance matrices of even medium dimension, it is difficult to manually select a single best parametric model among the huge number of possibilities given by incorporating structural zeros and/or commonality of individual parameters across groups. In this paper we develop a family of nonparametric priors using the matrix stick-breaking process of Dunson et al. (2008) that seeks to accomplish this task by parameterizing the covariance matrices in terms of the parameters of their modified Cholesky decomposition (Pourahmadi, 1999). We establish some theoretical properties of these priors, examine their effectiveness via a simulation study, and illustrate the priors using data from a longitudinal clinical trial.
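The modified Cholesky parameterization referenced above maps a covariance matrix to unconstrained regression coefficients (GARPs) and innovation variances, which is what makes it a convenient target for priors. A minimal numpy sketch of the decomposition (illustrative names, not the authors' code):

```python
import numpy as np

def modified_cholesky(sigma):
    """Pourahmadi (1999) parameterization: find unit lower-triangular T
    and diagonal D with T @ sigma @ T.T = D. Row j of T holds the
    negated coefficients from regressing component j on components < j;
    D holds the innovation variances."""
    p = sigma.shape[0]
    T = np.eye(p)
    d = np.empty(p)
    d[0] = sigma[0, 0]
    for j in range(1, p):
        phi = np.linalg.solve(sigma[:j, :j], sigma[:j, j])  # GARPs
        T[j, :j] = -phi
        d[j] = sigma[j, j] - sigma[:j, j] @ phi             # innovation variance
    return T, np.diag(d)

# sanity check: the decomposition is exact and invertible
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
sigma = A @ A.T
T, D = modified_cholesky(sigma)
assert np.allclose(T @ sigma @ T.T, D)
Tinv = np.linalg.inv(T)
assert np.allclose(Tinv @ D @ Tinv.T, sigma)                # rebuilds sigma
```

Because the GARPs are unconstrained and the innovation variances need only be positive, priors can be placed on them without worrying about positive-definiteness of the implied covariance matrix.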
A Nonparametric Analogy of Analysis of Covariance
ERIC Educational Resources Information Center
Burnett, Thomas D.; Barr, Donald R.
1977-01-01
A nonparametric test of the hypothesis of no treatment effect is suggested for a situation where measures of the severity of the condition treated can be obtained and ranked both pre- and post-treatment. The test allows the pre-treatment rank to be used as a concomitant variable.
Janes, Holly; Pepe, Margaret S
2009-06-01
Recent scientific and technological innovations have produced an abundance of potential markers that are being investigated for their use in disease screening and diagnosis. In evaluating these markers, it is often necessary to account for covariates associated with the marker of interest. Covariates may include subject characteristics, expertise of the test operator, test procedures or aspects of specimen handling. In this paper, we propose the covariate-adjusted receiver operating characteristic curve, a measure of covariate-adjusted classification accuracy. Nonparametric and semiparametric estimators are proposed, asymptotic distribution theory is provided and finite sample performance is investigated. For illustration we characterize the age-adjusted discriminatory accuracy of prostate-specific antigen as a biomarker for prostate cancer.
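For a discrete covariate, the covariate-adjusted ROC curve can be estimated nonparametrically by standardizing each case's marker value to the control distribution within its covariate stratum and taking the empirical distribution of the resulting placement values. A rough sketch with simulated data (the function name, the data-generating choices, and the three-stratum covariate are assumptions for illustration, not the authors' implementation):

```python
import numpy as np

def covariate_adjusted_roc(y_case, z_case, y_ctrl, z_ctrl, fpr_grid):
    """AROC(t) = P(placement value of a case <= t), where a case's
    placement value is its marker's right-tail proportion among
    controls sharing its covariate level."""
    pv = np.array([np.mean(y_ctrl[z_ctrl == z] >= y)
                   for y, z in zip(y_case, z_case)])
    aroc = np.array([np.mean(pv <= t) for t in fpr_grid])
    aauc = 1.0 - pv.mean()                  # area under the adjusted curve
    return aroc, aauc

rng = np.random.default_rng(1)
z_ctrl = rng.integers(0, 3, 500)            # e.g., three age groups
z_case = rng.integers(0, 3, 200)
y_ctrl = rng.normal(z_ctrl, 1.0)            # marker level shifts with covariate
y_case = rng.normal(z_case + 1.0, 1.0)      # cases shifted up within each stratum
aroc, aauc = covariate_adjusted_roc(y_case, z_case, y_ctrl, z_ctrl,
                                    np.linspace(0, 1, 101))
print(aauc)                                 # covariate-adjusted AUC
```

For a continuous covariate such as age, the within-stratum control distribution would be replaced by a smoothed or regression-based estimate, which is where the semiparametric estimators come in.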
Combining biomarkers for classification with covariate adjustment.
Kim, Soyoung; Huang, Ying
2017-03-09
Combining multiple markers can improve classification accuracy compared with using a single marker. In practice, covariates associated with markers or disease outcome can affect the performance of a biomarker or biomarker combination in the population. The covariate-adjusted receiver operating characteristic (ROC) curve has been proposed as a tool to tease out the covariate effect in the evaluation of a single marker; this curve characterizes the classification accuracy solely because of the marker of interest. However, research on the effect of covariates on the performance of marker combinations and on how to adjust for the covariate effect when combining markers is still lacking. In this article, we examine the effect of covariates on classification performance of linear marker combinations and propose to adjust for covariates in combining markers by maximizing the nonparametric estimate of the area under the covariate-adjusted ROC curve. The proposed method provides a way to estimate the best linear biomarker combination that is robust to risk model assumptions underlying alternative regression-model-based methods. The proposed estimator is shown to be consistent and asymptotically normally distributed. We conduct simulations to evaluate the performance of our estimator in cohort and case/control designs and compare several different weighting strategies during estimation with respect to efficiency. Our estimator is also compared with alternative regression-model-based estimators or estimators that maximize the empirical area under the ROC curve, with respect to bias and efficiency. We apply the proposed method to a biomarker study from a human immunodeficiency virus vaccine trial.
Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.
Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F
2013-04-01
In biomedical research, it is often of interest to characterize biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference while making as few restrictive parametric assumptions as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates, but limited attention has been directed to where that dependence should enter. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models in one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in large differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology.
Covariate-Adjusted Linear Mixed Effects Model with an Application to Longitudinal Data
Nguyen, Danh V.; Şentürk, Damla; Carroll, Raymond J.
2009-01-01
Linear mixed effects (LME) models are useful for longitudinal data/repeated measurements. We propose a new class of covariate-adjusted LME models for longitudinal data that nonparametrically adjusts for a normalizing covariate. The proposed approach involves fitting a parametric LME model to the data after adjusting for the nonparametric effects of a baseline confounding covariate. In particular, the effect of the observable covariate on the response and predictors of the LME model is modeled nonparametrically via smooth unknown functions. In addition to covariate-adjusted estimation of fixed/population parameters and random effects, an estimation procedure for the variance components is also developed. Numerical properties of the proposed estimators are investigated with simulation studies. The consistency and convergence rates of the proposed estimators are also established. An application to a longitudinal data set on calcium absorption, accounting for baseline distortion from body mass index, illustrates the proposed methodology.
A Review of Nonparametric Alternatives to Analysis of Covariance.
ERIC Educational Resources Information Center
Olejnik, Stephen F.; Algina, James
1985-01-01
Five distribution-free alternatives to parametric analysis of covariance are presented and demonstrated: Quade's distribution-free test, Puri and Sen's solution, McSweeney and Porter's rank transformation, Burnett and Barr's rank difference scores, and Shirley's general linear model solution. The results of simulation studies regarding Type I…
Lin, Li-An; Luo, Sheng; Chen, Bingshu E.; Davis, Barry R.
2016-01-01
Multi-type recurrent event data occur frequently in longitudinal studies. Dependent termination may occur when the terminal time is correlated with recurrent event times. In this article, we simultaneously model the multi-type recurrent events and a dependent terminal event, both with nonparametric covariate functions modeled by B-splines. We develop a Bayesian multivariate frailty model to account for the correlation among the dependent termination and various types of recurrent events. Extensive simulation results suggest that misspecifying nonparametric covariate functions may introduce bias in parameter estimation. This method development has been motivated by and applied to the lipid-lowering trial (LLT) component of the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT).
Gao, Feng; Manatunga, Amita K; Chen, Shande
2007-02-20
In many biomedical and epidemiologic studies, estimating the hazard function is of interest. Breslow's estimator is commonly used for estimating the integrated baseline hazard, but it requires the functional form of covariate effects to be correctly specified, and it is generally difficult to identify the true functional form of covariate effects in the presence of time-dependent covariates. To provide a complement to the traditional proportional hazards model, we propose a tree-type method that simultaneously estimates both the baseline hazard function and the effects of time-dependent covariates. Our interest focuses on exploring potential data structures rather than on formal hypothesis testing. The proposed method approximates the baseline hazard and covariate effects with step functions. The jump points in time and in covariate space are searched via an algorithm based on the improvement of the full log-likelihood function. In contrast to most other estimation methods, the proposed method estimates the hazard function itself rather than the integrated hazard. The method is applied to model the risk of withdrawal in a clinical trial that evaluates an antidepressant treatment for preventing the development of clinical depression. Finally, the performance of the method is evaluated by several simulation studies.
Ryu, Duchwan; Li, Erning; Mallick, Bani K
2011-06-01
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic, and when it fails it casts doubt on inferences about the observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods, including cubic smoothing splines or P-splines, to accommodate the possible nonlinearity, and we use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves.
Chiou, Jeng-Min; Liang, Kung-Yee; Chiu, Yen-Feng
2005-01-01
Multipoint linkage analysis using sibpair designs remains a common approach to help investigators narrow chromosomal regions for traits (either qualitative or quantitative) of interest. Despite its popularity, the success of this approach depends heavily on how issues such as genetic heterogeneity, gene-gene interactions, and gene-environment interactions are handled. If addressed properly, the likelihood of detecting genetic linkage and of efficiently estimating the location of the trait locus would be enhanced, sometimes drastically. Previously, we proposed an approach to deal with these issues by modeling the genetic effect of the target trait locus as a function of covariates pertaining to the sibpairs. Here the genetic effect is simply the probability that a sibpair shares the same allele at the trait locus from their parents. Such modeling helps to divide the sibpairs into more homogeneous subgroups, which in turn helps to enhance the chance of detecting linkage. One limitation of this approach is the need to categorize the covariates so that a small and fixed number of genetic effect parameters are introduced. In this report, we take advantage of the fact that multiple markers are nowadays readily available for genotyping simultaneously. This suggests that one could estimate the dependence of the genetic effect on the covariates nonparametrically. We present an iterative procedure to estimate (1) the genetic effect nonparametrically and (2) the location of the trait locus through estimating functions developed by Liang et al. ([2001a] Hum Hered 51:67-76). We apply this new method to the linkage study of schizophrenia to illustrate how the onset ages of each sibpair may help to address the issue of genetic heterogeneity. This analysis sheds new light on the dependence of the trait effect on onset ages from affected sibpairs, an observation not revealed previously. In addition, we have carried out some simulation work, which suggests that this method provides
ROC analysis in biomarker combination with covariate adjustment.
Liu, Danping; Zhou, Xiao-Hua
2013-07-01
Receiver operating characteristic (ROC) analysis is often used to find the optimal combination of biomarkers. When subject-level covariates affect the magnitude and/or accuracy of the biomarkers, the combination rule should take the covariates into account. The authors propose two new biomarker combination methods that make use of the covariate information. The first method is to maximize the area under the covariate-adjusted ROC curve (AAUC). To overcome the limitations of the AAUC measure, the authors further propose the area under the covariate-standardized ROC curve (SAUC), which is an extension of the covariate-specific ROC curve. With a series of simulation studies, the proposed optimal AAUC and SAUC methods are compared with the optimal AUC method that ignores the covariates. The biomarker combination methods are illustrated with an example from Alzheimer's disease research. The simulation results indicate that the optimal AAUC combination performs well in the current study population. The optimal SAUC method is flexible in the choice of reference population and allows the results to be generalized to different populations. The proposed optimal AAUC and SAUC approaches successfully address the covariate adjustment problem in estimating the optimal marker combination. The optimal SAUC method is preferred for practical use, because the biomarker combination rule can be easily evaluated for different populations of interest.
Adjusting power for a baseline covariate in linear models
Glueck, Deborah H.; Muller, Keith E.
2009-01-01
The analysis of covariance provides a common approach to adjusting for a baseline covariate in medical research. With Gaussian errors, adding random covariates does not change either the theory or the computations of general linear model data analysis. However, adding random covariates does change the theory and computation of power analysis. Many data analysts fail to fully account for this complication in planning a study. We present our results in five parts. (i) A review of published results helps document the importance of the problem and the limitations of available methods. (ii) A taxonomy for general linear multivariate models and hypotheses allows identifying a particular problem. (iii) We describe how random covariates introduce the need to consider quantiles and conditional values of power. (iv) We provide new exact and approximate methods for power analysis of a range of multivariate models with a Gaussian baseline covariate, for both small and large samples. The new results apply to the Hotelling-Lawley test and the four tests in the “univariate” approach to repeated measures (unadjusted, Huynh-Feldt, Geisser-Greenhouse, Box). The techniques allow rapid calculation and an interactive, graphical approach to sample size choice. (v) Calculating power for a clinical trial of a treatment for increasing bone density illustrates the new methods. We particularly recommend using quantile power with a new Satterthwaite-style approximation.
Role of Experiment Covariance in Cross Section Adjustments
Giuseppe Palmiotti; M. Salvatores
2014-06-01
This paper is dedicated to the memory of R. D. McKnight, who made a seminal contribution to establishing the methodology and rigorous approach used in evaluating the covariance of reactor physics integral experiments. His original assessment of the ZPPR experiment uncertainties and correlations has made nuclear data adjustments based on these experiments much more robust and reliable. In the present paper we show, with numerical examples, the actual impact on an adjustment of accounting for or neglecting such correlations.
Inverse probability weighting for covariate adjustment in randomized studies.
Shen, Changyu; Li, Xiaochun; Li, Lingling
2014-02-20
Covariate adjustment in randomized clinical trials has the potential benefit of precision gain. It also has the potential pitfall of reduced objectivity, as it opens the possibility of selecting a 'favorable' model that yields a strong treatment benefit estimate. Although there is a large volume of statistical literature on the first aspect, realistic solutions that enforce objective inference while improving precision are rare. As a typical randomized trial needs to accommodate many implementation issues beyond statistical considerations, maintaining objectivity is at least as important as precision gain, if not more so, particularly from the perspective of the regulatory agencies. In this article, we propose a two-stage estimation procedure based on inverse probability weighting to achieve better precision without compromising objectivity. The procedure is designed so that the covariate adjustment is performed before seeing the outcome, effectively reducing the possibility of selecting a 'favorable' model that yields a strong intervention effect. Both theoretical and numerical properties of the estimation procedure are presented, along with an application of the proposed method to a real data example.
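A generic two-step IPW estimator in this spirit — fit the weight model on baseline covariates before outcomes are examined, then compare weighted outcome means — can be sketched as follows. This is a simplified stand-in for the authors' procedure, with all names and data assumed:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_effect(X, treat, y):
    """IPW (Hajek-type) estimate of the treatment-mean difference; the
    propensity model uses baseline covariates only, so it can be fixed
    before the outcome data are unblinded."""
    ps = LogisticRegression(max_iter=1000).fit(X, treat).predict_proba(X)[:, 1]
    w1, w0 = treat / ps, (1 - treat) / (1 - ps)
    return np.sum(w1 * y) / np.sum(w1) - np.sum(w0 * y) / np.sum(w0)

rng = np.random.default_rng(2)
n = 2000
X = rng.standard_normal((n, 3))                     # baseline covariates
treat = rng.integers(0, 2, n)                       # 1:1 randomization
y = treat + X @ np.array([0.5, -0.3, 0.2]) + rng.standard_normal(n)
print(ipw_effect(X, treat, y))                      # close to the true effect 1.0
```

In a randomized trial the true propensity is constant, so the fitted weights only correct chance covariate imbalance; this is the source of the precision gain without outcome-driven model selection.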
Sample Size for Confidence Interval of Covariate-Adjusted Mean Difference
ERIC Educational Resources Information Center
Liu, Xiaofeng Steven
2010-01-01
This article provides a way to determine adequate sample size for the confidence interval of covariate-adjusted mean difference in randomized experiments. The standard error of adjusted mean difference depends on covariate variance and balance, which are two unknown quantities at the stage of planning sample size. If covariate observations are…
Nonparametric Estimation of Standard Errors in Covariance Analysis Using the Infinitesimal Jackknife
ERIC Educational Resources Information Center
Jennrich, Robert I.
2008-01-01
The infinitesimal jackknife provides a simple general method for estimating standard errors in covariance structure analysis. Beyond its simplicity and generality what makes the infinitesimal jackknife method attractive is that essentially no assumptions are required to produce consistent standard error estimates, not even the requirement that the…
Bayesian adjustment for covariate measurement errors: a flexible parametric approach.
Hossain, Shahadut; Gustafson, Paul
2009-05-15
In most epidemiological investigations, the study units are people, the outcome variable (or response) is a health-related event, and the explanatory variables are usually environmental and/or socio-demographic factors. The fundamental task in such investigations is to quantify the association between the explanatory variables (covariates/exposures) and the outcome variable through a suitable regression model. The accuracy of such quantification depends on how precisely the relevant covariates are measured. In many instances, some covariates cannot be measured accurately; rather, only noisy (mismeasured) versions of them can be obtained. In statistical terminology, mismeasurement in continuous covariates is known as measurement error or errors-in-variables. Regression analyses based on mismeasured covariates lead to biased inference about the true underlying response-covariate associations. In this paper, we suggest a flexible parametric approach for avoiding this bias when estimating the response-covariate relationship through a logistic regression model. More specifically, we consider the flexible generalized skew-normal and the flexible generalized skew-t distributions for modeling the unobserved true exposure. For inference and computational purposes, we use Bayesian Markov chain Monte Carlo techniques. We investigate the performance of the proposed flexible parametric approach in comparison with a common flexible parametric approach through extensive simulation studies, and we also compare the two methods on a real-life data set. Though emphasis is put on the logistic regression model, the proposed method is unified and applicable to other generalized linear models and to other types of non-linear regression models as well.
Covariate-adjusted confidence interval for the intraclass correlation coefficient.
Shoukri, Mohamed M; Donner, Allan; El-Dali, Abdelmoneim
2013-09-01
A crucial step in designing a new study is to estimate the required sample size. For a design involving cluster sampling, the appropriate sample size depends on the so-called design effect, which is a function of the average cluster size and the intracluster correlation coefficient (ICC). It is well known that, under the framework of hierarchical and generalized linear models, a reduction in residual error may be achieved by including risk factors as covariates. In this paper we show that the covariate design, indicating whether the covariates are measured at the cluster level or at the within-cluster subject level, affects the estimation of the ICC and hence the design effect. Therefore, the distinction between these two types of covariates should be made at the design stage. We use the nested-bootstrap method to assess the accuracy of the estimated ICC for continuous and binary response variables under different covariate structures. The code for two SAS macros is made available by the authors to facilitate the construction of confidence intervals for the ICC. Moreover, using Monte Carlo simulations, we evaluate the relative efficiency of the estimators and the accuracy of the coverage probabilities of a 95% confidence interval for the population ICC. The methodology is illustrated using a published data set of blood pressure measurements taken on family members.
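For orientation, the design effect invoked here takes, in its standard equal-cluster-size form, the shape below; the numbers in the worked example are assumed values, not the paper's:

```latex
% Design effect for cluster sampling with average cluster size \bar{m}
% and intraclass correlation coefficient \rho:
\[
  \mathrm{DE} = 1 + (\bar{m} - 1)\,\rho ,
  \qquad
  n_{\mathrm{cluster}} = \mathrm{DE} \times n_{\mathrm{SRS}} .
\]
% Assumed example: \bar{m} = 20 and \rho = 0.05 give
% DE = 1 + 19 \times 0.05 = 1.95, so a comparison requiring 400
% subjects under simple random sampling needs about 780 subjects
% under the cluster design.
```

Because covariate adjustment changes the estimated ICC, and cluster-level versus subject-level covariates change it differently, the design effect itself depends on the covariate design.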
ERIC Educational Resources Information Center
Sabourin, Stephane; Valois, Pierre; Lussier, Yvan
2005-01-01
The main purpose of the current research was to develop an abbreviated form of the Dyadic Adjustment Scale (DAS) with nonparametric item response theory. The authors conducted 5 studies, with a total participation of 8,256 married or cohabiting individuals. Results showed that the item characteristic curves behaved in a monotonically increasing…
ERIC Educational Resources Information Center
Safarkhani, Maryam; Moerbeek, Mirjam
2013-01-01
In a randomized controlled trial, a decision needs to be made about the total number of subjects for adequate statistical power. One way to increase the power of a trial is by including a predictive covariate in the model. In this article, the effects of various covariate adjustment strategies on increasing the power is studied for discrete-time…
ERIC Educational Resources Information Center
Nimon, Kim; Henson, Robin K.
2015-01-01
The authors empirically examined whether the validity of a residualized dependent variable after covariance adjustment is comparable to that of the original variable of interest. When variance of a dependent variable is removed as a result of one or more covariates, the residual variance may not reflect the same meaning. Using the pretest-posttest…
Elze, Markus C; Gregson, John; Baber, Usman; Williamson, Elizabeth; Sartori, Samantha; Mehran, Roxana; Nichols, Melissa; Stone, Gregg W; Pocock, Stuart J
2017-01-24
Propensity scores (PS) are an increasingly popular method to adjust for confounding in observational studies. Propensity score methods have theoretical advantages over conventional covariate adjustment, but their relative performance in real-world scenarios is poorly characterized. We used datasets from 4 large-scale cardiovascular observational studies (PROMETHEUS, ADAPT-DES [the Assessment of Dual AntiPlatelet Therapy with Drug-Eluting Stents], THIN [The Health Improvement Network], and CHARM [Candesartan in Heart Failure-Assessment of Reduction in Mortality and Morbidity]) to compare the performance of conventional covariate adjustment with 4 common PS methods: matching, stratification, inverse probability weighting, and use of the PS as a covariate. We found that stratification performed poorly with few outcome events, and that inverse probability weighting gave imprecise estimates of treatment effect and undue influence to a small number of observations when substantial confounding was present. Covariate adjustment and matching performed well in all of our examples, although matching tended to give less precise estimates in some cases. PS methods are not necessarily superior to conventional covariate adjustment, and care should be taken to select the most suitable method.
Least-Squares Data Adjustment with Rank-Deficient Data Covariance Matrices
Williams, J.G.
2011-07-01
A derivation of the linear least-squares adjustment formulae is required that avoids the assumption that the covariance matrix of prior parameters can be inverted. Possible proofs are of several kinds, including: (i) extension of standard results for the linear regression formulae, and (ii) minimization by differentiation of a quadratic form of the deviations in parameters and responses. In this paper, the least-squares adjustment equations are derived in both of these ways, while explicitly assuming that the covariance matrix of prior parameters is singular. It is proved that the solution is unique and that, contrary to statements that have appeared in the literature, the least-squares adjustment problem is not ill-posed: no regularization is required, and no modification is needed to the adjustment formulae that have been used in the past in the case of a singular covariance matrix for the priors.
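In the notation common to such derivations — prior parameters p_0 with covariance C, sensitivity matrix S, and measured responses y with covariance V (symbols assumed here, not taken from the paper) — the adjustment formulae can be written so that only a response-space matrix is inverted:

```latex
% Linear least-squares adjustment of prior parameters p_0:
\[
  \hat{p} = p_0 + C S^{\top}\left( S C S^{\top} + V \right)^{-1}\left( y - S p_0 \right),
\]
\[
  C' = C - C S^{\top}\left( S C S^{\top} + V \right)^{-1} S C .
\]
% Only (S C S^T + V) is inverted; C itself never is, which is
% consistent with the paper's conclusion that a singular prior
% covariance matrix does not make the adjustment ill-posed.
```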
ERIC Educational Resources Information Center
Steiner, Peter M.; Cook, Thomas D.; Shadish, William R.
2011-01-01
The effect of unreliability of measurement on propensity score (PS) adjusted treatment effects has not been previously studied. The authors report on a study simulating different degrees of unreliability in the multiple covariates that were used to estimate the PS. The simulation uses the same data as two prior studies. Shadish, Clark, and Steiner…
Taking correlations in GPS least squares adjustments into account with a diagonal covariance matrix
NASA Astrophysics Data System (ADS)
Kermarrec, Gaël; Schön, Steffen
2016-09-01
Based on the results of Luati and Proietti (Ann Inst Stat Math 63:673-686, 2011) on an equivalence, for a certain class of polynomial regressions, between the diagonally weighted least squares (DWLS) and the generalized least squares (GLS) estimator, an alternative way to take correlations into account through a diagonal covariance matrix is presented. The equivalent covariance matrix is much easier to compute than a diagonalization of the covariance matrix via eigenvalue decomposition, which also implies a change of the least squares equations. This condensed matrix, for use in the least squares adjustment, can be seen as a diagonal or reduced version of the original matrix, its elements being simply the sums of the row elements of the weighting matrix. The least squares results obtained with the equivalent diagonal matrices and those given by the fully populated covariance matrix are mathematically strictly equivalent for the mean estimator in terms of the estimate and its a priori cofactor matrix. It is shown that this equivalence can be empirically extended to further classes of design matrices such as those used in GPS positioning (single point positioning, precise point positioning, or relative positioning with double differences). Applying this new model to simulated time series of correlated observations, a significant reduction of the coordinate differences, compared with the solutions computed with the commonly used diagonal elevation-dependent model, was reached for the GPS relative positioning with double differences, single point positioning, as well as precise point positioning cases. The estimate differences between the equivalent and classical model with fully populated covariance matrix were below the millimeter level for all simulated GPS cases, and at the sub-millimeter level for relative positioning with double differences. These results were confirmed by analyzing real data. Consequently, the equivalent diagonal covariance matrices, compared with the often used elevation
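The central construction — a diagonal matrix whose entries are the row sums of the fully populated weight matrix — is easy to verify numerically for the mean estimator. A small numpy check under an assumed AR(1)-type correlation model (a stand-in for the GPS error correlations):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
rho = 0.7
# AR(1)-type correlation as a stand-in for temporally correlated errors
Sigma = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
W = np.linalg.inv(Sigma)            # fully populated weight matrix
d = W.sum(axis=1)                   # equivalent diagonal: row sums of W

y = 5.0 + np.linalg.cholesky(Sigma) @ rng.standard_normal(n)
ones = np.ones(n)
gls = (ones @ W @ y) / (ones @ W @ ones)   # GLS mean estimate
dwls = (d @ y) / d.sum()                   # DWLS with the equivalent diagonal
assert np.isclose(gls, dwls)               # identical for the mean estimator
print(gls, dwls)
```

The identity holds because, for a design matrix of ones, the GLS normal equations involve W only through its row sums; extending it empirically to GPS design matrices is the contribution described above.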
A resampling approach for adjustment in prediction models for covariate measurement error.
Li, Wei; Mazumdar, Sati; Arena, Vincent C; Sussman, Nancy
2005-03-01
Recent work on covariate measurement error focuses on the possible biases in model coefficient estimates. Usually, measurement error in a covariate tends to attenuate the coefficient estimate for that covariate, i.e., a bias toward the null occurs. Measurement error in another confounding or interacting variable typically results in incomplete adjustment for that variable; hence, the coefficient for the covariate of interest may be biased either toward or away from the null. This paper presents a new method based on a resampling technique to deal with covariate measurement errors in the context of prediction modeling. Prediction accuracy is our primary parameter of interest, defined as the success rate of the model in predicting new responses. We call our method bootstrap regression calibration (BRC). We study logistic regression with interacting covariates as our prediction model and measure the prediction accuracy of a model by the receiver operating characteristic (ROC) method. Results from simulations show that bootstrap regression calibration offers consistent enhancement over the commonly used regression calibration (RC) method in terms of improving the prediction accuracy of the model and reducing bias in the estimated coefficients.
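As background, plain regression calibration — the baseline that BRC augments with bootstrap resampling — replaces the error-prone covariate with an estimate of its conditional expectation before fitting the prediction model. A sketch using replicate measurements and simulated data (all values and names assumed):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(7)
n = 1000
x = rng.standard_normal(n)                 # true covariate (never observed)
w1 = x + 0.8 * rng.standard_normal(n)      # two replicate noisy measurements
w2 = x + 0.8 * rng.standard_normal(n)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x))))

wbar = (w1 + w2) / 2
sigma_u2 = np.var(w1 - w2) / 2             # measurement-error variance
sigma_x2 = np.var(wbar) - sigma_u2 / 2     # true-covariate variance
lam = sigma_x2 / (sigma_x2 + sigma_u2 / 2) # attenuation factor
x_hat = wbar.mean() + lam * (wbar - wbar.mean())   # calibrated covariate

naive = LogisticRegression().fit(wbar.reshape(-1, 1), y).coef_[0, 0]
rc = LogisticRegression().fit(x_hat.reshape(-1, 1), y).coef_[0, 0]
print(naive, rc)    # the RC slope is less attenuated toward zero
```

BRC repeats the calibration-plus-fit cycle over bootstrap resamples, which is what drives the reported gains in prediction accuracy.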
Haem, Elham; Harling, Kajsa; Ayatollahi, Seyyed Mohammad Taghi; Zare, Najaf; Karlsson, Mats O
2017-02-01
One important aim in population pharmacokinetics (PK) and pharmacodynamics is the identification and quantification of the relationships between the parameters and covariates. Lasso has been suggested as a technique for simultaneous estimation and covariate selection. In linear regression, it has been shown that Lasso lacks the oracle property, under which an estimator asymptotically performs as though the true underlying model had been given in advance. Adaptive Lasso (ALasso) with appropriate initial weights is claimed to possess the oracle property; however, it can lead to poor predictive performance when there is multicollinearity between covariates. This simulation study implemented a new version of ALasso, called adjusted ALasso (AALasso), which takes the ratio of the standard error of the maximum likelihood (ML) estimator to the ML coefficient as the initial weight in ALasso, to deal with multicollinearity in non-linear mixed-effect models. The performance of AALasso was compared with that of ALasso and Lasso. PK data were simulated in four set-ups from a one-compartment bolus input model. Covariates were created by sampling from a multivariate standard normal distribution with no, low (0.2), moderate (0.5), or high (0.7) correlation. The true covariates influenced only clearance, at different magnitudes. AALasso, ALasso, and Lasso were compared in terms of mean absolute prediction error and error of the estimated covariate coefficient. The results show that AALasso performed better in small data sets, even in those in which a high correlation existed between covariates. This makes AALasso a promising method for covariate selection in non-linear mixed-effect models.
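In a linear-model analogue (not the authors' nonlinear mixed-effects implementation), the AALasso initial weight — the standard error of the ML estimate divided by its absolute value — can be plugged into adaptive Lasso via the usual column-rescaling trick. A sketch with assumed data:

```python
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import Lasso

rng = np.random.default_rng(4)
n, p = 200, 6
X = rng.standard_normal((n, p))
X[:, 1] = 0.7 * X[:, 0] + 0.3 * X[:, 1]        # induce multicollinearity
y = 1.5 * X[:, 0] + rng.standard_normal(n)     # only covariate 0 matters

ols = sm.OLS(y, X).fit()                       # unpenalized ML (here OLS) fit
w = ols.bse / np.abs(ols.params)               # AALasso-style weights: SE / |beta|

# adaptive Lasso via rescaling: column j is penalized in proportion to w[j]
theta = Lasso(alpha=0.05).fit(X / w, y).coef_
beta = theta / w
print(np.round(beta, 3))                       # noise coefficients shrink to ~0
```

Dividing each column by its weight and refitting an ordinary Lasso is algebraically equivalent to the weighted L1 penalty; coefficients with large standard errors relative to their estimates (including collinear ones) are penalized most, which is the point of the adjustment.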
Nonparametric Streamflow Disaggregation Model
NASA Astrophysics Data System (ADS)
Lee, T.; Salas, J. D.; Prairie, J. R.
2009-05-01
Stochastic streamflow generation is generally utilized for planning and management of water resources systems. For this purpose, a number of parametric and nonparametric modeling alternatives have been suggested in the literature. Among them, temporal and spatial disaggregation approaches play an important role, particularly in ensuring that historical variance-covariance properties are preserved at various temporal and spatial scales. In this paper, we review the underlying features of nonparametric disaggregation, identify some of their pros and cons, and propose a disaggregation algorithm that is capable of surmounting some of the shortcomings of the current models. The proposed models hinge on k-nearest neighbor resampling, an accurate adjusting procedure, and a genetic algorithm. The model has been tested and compared to an existing nonparametric disaggregation approach using data from the Colorado River system. It has been shown that the model is capable of (i) reproducing the season-to-season correlations, including the correlation between the last season of the previous year and the first season of the current year, (ii) minimizing or avoiding the generation of flow patterns across the year that are literally the same as those of the historical records, and (iii) minimizing or avoiding the generation of negative flows. In addition, it is applicable to intermittent river regimes. Suggestions for further improving the model are discussed.
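The k-nearest-neighbor resampling and proportional adjustment at the core of such schemes can be illustrated compactly; the genetic-algorithm refinement is omitted, and the 1/i sampling kernel and all data below are assumptions:

```python
import numpy as np

def knn_disaggregate(annual_new, hist_annual, hist_seasonal, k=5, rng=None):
    """Disaggregate a generated annual flow into seasonal flows by
    resampling the seasonal pattern of one of its k nearest historical
    neighbors, then rescaling so the seasons sum to the annual value."""
    rng = rng or np.random.default_rng()
    nearest = np.argsort(np.abs(hist_annual - annual_new))[:k]
    weights = 1.0 / np.arange(1, k + 1)              # common 1/i kernel
    pick = rng.choice(nearest, p=weights / weights.sum())
    pattern = hist_seasonal[pick] / hist_seasonal[pick].sum()
    return annual_new * pattern                      # proportional adjustment

rng = np.random.default_rng(5)
hist_seasonal = rng.gamma(2.0, 50.0, size=(40, 4))   # 40 years x 4 seasons
hist_annual = hist_seasonal.sum(axis=1)
print(knn_disaggregate(1.1 * hist_annual.mean(),
                       hist_annual, hist_seasonal, rng=rng))
```

The rescaling step guarantees additivity (seasonal flows sum to the annual flow); resampling whole historical patterns is also what risks reproducing the historical record verbatim, the shortcoming the proposed algorithm targets.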
Asymptotically Normal and Efficient Estimation of Covariate-Adjusted Gaussian Graphical Model
Chen, Mengjie; Ren, Zhao; Zhao, Hongyu; Zhou, Harrison
2015-01-01
A tuning-free procedure is proposed to estimate the covariate-adjusted Gaussian graphical model. For each finite subgraph, this estimator is asymptotically normal and efficient. As a consequence, a confidence interval can be obtained for each edge. The procedure enjoys easy implementation and efficient computation through parallel estimation on subgraphs or edges. We further apply the asymptotic normality result to perform support recovery through edge-wise adaptive thresholding. This support recovery procedure is called ANTAC, standing for Asymptotically Normal estimation with Thresholding after Adjusting Covariates. ANTAC outperforms other methodologies in the literature in a range of simulation studies. We apply ANTAC to identify gene-gene interactions using an eQTL dataset. Our result achieves better interpretability and accuracy in comparison with CAMPE.
Spatial and temporal patterns of enzootic raccoon rabies adjusted for multiple covariates.
Recuenco, Sergio; Eidson, Millicent; Kulldorff, Martin; Johnson, Glen; Cherry, Bryan
2007-04-11
With the objective of identifying spatial and temporal patterns of enzootic raccoon variant rabies, a spatial scan statistic was utilized to search for significant terrestrial rabies clusters by year in New York State in 1997-2003. Cluster analyses were unadjusted for other factors, adjusted for covariates, and adjusted for covariates and large scale geographic variation (LSGV). Adjustments were intended to identify the unusual aggregations of cases given the expected distribution based on the observed locations. Statistically significant clusters were identified particularly in the Albany, Finger Lakes, and South Hudson areas. The clusters were generally persistent in the Albany area, but demonstrated cyclical changes in rabies activity every few years in the other areas. Cluster adjustments allowed the discussion of possible causes for the high risk raccoon rabies areas identified. This study analyzed raccoon variant rabies spatial and temporal patterns in New York that have not been previously described at a focal (census tract) level. Comparisons across the type of spatial analysis performed with various degrees of adjustment allow consideration of the potential influence of geographical factors for raccoon rabies and possible reasons for the highest risk areas (statistically significant clusters).
Shi, Ran
2016-01-01
Human brains perform tasks via complex functional networks consisting of separated brain regions. A popular approach to characterizing brain functional networks in fMRI studies is independent component analysis (ICA), which is a powerful method for reconstructing latent source signals from their linear mixtures. In many fMRI studies, an important goal is to investigate how brain functional networks change with specific clinical and demographic variables. Existing ICA methods, however, cannot directly incorporate covariate effects in the ICA decomposition, and heuristic post-ICA analyses to address this need can be inaccurate and inefficient. In this paper, we propose a hierarchical covariate-adjusted ICA (hc-ICA) model that provides a formal statistical framework for estimating covariate effects and testing differences between brain functional networks. Our method provides a more reliable and powerful statistical tool for evaluating group differences in brain functional networks while appropriately controlling for potential confounding factors. We present an analytically tractable EM algorithm to obtain maximum likelihood estimates of our model. We also develop a subspace-based approximate EM that runs significantly faster while retaining high accuracy. To test the differences in functional networks, we introduce a voxel-wise approximate inference procedure which eliminates the need for computationally expensive covariance matrix estimation and inversion. We demonstrate the advantages of our methods over the existing method via simulation studies. We apply our method to an fMRI study to investigate differences in brain functional networks associated with post-traumatic stress disorder (PTSD).
Alton, Gillian D; Pearl, David L; Bateman, Ken G; McNab, Bruce; Berke, Olaf
2013-11-18
Abattoir condemnation data show promise as a rich source of data for syndromic surveillance of both animal and zoonotic diseases. However, inherent characteristics of abattoir condemnation data can bias results from space-time cluster detection methods for disease surveillance, and may need to be accounted for using various adjustment methods. The objective of this study was to compare the space-time scan statistics with different abilities to control for covariates and to assess their suitability for food animal syndromic surveillance. Four space-time scan statistic models were used including: animal class adjusted Poisson, space-time permutation, multi-level model adjusted Poisson, and a weighted normal scan statistic using model residuals. The scan statistics were applied to monthly bovine pneumonic lung and "parasitic liver" condemnation data from Ontario provincial abattoirs from 2001-2007. The number and space-time characteristics of identified clusters often varied between space-time scan tests for both "parasitic liver" and pneumonic lung condemnation data. While there were some similarities between isolated clusters in space, time and/or space-time, overall the results from space-time scan statistics differed substantially depending on the covariate adjustment approach used. Variability in results among methods suggests that caution should be used in selecting space-time scan methods for abattoir surveillance. Furthermore, validation of different approaches with simulated or real outbreaks is required before conclusive decisions can be made concerning the best approach for conducting surveillance with these data.
Covariate-Adjusted Precision Matrix Estimation with an Application in Genetical Genomics
Cai, T. Tony; Li, Hongzhe; Liu, Weidong; Xie, Jichun
2017-01-01
Motivated by analysis of genetical genomics data, we introduce a sparse high dimensional multivariate regression model for studying conditional independence relationships among a set of genes adjusting for possible genetic effects. The precision matrix in the model specifies a covariate-adjusted Gaussian graph, which presents the conditional dependence structure of gene expression after the confounding genetic effects on gene expression are taken into account. We present a covariate-adjusted precision matrix estimation method using a constrained ℓ1 minimization, which can be easily implemented by linear programming. Asymptotic convergence rates in various matrix norms and sign consistency are established for the estimators of the regression coefficients and the precision matrix, allowing both the number of genes and the number of genetic variants to diverge. Simulation shows that the proposed method results in significant improvements in both precision matrix estimation and graphical structure selection when compared to the standard Gaussian graphical model assuming constant means. The proposed method is also applied to analyze a yeast genetical genomics data set for the identification of the gene network among a set of genes in the mitogen-activated protein kinase pathway.
He, Peng; Eriksson, Frank; Scheike, Thomas H.; Zhang, Mei-Jie
2015-01-01
With competing risks data, one often needs to assess the treatment and covariate effects on the cumulative incidence function. Fine and Gray proposed a proportional hazards regression model for the subdistribution of a competing risk with the assumption that the censoring distribution and the covariates are independent. Covariate-dependent censoring sometimes occurs in medical studies. In this paper, we study the proportional hazards regression model for the subdistribution of a competing risk with proper adjustments for covariate-dependent censoring. We consider a covariate-adjusted weight function by fitting the Cox model for the censoring distribution and using the predictive probability for each individual. Our simulation study shows that the covariate-adjusted weight estimator is basically unbiased when the censoring time depends on the covariates, and the covariate-adjusted weight approach works well for the variance estimator as well. We illustrate our methods with bone marrow transplant data from the Center for International Blood and Marrow Transplant Research (CIBMTR). Here cancer relapse and death in complete remission are two competing risks.
A nonparametric stochastic method for generating daily climate-adjusted streamflows
NASA Astrophysics Data System (ADS)
Stagge, J. H.; Moglen, G. E.
2013-10-01
A daily stochastic streamflow generation model is presented, which successfully replicates statistics of the historical streamflow record and can produce climate-adjusted daily time series. A monthly climate model relates general circulation model (GCM)-scale climate indicators to discrete climate-streamflow states, which in turn control parameters in a daily streamflow generation model. Daily flow is generated by a two-state (increasing/decreasing) Markov chain, with rising-limb increments randomly sampled from a Weibull distribution and the falling limb modeled as exponential recession. When applied to the Potomac River, a 38,000 km² basin in the Mid-Atlantic United States, the model reproduces the daily, monthly, and annual distribution and dynamics of the historical streamflow record, including extreme low flows. This method can be used as part of water resources planning, vulnerability, and adaptation studies and offers the advantage of a parsimonious model, requiring only a sufficiently long historical streamflow record and large-scale climate data. Simulations of Potomac streamflows under the Special Report on Emissions Scenarios (SRES) A1b, A2, and B1 emission scenarios predict a slight increase in mean annual flows over the next century, with the majority of this increase occurring during the winter and early spring. Conversely, mean summer flows are projected to decrease due to climate change, caused by a shift to shorter, more sporadic rain events. The date of the minimum annual flow is projected to shift 2-5 days earlier by the 2070-2099 period.
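The daily generator described is simple enough to sketch directly; every parameter value below is an illustrative assumption, not a fitted Potomac value:

```python
import numpy as np

def generate_daily_flow(n_days, q0=100.0, p_rr=0.4, p_fr=0.2,
                        weib_shape=0.8, weib_scale=30.0,
                        recession=0.95, rng=None):
    """Two-state (rising/falling) Markov-chain daily flow generator:
    p_rr = P(rise tomorrow | rising today), p_fr = P(rise | falling).
    Rising-limb increments are Weibull; falling limbs decay exponentially."""
    rng = rng or np.random.default_rng()
    q = np.empty(n_days)
    q[0], rising = q0, False
    for t in range(1, n_days):
        rising = rng.random() < (p_rr if rising else p_fr)
        if rising:
            q[t] = q[t - 1] + weib_scale * rng.weibull(weib_shape)
        else:
            q[t] = q[t - 1] * recession
    return q

flows = generate_daily_flow(365, rng=np.random.default_rng(6))
print(flows.min(), flows.mean(), flows.max())
```

Climate adjustment then amounts to letting the monthly climate-streamflow state set these parameters month by month, rather than holding them fixed as in this sketch.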
Biswas, Atanu; Park, Eunsik; Bhattacharya, Rahul
2012-08-01
Response-adaptive designs have become popular for allocating entering patients among two or more competing treatments in a phase III clinical trial. Although many designs exist for binary treatment responses, the number of designs involving covariates is very small. Sometimes patients give repeated responses. The only available response-adaptive allocation design for repeated binary responses is the urn design of Biswas and Dewanji [Biswas A and Dewanji A. A randomized longitudinal play-the-winner design for repeated binary data. ANZJS 2004; 46: 675-684; Biswas A and Dewanji A. Inference for a RPW-type clinical trial with repeated monitoring for the treatment of rheumatoid arthritis. Biometr J 2004; 46: 769-779.], although it does not account for the covariates of the patients in the allocation design. In this article, a covariate-adjusted response-adaptive randomisation procedure is developed using the log-odds ratio within the Bayesian framework for longitudinal binary responses. The small sample performance of the proposed allocation procedure is assessed through a simulation study. The proposed procedure is illustrated using a real data set.
Adjusting for covariates on a slippery slope: linkage analysis of change over time
Rampersaud, Evadnie; Allen, Andrew; Li, Yi-Ju; Shao, Yujun; Bass, Meredyth; Haynes, Carol; Ashley-Koch, Allison; Martin, Eden R; Schmidt, Silke; Hauser, Elizabeth R
2003-01-01
Background We analyzed the Genetic Analysis Workshop 13 (GAW13) simulated data to contrast and compare different methods for the genetic linkage analysis of hypertension and change in blood pressure over time. We also examined methods for incorporating covariates into the linkage analysis. We used methods for quantitative trait loci (QTL) linkage analysis with and without covariates and affected sib-pair (ASP) analysis of hypertension followed by ordered subset analysis (OSA), using variables associated with change in blood pressure over time. Results Four of the five baseline genes and one of the three slope genes were not detected by any method using conventional criteria. OSA detected baseline gene b35 on chromosome 13 when using the slope in blood pressure to adjust for change over time. Slope gene s10 was detected by the ASP analysis and slope gene s11 was detected by QTL linkage analysis as well as by OSA analysis. Analysis of null chromosomes, i.e., chromosomes without genes, did not reveal significant increases in type I error. However, there were a number of genes indirectly related to blood pressure detected by a variety of methods. Conclusions We noted that there is no obvious first choice of analysis software for analyzing a complicated model, such as the one underlying the GAW13 simulated data. Inclusion of covariates and longitudinal data can improve localization of genes for complex traits but it is not always clear how best to do this. It remains a worthwhile task to apply several different approaches since one method is not always the best.
ERIC Educational Resources Information Center
Forster, Fred
Statistical methods are described for diagnosing and treating three important problems in covariate tests of significance: curvilinearity, covariable effectiveness, and treatment-covariable interaction. Six major assumptions, prerequisites for covariate procedure, are discussed in detail: (1) normal distribution, (2) homogeneity of variances, (3)…
A note on the empirical likelihood confidence band for hazards ratio with covariate adjustment.
Zhu, Shihong; Yang, Yifan; Zhou, Mai
2015-09-01
In medical studies comparing two treatments in the presence of censored data, the stratified Cox model is an important tool that can flexibly handle non-proportional hazards while allowing parsimonious covariate adjustment. To capture the cumulative treatment effect, the ratio of the treatment-specific cumulative baseline hazards is often used as a measure of the treatment effect. Pointwise and simultaneous confidence bands associated with the estimated ratio provide a global picture of how the treatment effect evolves over time. Recently, Dong and Matthews (2012, Biometrics 68, 408-418) proposed constructing a pointwise confidence interval for the ratio using a plug-in type empirical likelihood approach. However, their result on the limiting distribution of the empirical likelihood ratio is generally incorrect, and the resulting confidence interval is asymptotically undercovering. In this article, we derive the correct limiting distribution for the likelihood ratio statistic. We also present simulation studies to demonstrate the effectiveness of our approach.
A covariate-adjustment regression model approach to noninferiority margin definition.
Nie, Lei; Soon, Guoxing
2010-05-10
To maintain the interpretability of the effect of experimental treatment (EXP) obtained from a noninferiority trial, current statistical approaches often require the constancy assumption. This assumption typically requires that the effect of the control treatment in the population of the active control trial is the same as its effect in the population of the historical trial. To prevent violation of the constancy assumption, clinical trial sponsors are advised to keep the design of the active control trial as close to the design of the historical trial as possible. However, these rigorous requirements are rarely fulfilled in practice. The inevitable discrepancies between the historical trial and the active control trial have led to debates on many controversial issues. Without a well-developed quantitative method to determine the impact of the discrepancies on the violation of the constancy assumption, a correct judgment is difficult to reach. In this paper, we present a covariate-adjustment generalized linear regression model approach to achieve two goals: (1) to quantify the impact of the population difference between the historical trial and the active control trial on the degree of constancy assumption violation and (2) to redefine the active control treatment effect in the active control trial population if the quantification suggests an unacceptable violation. Through goal (1), we examine whether or not a population difference leads to an unacceptable violation. Through goal (2), we redefine the noninferiority margin if the violation is unacceptable. This approach allows us to correctly determine the effect of EXP in the noninferiority trial population when the constancy assumption is violated due to the population difference. We illustrate the covariate-adjustment approach through a case study.
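One way to read the covariate-adjustment idea is as model-based standardization: fit an outcome regression in the historical trial and average its predictions over the covariate distribution of the new trial population to see how the control effect transports. A minimal sketch with hypothetical data and variable names, not the authors' exact model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# hypothetical historical trial: outcome depends on treatment z and a covariate x
n = 2000
x_hist = rng.normal(0.0, 1.0, n)
z_hist = rng.integers(0, 2, n)                     # control vs placebo
logit = -0.5 + 0.8 * z_hist + 0.6 * x_hist
y_hist = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = sm.add_constant(np.column_stack([z_hist, x_hist]))
fit = sm.GLM(y_hist, X, family=sm.families.Binomial()).fit()

# covariate distribution of the (hypothetical) active control trial population
x_new = rng.normal(0.7, 1.0, n)                    # shifted population

def mean_risk(z, x, res):
    X = sm.add_constant(np.column_stack([np.full_like(x, z), x]), has_constant='add')
    return res.predict(X).mean()

# control effect re-expressed in each population (risk difference scale)
effect_hist = mean_risk(1, x_hist, fit) - mean_risk(0, x_hist, fit)
effect_new = mean_risk(1, x_new, fit) - mean_risk(0, x_new, fit)
print(round(effect_hist, 3), round(effect_new, 3))
```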
Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W.; Müller, Klaus-Robert; Lemm, Steven
2013-01-01
Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation. PMID:23844016
Song, Rui; Kosorok, Michael R; Cai, Jianwen
2008-09-01
Recurrent events data are frequently encountered in clinical trials. This article develops robust covariate-adjusted log-rank statistics applied to recurrent events data with arbitrary numbers of events under independent censoring and the corresponding sample size formula. The proposed log-rank tests are robust with respect to different data-generating processes and are adjusted for predictive covariates. They reduce to the Kong and Slud (1997, Biometrika 84, 847-862) setting in the case of a single event. The sample size formula is derived based on the asymptotic normality of the covariate-adjusted log-rank statistics under certain local alternatives and a working model for baseline covariates in the recurrent event data context. When the effect size is small and the baseline covariates do not contain significant information about event times, it reduces to the same form as that of Schoenfeld (1983, Biometrics 39, 499-503) for cases of a single event or independent event times within a subject. We carry out simulations to study the control of type I error and the comparison of powers between several methods in finite samples. The proposed sample size formula is illustrated using data from an rhDNase study.
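For orientation, the single-event Schoenfeld (1983) formula to which the proposed sample size formula reduces can be computed directly; a small illustrative sketch (the allocation fraction and target hazard ratio are made up):

```python
import math
from scipy.stats import norm

def schoenfeld_events(log_hr, p=0.5, alpha=0.05, power=0.80):
    """Events needed for a two-sided log-rank test, Schoenfeld (1983) form:
    d = (z_{1-alpha/2} + z_{power})^2 / (p (1 - p) (log HR)^2)."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return z ** 2 / (p * (1 - p) * log_hr ** 2)

# e.g. target hazard ratio 0.75 with 1:1 allocation -> about 380 events
print(math.ceil(schoenfeld_events(math.log(0.75))))
```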
ERIC Educational Resources Information Center
Petscher, Yaacov; Schatschneider, Christopher
2011-01-01
Research by Huck and McLean (1975) demonstrated that the covariance-adjusted score is more powerful than the simple difference score, yet recent reviews indicate researchers are equally likely to use either score type in two-wave randomized experimental designs. A Monte Carlo simulation was conducted to examine the conditions under which the…
Use and Impact of Covariance Data in the Japanese Latest Adjusted Library ADJ2010 Based on JENDL-4.0
NASA Astrophysics Data System (ADS)
Yokoyama, K.; Ishikawa, M.
2015-01-01
The current status of covariance applications to fast reactor analysis and design in Japan is summarized. In Japan, the covariance data are mainly used for three purposes: (1) to quantify the uncertainty of nuclear core parameters, (2) to identify important nuclides, reactions and energy ranges which dominate the uncertainty of core parameters, and (3) to improve the accuracy of core design values by adopting integral data such as critical experiments and power reactor operation data. For the last purpose, cross section adjustment based on the Bayesian theorem is used. After the release of JENDL-4.0, a development project of the new adjusted group-constant set ADJ2010 was started in 2010 and completed in 2013. In the present paper, the final results of ADJ2010 are briefly summarized. In addition, the adjustment results of ADJ2010 are discussed from the viewpoint of the use and impact of nuclear data covariances, focusing on 239Pu capture cross section alterations. For this purpose three kinds of indices, called "degree of mobility," "adjustment motive force," and "adjustment potential," are proposed.
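In its common linear form, cross section adjustment based on the Bayesian theorem is a generalized least squares update of the prior data T (covariance M) against integral measurements R, with sensitivity matrix G and experimental covariance V. A minimal numpy sketch with made-up numbers, not the ADJ2010 pipeline:

```python
import numpy as np

# hypothetical prior cross sections T, prior covariance M, sensitivities G
# of two integral experiments, and experimental covariance V
T = np.array([1.00, 2.50, 0.30])                 # prior nuclear data
M = np.diag([0.05, 0.10, 0.02]) ** 2             # prior covariance
G = np.array([[0.8, 0.1, 0.5],
              [0.2, 0.9, 0.1]])                  # sensitivities of 2 experiments
V = np.diag([0.01, 0.02]) ** 2                   # experimental covariance
R = np.array([1.02, 2.31])                       # measured integral values

K = M @ G.T @ np.linalg.inv(G @ M @ G.T + V)     # gain matrix
T_post = T + K @ (R - G @ T)                     # adjusted cross sections
M_post = M - K @ G @ M                           # reduced posterior covariance
print(np.round(T_post, 4))
print(np.round(np.sqrt(np.diag(M_post)) / T_post * 100, 2))  # % uncertainties
```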
Yu, Ly-Mee; Chan, An-Wen; Hopewell, Sally; Deeks, Jonathan J; Altman, Douglas G
2010-05-18
Objectives To evaluate the use and reporting of adjusted analysis in randomised controlled trials (RCTs) and compare the quality of reporting before and after the revision of the CONSORT Statement in 2001. Design Comparison of two cross sectional samples of published articles. Data Sources Journal articles indexed on PubMed in December 2000 and December 2006. Study Selection Parallel group RCTs with a full publication carried out in humans and published in English. Main outcome measures Proportion of articles that reported adjusted analysis; use of adjusted analysis; the reason for adjustment; the method of adjustment and the reporting of adjusted analysis results in the main text and abstract. Results In both cohorts, 25% of studies reported adjusted analysis (84/355 in 2000 vs 113/422 in 2006). Compared with articles reporting only unadjusted analyses, articles that reported adjusted analyses were more likely to specify primary outcomes, involve multiple centers, perform stratified randomization, be published in general medical journals, and recruit larger sample sizes. In both years a minority of articles explained why and how covariates were selected for adjustment (20% to 30%). Almost all articles specified the statistical methods used for adjustment (99% in 2000 vs 100% in 2006) but only 5% and 10%, respectively, reported both adjusted and unadjusted results as recommended in the CONSORT guidelines. Conclusion There was no evidence of change in the reporting of adjusted analysis results five years after the revision of the CONSORT Statement and only a few articles adhered fully to the CONSORT recommendations. PMID:20482769
Shan, Na; Xu, Ping-Feng
2016-11-01
In randomized trials with noncompliance, causal effects cannot be identified without strong assumptions. Therefore, several authors have considered bounds on the causal effects. Applying an idea of VanderWeele, Chiba gave bounds on the average causal effects in randomized trials with noncompliance using the information on the randomized assignment, the treatment received, and the outcome under monotonicity assumptions about covariates, but he did not consider any observed covariates. If there are observed covariates such as age, gender, and race in a trial, we propose new bounds using the observed covariate information under monotonicity assumptions similar to those of VanderWeele and Chiba, and we compare the three bounds in a real example. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Palmiotti, Giuseppe; Salvatores, Massimo; Aliberti, G.
2015-01-01
In order to provide useful feedback to evaluators, a set of criteria is established for assessing the robustness and reliability of cross section adjustments that make use of integral experiment information. Criteria are also provided for accepting the “a posteriori” cross sections, both as new “nominal” values and as “trends”. Some indications on the use of the “a posteriori” covariance matrix are given, even though more investigation is needed to settle this complex subject.
Chang, Shu-Hui; Tzeng, Shinn-Jia
2006-03-01
In follow-up studies, survival data often include subjects who have had a certain event at recruitment and may potentially experience a series of subsequent events during the follow-up period. This kind of survival data, collected under a cross-sectional sampling criterion, is called truncated serial event data. The outcome variables of interest in this paper are serial sojourn times between successive events. To analyze the sojourn times in truncated serial event data, we need to confront two potential sampling biases arising simultaneously from the sampling criterion and induced informative censoring. In this study, nonparametric estimation of the joint probability function of serial sojourn times is developed by using inverse probabilities of the truncation and censoring times as weight functions to accommodate these two sampling biases under various situations of truncation and censoring. Relevant statistical properties of the proposed estimators are also discussed. Simulation studies and two real data sets are presented to illustrate the proposed methods.
ERIC Educational Resources Information Center
Furtwengler, Scott R.
2015-01-01
The present study sought to determine the extent to which participation in a post-secondary honors program affected academic achievement. Archival data were collected on three cohorts of high-achieving students at a large public university. Propensity scores were calculated on factors predicting participation in honors and used as the covariate.…
ERIC Educational Resources Information Center
Steiner, Peter M.; Cook, Thomas D.; Shadish, William R.
2009-01-01
This paper investigates how bias reduction was affected when different degrees of measurement error were systematically introduced into the measures constituting the final estimated propensity score (PS), the PS only for the set of effective covariates and the PS only for the ineffective ones. Since there was already some error in the Shadish et…
Doubly Robust-Type Estimation for Covariate Adjustment in Latent Variable Modeling
ERIC Educational Resources Information Center
Hoshino, Takahiro
2007-01-01
Due to the difficulty in achieving a random assignment, a quasi-experimental or observational study design is frequently used in the behavioral and social sciences. If a nonrandom assignment depends on the covariates, multiple group structural equation modeling, that includes the regression function of the dependent variables on the covariates…
Chaste, Pauline; Klei, Lambertus; Sanders, Stephan J.; Murtha, Michael T.; Hus, Vanessa; Lowe, Jennifer K.; Willsey, A. Jeremy; Moreno-De-Luca, Daniel; Yu, Timothy W.; Fombonne, Eric; Geschwind, Daniel; Grice, Dorothy E.; Ledbetter, David H.; Lord, Catherine; Mane, Shrikant M.; Martin, Christa Lese; Martin, Donna M.; Morrow, Eric M.; Walsh, Christopher A.; Sutcliffe, James S.; State, Matthew W.; Devlin, Bernie; Cook, Edwin H.; Kim, Soo-Jeong
2013-01-01
BACKGROUND Brain development follows a different trajectory in children with Autism Spectrum Disorders (ASD) than in typically developing children. A proxy for neurodevelopment could be head circumference (HC), but studies assessing HC and its clinical correlates in ASD have been inconsistent. This study investigates HC and clinical correlates in the Simons Simplex Collection cohort. METHODS We used a mixed linear model to estimate effects of covariates and the deviation from the expected HC given parental HC (genetic deviation). After excluding individuals with incomplete data, 7225 individuals in 1891 families remained for analysis. We examined the relationship between HC/genetic deviation of HC and clinical parameters. RESULTS Gender, age, height, weight, genetic ancestry and ASD status were significant predictors of HC (estimate of the ASD effect=0.2cm). HC was approximately normally distributed in probands and unaffected relatives, with only a few outliers. Genetic deviation of HC was also normally distributed, consistent with a random sampling of parental genes. Whereas larger HC than expected was associated with ASD symptom severity and regression, IQ decreased with the absolute value of the genetic deviation of HC. CONCLUSIONS Measured against expected values derived from covariates of ASD subjects, statistical outliers for HC were uncommon. HC is a strongly heritable trait and population norms for HC would be far more accurate if covariates including genetic ancestry, height and age were taken into account. The association of diminishing IQ with absolute deviation from predicted HC values suggests HC could reflect subtle underlying brain development and warrants further investigation. PMID:23746936
Huo, Yuankai; Aboud, Katherine; Kang, Hakmook; Cutting, Laurie E.; Landman, Bennett A.
2016-01-01
Understanding brain volumetry is essential to understand neurodevelopment and disease. Historically, age-related changes have been studied in detail for specific age ranges (e.g., early childhood, teen, young adults, elderly, etc.) or more sparsely sampled for wider considerations of lifetime aging. Recent advancements in data sharing and robust processing have made available considerable quantities of brain images from normal, healthy volunteers. However, existing analysis approaches have had difficulty addressing (1) complex volumetric developments on the large cohort across the life time (e.g., beyond cubic age trends), (2) accounting for confound effects, and (3) maintaining an analysis framework consistent with the general linear model (GLM) approach pervasive in neuroscience. To address these challenges, we propose to use covariate-adjusted restricted cubic spline (C-RCS) regression within a multi-site cross-sectional framework. This model allows for flexible consideration of non-linear age-associated patterns while accounting for traditional covariates and interaction effects. As a demonstration of this approach on lifetime brain aging, we derive normative volumetric trajectories and 95% confidence intervals from 5111 healthy patients from 64 sites while accounting for confounding sex, intracranial volume and field strength effects. The volumetric results are shown to be consistent with traditional studies that have explored more limited age ranges using single-site analyses. This work represents the first integration of C-RCS with neuroimaging and the derivation of structural covariance networks (SCNs) from a large study of multi-site, cross-sectional data. PMID:28191550
Disability as a covariate in risk adjustment models for predicting hospital deaths.
Iezzoni, Lisa I
2014-01-01
Risk-adjusted hospital mortality rates are frequently used as putative indicators of hospital quality. These figures could become increasingly important as efforts escalate to contain U.S. health care costs while simultaneously maintaining or improving quality of care. Most risk adjustment methods today employ coded diagnostic information, sometimes supplemented with more detailed clinical data obtained from medical records. This article considers whether risk-adjusted hospital mortality rates should account for baseline patient disability. Accounting for baseline disability when calculating hospital mortality rates makes clinical sense, especially for conditions such as heart failure or coronary artery bypass grafting surgery, where patients' cardiac-related functional status strongly predicts their imminent outcomes. A small body of research suggests the strength of disability in predicting hospital mortality, even in comparison with indicators of acute physiologic status and comorbid illness. However, obtaining complete and accurate data on patients' baseline disability will be challenging and requires further investigation. A risk of not adjusting for baseline disability is that physicians and hospitals may seek to avoid treating patients with significant disabilities. Copyright © 2014 Elsevier Inc. All rights reserved.
Adjusting for population shifts and covariates in space-time interaction tests.
Schmertmann, Carl P
2015-09-01
Statistical tests for epidemic patterns use measures of space-time event clustering, and look for high levels of clustering that are unlikely to appear randomly if events are independent. Standard approaches, such as Knox's (1964, Applied Statistics 13, 25-29) test, are biased when the spatial distribution of population changes over time, or when there is space-time interaction in important background variables. In particular, the Knox test is too sensitive to coincidental event clusters in such circumstances, and too likely to raise false alarms. Kulldorff and Hjalmars (1999, Biometrics 55, 544-552) proposed a variant of Knox's test to control for bias caused by population shifts. In this article, I demonstrate that their test is also generally biased, in an unknown direction. I suggest an alternative approach that accounts for exposure shifts while also conditioning on the observed spatial and temporal margins of event counts, as in the original Knox test. The new approach uses Metropolis sampling of permutations, and is unbiased under more general conditions. I demonstrate how the new method can also include controls for the clustering effects of covariates. © 2015, The International Biometric Society.
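The flavor of the approach can be conveyed by a plain permutation version of the Knox test, which conditions on the observed spatial and temporal margins by shuffling event times; the paper's Metropolis sampler generalizes this to handle exposure shifts and covariates. A minimal sketch with hypothetical thresholds and data:

```python
import numpy as np

def knox_stat(xy, t, ds, dt):
    """Count pairs of events within distance ds in space and dt in time."""
    d_space = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    d_time = np.abs(t[:, None] - t[None, :])
    close = (d_space < ds) & (d_time < dt)
    iu = np.triu_indices(len(t), k=1)        # each pair counted once
    return close[iu].sum()

rng = np.random.default_rng(7)
xy = rng.uniform(0, 10, size=(150, 2))       # hypothetical event locations
t = rng.uniform(0, 365, size=150)            # hypothetical event days

obs = knox_stat(xy, t, ds=1.0, dt=14.0)
null = np.array([knox_stat(xy, rng.permutation(t), 1.0, 14.0)
                 for _ in range(999)])       # permute times, keep both margins
p = (1 + (null >= obs).sum()) / (1 + len(null))
print(obs, round(p, 3))
```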
Covariate-adjusted borrowing of historical control data in randomized clinical trials.
Han, Baoguang; Zhan, Jia; John Zhong, Z; Liu, Dawei; Lindborg, Stacy
2017-07-01
The borrowing of historical control data can be an efficient way to improve the treatment effect estimate of the current control group in a randomized clinical trial. When the historical and current control data are consistent, the borrowing of historical data can increase power and reduce Type I error rate. However, when these 2 sources of data are inconsistent, it may result in a combination of biased estimates, reduced power, and inflation of Type I error rate. In some situations, inconsistency between historical and current control data may be caused by a systematic variation in the measured baseline prognostic factors, which can be appropriately addressed through statistical modeling. In this paper, we propose a Bayesian hierarchical model that can incorporate patient-level baseline covariates to enhance the appropriateness of the exchangeability assumption between current and historical control data. The performance of the proposed method is shown through simulation studies, and its application to a clinical trial design for amyotrophic lateral sclerosis is described. The proposed method is developed for scenarios involving multiple imbalanced prognostic factors and thus has meaningful implications for clinical trials evaluating new treatments for heterogeneous diseases such as amyotrophic lateral sclerosis. Copyright © 2017 John Wiley & Sons, Ltd.
Neumann, Anke; Billionnet, Cécile
2016-06-01
In observational studies without random assignment of the treatment, the unadjusted comparison between treatment groups may be misleading due to confounding. One method to adjust for measured confounders is inverse probability of treatment weighting. This method can also be used in the analysis of time to event data with competing risks. Competing risks arise if for some individuals the event of interest is precluded by a different type of event occurring before, or if only the earliest of several times to event, corresponding to different event types, is observed or is of interest. In the presence of competing risks, time to event data are often characterized by cumulative incidence functions, one for each event type of interest. We describe the use of inverse probability of treatment weighting to create adjusted cumulative incidence functions. This method is equivalent to direct standardization when the weight model is saturated. No assumptions about the form of the cumulative incidence functions are required. The method allows studying associations between treatment and the different types of event under study, while focusing on the earliest event only. We present a SAS macro implementing this method and we provide a worked example. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
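A minimal sketch of the weighting step, ignoring censoring for brevity: estimate propensity scores, form inverse probability of treatment weights, and compute a weighted cumulative incidence for one event type in the presence of a competing type. The paper's SAS macro handles censored data properly; this only shows the core idea on hypothetical data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 4000
x = rng.normal(size=(n, 2))                       # measured confounders
pz = 1 / (1 + np.exp(-(0.5 * x[:, 0] - 0.5 * x[:, 1])))
z = rng.binomial(1, pz)                           # non-randomized treatment
t = rng.exponential(10 / (1 + 0.5 * z), n)        # time to the first event
cause = 1 + rng.binomial(1, 0.3, n)               # event type 1 or 2 (competing)

ps = LogisticRegression().fit(x, z).predict_proba(x)[:, 1]
w = z / ps + (1 - z) / (1 - ps)                   # IPT weights

def weighted_cif(grid, t, cause, w, mask, k=1):
    """Weighted cumulative incidence of cause k (no censoring assumed)."""
    return np.array([np.sum(w[mask & (t <= u) & (cause == k)]) / np.sum(w[mask])
                     for u in grid])

grid = np.linspace(0, 20, 5)
print(np.round(weighted_cif(grid, t, cause, w, z == 1), 3))
print(np.round(weighted_cif(grid, t, cause, w, z == 0), 3))
```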
Fecteau, Gilles; Arsenault, Julie; Paré, Julie; Van Metre, David C; Holmberg, Charles A; Smith, Bradford P
2013-04-01
The objective of this study was to develop prediction models for the serum IgG concentration in critically ill calves based on indirect assays and to assess whether the predictive ability of the models could be improved by inclusion of age, clinical covariates, and/or laboratory covariates. Seventy-eight critically ill calves between 1 and 13 days old were selected from 1 farm. Statistical models to predict IgG concentration from the results of the radial immunodiffusion test, the gold standard, were built as a function of indirect assays of serum and plasma protein concentrations, zinc sulfate (ZnSO4) turbidity and transmittance, and serum γ-glutamyl transferase (GGT) activity. For each assay 4 models were built: without covariates, with age, with age and clinical covariates (infection and dehydration status), and with age and laboratory covariates (fibrinogen concentration and packed cell volume). For the protein models, dehydration status (clinical model) and fibrinogen concentration (laboratory model) were selected for inclusion owing to their statistical significance. These variables increased the coefficient of determination (R²) of the models by ≥ 7% but did not significantly improve the sensitivity or specificity of the models to predict passive transfer with a cutoff IgG concentration of 1000 mg/dL. For the GGT assay, including age as a covariate increased the R² of the model by 3%. For the ZnSO4 turbidity test, none of the covariates were statistically significant. Overall, the R² of the models ranged from 34% to 62%. This study has provided insight into the importance of adjusting for covariates when using indirect assays to predict IgG concentration in critically ill calves. Results also indicate that ZnSO4 transmittance and turbidity assays could be used advantageously in a field setting.
Nonparametric Combinatorial Sequence Models
NASA Astrophysics Data System (ADS)
Wauthier, Fabian L.; Jordan, Michael I.; Jojic, Nebojsa
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This paper presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three sequence datasets which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution induced by the prior. By integrating out the posterior our method compares favorably to leading binding predictors.
Sequential BART for imputation of missing covariates
Xu, Dandan; Daniels, Michael J.; Winterstein, Almut G.
2016-01-01
To conduct comparative effectiveness research using electronic health records (EHR), many covariates are typically needed to adjust for selection and confounding biases. Unfortunately, it is typical to have missingness in these covariates. Just using cases with complete covariates will result in considerable efficiency losses and likely bias. Here, we consider the covariates missing at random with missing data mechanism either depending on the response or not. Standard methods for multiple imputation can either fail to capture nonlinear relationships or suffer from the incompatibility and uncongeniality issues. We explore a flexible Bayesian nonparametric approach to impute the missing covariates, which involves factoring the joint distribution of the covariates with missingness into a set of sequential conditionals and applying Bayesian additive regression trees to model each of these univariate conditionals. Using data augmentation, the posterior for each conditional can be sampled simultaneously. We provide details on the computational algorithm and make comparisons to other methods, including parametric sequential imputation and two versions of multiple imputation by chained equations. We illustrate the proposed approach on EHR data from an affiliated tertiary care institution to examine factors related to hyperglycemia. PMID:26980459
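A minimal sketch of the sequential-conditionals idea, with scikit-learn's gradient boosting standing in for BART and a crude residual-resampling draw standing in for posterior sampling; this is illustrative scaffolding, not the authors' algorithm:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def sequential_impute(X, rng):
    """Factor the joint into sequential conditionals: impute columns left to
    right, fitting each conditional on complete cases and drawing missing
    values as prediction plus a resampled residual."""
    X = X.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        if not miss.any():
            continue
        if j == 0:
            X[miss, j] = rng.choice(X[~miss, j], miss.sum())
            continue
        model = GradientBoostingRegressor().fit(X[~miss, :j], X[~miss, j])
        resid = X[~miss, j] - model.predict(X[~miss, :j])
        X[miss, j] = model.predict(X[miss, :j]) + rng.choice(resid, miss.sum())
    return X

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = np.sin(x1) + rng.normal(0, 0.3, n)      # nonlinear dependence on x1
x3 = x1 * x2 + rng.normal(0, 0.3, n)         # interaction, hard for linear MI
X = np.column_stack([x1, x2, x3])
for j in (1, 2):                             # 15% missingness in two covariates
    X[rng.random(n) < 0.15, j] = np.nan

X_imp = sequential_impute(X, rng)
print(np.isnan(X_imp).sum())                 # 0 -> all covariates imputed
```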
An enhanced nonparametric streamflow disaggregation model with genetic algorithm
NASA Astrophysics Data System (ADS)
Lee, T.; Salas, J. D.; Prairie, J.
2010-08-01
Stochastic streamflow generation is generally utilized for planning and management of water resources systems. For this purpose, a number of parametric and nonparametric models have been suggested in the literature. Among them, temporal and spatial disaggregation approaches play an important role, particularly to make sure that historical variance-covariance properties are preserved at various temporal and spatial scales. In this paper, we review the underlying features of existing nonparametric disaggregation methods, identify some of their pros and cons, and propose a disaggregation algorithm that is capable of surmounting some of the shortcomings of the current models. The proposed models hinge on k-nearest neighbor resampling, an accurate adjusting procedure, and a genetic algorithm. The models have been tested and compared to an existing nonparametric disaggregation approach using data of the Colorado River system. It has been shown that the model is capable of (1) reproducing the season-to-season correlations, including the correlation between the last season of the previous year and the first season of the current year, (2) minimizing or avoiding the generation of flow patterns across the year that are literally the same as those of the historical records, and (3) minimizing or avoiding the generation of negative flows. In addition, it is applicable to intermittent river regimes.
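The resampling core of such models is straightforward to sketch: for each simulated annual flow, pick one of the k most similar historical years (with weights favouring closer neighbours) and rescale its seasonal pattern so the seasons sum exactly to the simulated annual value, which is the role of the adjusting step. The genetic-algorithm refinement is omitted and all data are hypothetical:

```python
import numpy as np

def knn_disaggregate(annual_sim, annual_hist, seasonal_hist, k=5, rng=None):
    """Disaggregate simulated annual flows into seasonal flows by resampling
    the seasonal proportions of one of the k most similar historical years."""
    rng = rng or np.random.default_rng()
    out = np.empty((len(annual_sim), seasonal_hist.shape[1]))
    w = 1.0 / np.arange(1, k + 1)            # weights favour nearer neighbours
    w /= w.sum()
    for i, a in enumerate(annual_sim):
        nn = np.argsort(np.abs(annual_hist - a))[:k]
        pick = rng.choice(nn, p=w)
        pattern = seasonal_hist[pick] / annual_hist[pick]
        out[i] = a * pattern                 # seasons sum exactly to the annual flow
    return out

rng = np.random.default_rng(42)
seasonal_hist = rng.gamma(2.0, 50.0, size=(60, 4))   # hypothetical 60-year record
annual_hist = seasonal_hist.sum(axis=1)
annual_sim = rng.normal(annual_hist.mean(), annual_hist.std(), 10)
seasons = knn_disaggregate(annual_sim, annual_hist, seasonal_hist, rng=rng)
print(np.allclose(seasons.sum(axis=1), annual_sim))  # True
```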
Huang, Lan; Tiwari, Ram C; Pickle, Linda W; Zou, Zhaohui
2010-10-15
In the field of cluster detection, a weighted normal model-based scan statistic was recently developed to analyze regional continuous data and to evaluate the clustering pattern of pre-defined cells (such as state, county, tract, school, hospital) that include many individuals. The continuous measures of interest are, for example, the survival rate, mortality rate, length of physical activity, or the obesity measure, namely, body mass index, at the cell level with an uncertainty measure for each cell. In this paper, we extend the method to search for clusters of the cells after adjusting for single/multiple categorical/continuous covariates. We apply the proposed method to 1999-2003 obesity data in the United States (US) collected by CDC's Behavioral Risk Factor Surveillance System with adjustment for age and race, and to 1999-2003 lung cancer age-adjusted mortality data by gender in the United States from the Surveillance Epidemiology and End Results (SEER Program) with adjustment for smoking and income.
NASA Astrophysics Data System (ADS)
Palmiotti, Giuseppe; Salvatores, Massimo; Hursin, Mathieu; Kodeli, Ivo; Gabrielli, Fabrizio; Hummel, Andrew
2017-09-01
A critical examination of the role of uncertainty assessment, target accuracies, the role of integral experiments for validation and, consequently, of data adjustment methods has been underway for several years at the OECD-NEA. The objective is to provide criteria and practical approaches for using the results of sensitivity analyses and cross section adjustments effectively, in order to give evaluators and experimentalists unambiguous feedback that improves the knowledge of neutron cross sections, uncertainties, and correlations to be used in a wide range of applications, and to meet new requirements and constraints for innovative reactor and fuel cycle system design. An approach is described that expands as much as possible the use in the adjustment procedure of selected integral experiments that provide information on "elementary" phenomena, on separated individual physics effects related to specific isotopes, or on specific energy ranges. An application to a large experimental database has been performed and the results are discussed in the perspective of new evaluation projects like the CIELO initiative.
Nonparametric identification experiment
NASA Technical Reports Server (NTRS)
Yam, Yeung
1988-01-01
The following constitutes a summary of this paper: on-orbit identification methodology starts with nonparametric techniques for a priori system identification; development of the nonparametric identification and model determination experiment software has been completed; the validation experiments to be performed on the JPL Control and Identification Technology Validation Laboratory have been designed.
Nonparametric Diagnostic Test for Conditional Logistic Regression
Goodman, Melody S.; Li, Yi
2012-01-01
The use of conditional logistic regression models to analyze matched case-control data has become standard in statistical analysis. However, methods to test the fit of these models have primarily focused on influential observations and the presence of outliers, while little attention has been given to the functional form of the covariates. In this paper we present methods, based on nonparametric smoothers, to test the functional form of the covariates in the conditional logistic regression model. We assess the performance of the proposed methods via simulation studies and illustrate their use on data from a community-based intervention. PMID:23869287
Zhong, Sheng; McPeek, Mary Sara
2016-01-01
We consider the problem of genetic association testing of a binary trait in a sample that contains related individuals, where we adjust for relevant covariates and allow for missing data. We propose CERAMIC, an estimating equation approach that can be viewed as a hybrid of logistic regression and linear mixed-effects model (LMM) approaches. CERAMIC extends the recently proposed CARAT method to allow samples with related individuals and to incorporate partially missing data. In simulations, we show that CERAMIC outperforms existing LMM and generalized LMM approaches, maintaining high power and correct type 1 error across a wider range of scenarios. CERAMIC results in a particularly large power increase over existing methods when the sample includes related individuals with some missing data (e.g., when some individuals with phenotype and covariate information have missing genotype), because CERAMIC is able to make use of the relationship information to incorporate partially missing data in the analysis while correcting for dependence. Because CERAMIC is based on a retrospective analysis, it is robust to misspecification of the phenotype model, resulting in better control of type 1 error and higher power than that of prospective methods, such as GMMAT, when the phenotype model is misspecified. CERAMIC is computationally efficient for genomewide analysis in samples of related individuals of almost any configuration, including small families, unrelated individuals and even large, complex pedigrees. We apply CERAMIC to data on type 2 diabetes (T2D) from the Framingham Heart Study. In a genome scan, 9 of the 10 smallest CERAMIC p-values occur in or near either known T2D susceptibility loci or plausible candidates, verifying that CERAMIC is able to home in on the important loci in a genome scan. PMID:27695091
TGDA: Nonparametric Discriminant Analysis
ERIC Educational Resources Information Center
Pohl, Norval F.; Bruno, Albert V.
1976-01-01
A computer program for two-group nonparametric discriminant analysis is presented. Based on Bayes' Theorem for probability revision, the statistical rationale for this program uses the calculation of maximum likelihood estimates of group membership. The program compares the Bayesian procedure to the standard Linear Discriminant Function.…
Nonparametric conditional estimation
Owen, A.B.
1987-01-01
Many nonparametric regression techniques (such as kernels, nearest neighbors, and smoothing splines) estimate the conditional mean of Y given X = x by a weighted sum of observed Y values, where observations with X values near x tend to have larger weights. In this report the weights are taken to represent a finite signed measure on the space of Y values. This measure is studied as an estimate of the conditional distribution of Y given X = x. From estimates of the conditional distribution, estimates of conditional means, standard deviations, quantiles and other statistical functionals may be computed. Chapter 1 illustrates the computation of conditional quantiles and conditional survival probabilities on the Stanford Heart Transplant data. Chapter 2 contains a survey of nonparametric regression methods and introduces statistical metrics and von Mises' method for later use. Chapter 3 proves some consistency results. Chapter 4 provides conditions under which the suitably normalized errors in estimating the conditional distribution of Y have a Brownian limit. Using von Mises' method, asymptotic normality is obtained for nonparametric conditional estimates of compactly differentiable statistical functionals.
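A minimal sketch of the construction: Gaussian kernel weights in X define a probability measure on the observed Y values (nonnegative here, though the report allows signed measures), from which conditional means, quantiles, and other functionals follow. Data and bandwidth are illustrative:

```python
import numpy as np

def conditional_weights(x_obs, x0, h):
    """Gaussian kernel weights: the estimated conditional distribution of Y
    given X = x0 places mass w_i on the observed y_i."""
    w = np.exp(-0.5 * ((x_obs - x0) / h) ** 2)
    return w / w.sum()

def weighted_quantile(y, w, q):
    order = np.argsort(y)
    cw = np.cumsum(w[order])
    return y[order][np.searchsorted(cw, q)]

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 800)
y = np.sin(x) + rng.normal(0, 0.3, 800)

w = conditional_weights(x, x0=3.0, h=0.5)
print(round(np.sum(w * y), 3))                   # conditional mean at x0
print(round(weighted_quantile(y, w, 0.5), 3))    # conditional median at x0
```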
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
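A minimal sketch of the copula construction on hypothetical data: map the series through its empirical cdf to normal scores, fit normal-theory AR(1) dynamics there, and invert the marginal so simulations retain the non-Gaussian limiting distribution. The paper's nonparametric Bayesian prior for the marginal and its GARCH extension are omitted:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(11)
n = 2000
eps = rng.gamma(1.0, 1.0, n)                 # right-skewed innovations
y = np.empty(n); y[0] = eps[0]
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + eps[t]           # stationary, skewed, autocorrelated

# 1) marginal: empirical cdf -> Gaussian scores
u = (np.argsort(np.argsort(y)) + 1) / (n + 1)
z = norm.ppf(u)

# 2) dynamics on the Gaussian scale: AR(1) coefficient
phi = np.corrcoef(z[:-1], z[1:])[0, 1]

# 3) simulate Gaussian AR(1), then invert the marginal
z_sim = np.empty(n); z_sim[0] = rng.normal()
innov = rng.normal(0.0, np.sqrt(1 - phi ** 2), n)
for t in range(1, n):
    z_sim[t] = phi * z_sim[t - 1] + innov[t]
y_sim = np.quantile(y, norm.cdf(z_sim))      # inverse empirical cdf

print(round(phi, 2))
print(np.round(np.quantile(y, [0.5, 0.95]), 2),
      np.round(np.quantile(y_sim, [0.5, 0.95]), 2))
```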
Bias associated with using the estimated propensity score as a regression covariate.
Hade, Erinn M; Lu, Bo
2014-01-15
The use of propensity score methods to adjust for selection bias in observational studies has become increasingly popular in public health and medical research. A substantial portion of studies using propensity score adjustment treat the propensity score as a conventional regression predictor. Through a Monte Carlo simulation study, Austin and colleagues investigated the bias associated with treatment effect estimation when the propensity score is used as a covariate in nonlinear regression models, such as logistic regression and Cox proportional hazards models. We show that the bias exists even in a linear regression model when the estimated propensity score is used and derive the explicit form of the bias. We also conduct an extensive simulation study to compare the performance of such covariate adjustment with propensity score stratification, propensity score matching, the inverse probability of treatment weighting method, and nonparametric functional estimation using splines. The simulation scenarios are designed to reflect real data analysis practice. Instead of specifying a known parametric propensity score model, we generate the data by considering various degrees of overlap of the covariate distributions between treated and control groups. Propensity score matching excels when the treated group is contained within a larger control pool, while the model-based adjustment may have an edge when treated and control groups do not have too much overlap. Overall, adjusting for the propensity score through stratification or matching followed by regression, or using splines, appears to be a good practical strategy.
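The linear-model bias derived in the paper is easy to exhibit by simulation: the estimated propensity score is a monotone but nonlinear function of the confounder, so entering it linearly in the outcome regression leaves residual confounding. A minimal sketch with hypothetical data-generating values:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
true_effect, reps, est = 1.0, 200, []
for _ in range(reps):
    n = 1000
    x = rng.normal(size=n)
    pz = 1 / (1 + np.exp(-2.0 * x))            # strong confounding
    z = rng.binomial(1, pz)
    y = true_effect * z + 1.0 * x + rng.normal(size=n)
    ps = LogisticRegression().fit(x[:, None], z).predict_proba(x[:, None])[:, 1]
    # regress y on treatment and the estimated PS as an ordinary covariate
    X = np.column_stack([np.ones(n), z, ps])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    est.append(beta[1])
print(round(np.mean(est) - true_effect, 3))    # nonzero -> residual bias
```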
Nonparametric triple collocation
USDA-ARS?s Scientific Manuscript database
Triple collocation derives variance-covariance relationships between three or more independent measurement sources and an indirectly observed truth variable in the case where the measurement operators are linear-Gaussian. We generalize that theory to arbitrary observation operators by deriving nonpa...
ERIC Educational Resources Information Center
Zwick, Rebecca
1985-01-01
Describes how the test statistic for nonparametric one-way multivariate analysis of variance can be obtained by submitting the data to a packaged computer program. Monte Carlo evidence indicates that the nonparametric approach is advantageous under certain violations of the assumptions of multinormality and homogeneity of covariance matrices.…
A Bayesian nonparametric meta-analysis model.
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G
2015-03-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall effect size, such models may be adequate, but for prediction, they surely are not if the effect-size distribution exhibits non-normal behavior. To address this issue, we propose a Bayesian nonparametric meta-analysis model, which can describe a wider range of effect-size distributions, including unimodal symmetric distributions, as well as skewed and more multimodal distributions. We demonstrate our model through the analysis of real meta-analytic data arising from behavioral-genetic research. We compare the predictive performance of the Bayesian nonparametric model against various conventional and more modern normal fixed-effects and random-effects models. Copyright © 2014 John Wiley & Sons, Ltd.
Bayesian Nonparametric Models for Multiway Data Analysis.
Xu, Zenglin; Yan, Feng; Qi, Yuan
2015-02-01
Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition methods and blockmodels.
Nonparametric Bayes analysis of social science data
NASA Astrophysics Data System (ADS)
Kunihama, Tsuyoshi
Social science data often contain complex characteristics that standard statistical methods fail to capture. Social surveys assign many questions to respondents, which often consist of mixed-scale variables. Each of the variables can follow a complex distribution outside parametric families, and associations among variables may have more complicated structures than standard linear dependence. Therefore, it is not straightforward to develop a statistical model which can approximate structures well in the social science data. In addition, many social surveys have collected data over time and therefore we need to incorporate dynamic dependence into the models. Also, it is standard to observe a massive number of missing values in social science data. To address these challenging problems, this thesis develops flexible nonparametric Bayesian methods for the analysis of social science data. Chapter 1 briefly explains backgrounds and motivations of the projects in the following chapters. Chapter 2 develops a nonparametric Bayesian model of temporal dependence in large sparse contingency tables, relying on a probabilistic factorization of the joint pmf. Chapter 3 proposes nonparametric Bayes inference on conditional independence, with conditional mutual information used as a measure of the strength of conditional dependence. Chapter 4 proposes a novel Bayesian density estimation method for social surveys with complex designs where there is a gap between sample and population. We correct for the bias by adjusting mixture weights in Bayesian mixture models. Chapter 5 develops a nonparametric model for mixed-scale longitudinal surveys, in which various types of variables can be induced through latent continuous variables and dynamic latent factors lead to flexibly time-varying associations among variables.
Agogo, George O.
2017-01-01
Measurement error in exposure variables is a serious impediment in epidemiological studies that relate exposures to health outcomes. In nutritional studies, interest could be in the association between long-term dietary intake and disease occurrence. Long-term intake is usually assessed with a food frequency questionnaire (FFQ), which is prone to recall bias. Measurement error in FFQ-reported intakes leads to bias in the parameter estimate that quantifies the association. To adjust for bias in the association, a calibration study is required to obtain unbiased intake measurements using a short-term instrument such as 24-hour recall (24HR). The 24HR intakes are used as the response in regression calibration to adjust for bias in the association. For foods not consumed daily, 24HR-reported intakes are usually characterized by excess zeroes, right skewness, and heteroscedasticity, posing a serious challenge in regression calibration modeling. We proposed a zero-augmented calibration model to adjust for measurement error in reported intake, while handling excess zeroes, skewness, and heteroscedasticity simultaneously without transforming 24HR intake values. We compared the proposed calibration method with the standard method and with methods that ignore measurement error by estimating long-term intake with 24HR and FFQ-reported intakes. The comparison was done in real and simulated datasets. With the 24HR, the mean increase in mercury level per ounce fish intake was about 0.4; with the FFQ intake, the increase was about 1.2. With both calibration methods, the mean increase was about 2.0. A similar trend was observed in the simulation study. In conclusion, the proposed calibration method performs at least as well as the standard method. PMID:27704599
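A minimal sketch of a zero-augmented (two-part) calibration on hypothetical data: a logistic model for the probability of any consumption, multiplied by a regression for the positive 24HR amounts, both driven by the FFQ value, yields a calibrated intake without transforming the 24HR values. The paper's handling of skewness and heteroscedasticity is richer than this sketch:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(9)
n = 1500
ffq = rng.gamma(2.0, 1.0, n)                   # error-prone FFQ intake
usual = 0.6 * ffq + rng.gamma(1.0, 0.5, n)     # latent usual intake
consumed = rng.binomial(1, 1 / (1 + np.exp(1.5 - usual)))
r24 = np.where(consumed == 1, usual * rng.lognormal(0, 0.4, n), 0.0)  # excess zeros

Xc = ffq[:, None]                               # FFQ as the calibration covariate
pos = r24 > 0
p_pos = LogisticRegression().fit(Xc, pos.astype(int)).predict_proba(Xc)[:, 1]
mean_pos = LinearRegression().fit(Xc[pos], r24[pos]).predict(Xc)

calibrated = p_pos * mean_pos                   # two-part calibrated intake
print(round(np.corrcoef(calibrated, usual)[0, 1], 2),
      round(np.corrcoef(ffq, usual)[0, 1], 2))  # tracks the latent usual intake
```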
Parametrically guided estimation in nonparametric varying coefficient models with quasi-likelihood
Davenport, Clemontina A.; Maity, Arnab; Wu, Yichao
2015-01-01
Varying coefficient models allow us to generalize standard linear regression models to incorporate complex covariate effects by modeling the regression coefficients as functions of another covariate. For nonparametric varying coefficients, we can borrow the idea of parametrically guided estimation to improve asymptotic bias. In this paper, we develop a guided estimation procedure for the nonparametric varying coefficient models. Asymptotic properties are established for the guided estimators and a method of bandwidth selection via bias-variance tradeoff is proposed. We compare the performance of the guided estimator with that of the unguided estimator via both simulation and real data examples. PMID:26146469
Partially Linear Varying Coefficient Models Stratified by a Functional Covariate
Maity, Arnab; Huang, Jianhua Z.
2012-01-01
We consider the problem of estimation in semiparametric varying coefficient models where the covariate modifying the varying coefficients is functional and is modeled nonparametrically. We develop a kernel-based estimator of the nonparametric component and a profiling estimator of the parametric component of the model and derive their asymptotic properties. Specifically, we show the consistency of the nonparametric functional estimates and derive the asymptotic expansion of the estimates of the parametric component. We illustrate the performance of our methodology using a simulation study and a real data application. PMID:22904586
Marginally specified priors for non-parametric Bayesian estimation
Kessler, David C.; Hoff, Peter D.; Dunson, David B.
2014-01-01
Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
ERIC Educational Resources Information Center
Alterman, Arthur I.; Cacciola, John S.; Habing, Brian; Lynch, Kevin G.
2007-01-01
Baseline Addiction Severity Index (5th ed.; ASI-5) data of 2,142 substance abuse patients were analyzed with two nonparametric item response theory (NIRT) methods: Mokken scaling and conditional covariance techniques. Nine reliable and dimensionally homogeneous Recent Problem indexes emerged in the ASI-5's seven areas, including two each in the…
A Simple Class of Bayesian Nonparametric Autoregression Models.
Di Lucca, Maria Anna; Guglielmi, Alessandra; Müller, Peter; Quintana, Fernando A
2013-03-01
We introduce a model for a time series of continuous outcomes that can be expressed as fully nonparametric regression or density regression on lagged terms. The model is based on a dependent Dirichlet process prior on a family of random probability measures indexed by the lagged covariates. The approach is also extended to sequences of binary responses. We discuss implementation and applications of the models to a sequence of waiting times between eruptions of the Old Faithful Geyser, and to a dataset consisting of sequences of recurrence indicators for tumors in the bladder of several patients.
Conditional Covariance-based Representation of Multidimensional Test Structure.
ERIC Educational Resources Information Center
Bolt, Daniel M.
2001-01-01
Presents a new nonparametric method for constructing a spatial representation of multidimensional test structure, the Conditional Covariance-based SCALing (CCSCAL) method. Describes an index to measure the accuracy of the representation. Uses simulation and real-life data analyses to show that the method provides a suitable approximation to…
Astronomical Methods for Nonparametric Regression
NASA Astrophysics Data System (ADS)
Steinhardt, Charles L.; Jermyn, Adam
2017-01-01
I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regression Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
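The bias of a running statistic versus a local-regression smoother is easy to see in base R (a toy benchmark, not the algorithms or data of the study; MARS would require an add-on package such as 'earth'):

    set.seed(3)
    x <- sort(runif(400, 0, 10))
    y <- sin(x) + rnorm(400, sd = 0.4)

    run_med <- runmed(y, k = 21)                  # running median, window of 21
    local   <- fitted(loess(y ~ x, span = 0.2))   # local regression

    mean((run_med - sin(x))^2)   # typically larger than ...
    mean((local   - sin(x))^2)   # ... the local-regression error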
Daye, Z. John; Xie, Jichun; Li, Hongzhe
2012-01-01
Many problems in genomics are related to variable selection, where high-dimensional genomic data are treated as covariates. Such genomic covariates often have certain structures and can be represented as vertices of an undirected graph. Biological processes also vary as functions of some biological state, such as time. High-dimensional variable selection where covariates are graph-structured and the underlying model is nonparametric presents an important but largely unaddressed statistical challenge. Motivated by the problem of regression-based motif discovery, we consider the problem of variable selection for high-dimensional nonparametric varying-coefficient models and introduce a sparse structured shrinkage (SSS) estimator based on basis function expansions and a novel smoothed penalty function. We present an efficient algorithm for computing the SSS estimator. Results on model selection consistency and estimation bounds are derived. Moreover, finite-sample performances are studied via simulations, and the effects of high-dimensionality and structural information of the covariates are especially highlighted. We apply our method to the motif-finding problem using a yeast cell-cycle gene expression dataset and word counts in genes’ promoter sequences. Our results demonstrate that the proposed method can result in better variable selection and prediction for high-dimensional regression when the underlying model is nonparametric and covariates are structured. Supplemental materials for the article are available online. PMID:22904608
Bayesian non-parametrics and the probabilistic approach to modelling
Ghahramani, Zoubin
2013-01-01
Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609
Nonparametric Independence Screening in Sparse Ultra-High Dimensional Varying Coefficient Models
Fan, Jianqing; Ma, Yunbei; Dai, Wei
2014-01-01
The varying-coefficient model is an important class of nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection arises. In this paper, we propose and investigate marginal nonparametric screening methods to screen variables in sparse ultra-high dimensional varying-coefficient models. The proposed nonparametric independence screening (NIS) selects variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable. The sure independent screening property is established under some mild technical conditions when the dimensionality is of nonpolynomial order, and the dimensionality reduction of NIS is quantified. To enhance the practical utility and finite sample performance, two data-driven iterative NIS methods are proposed for selecting thresholding parameters and variables: conditional permutation and greedy methods, resulting in Conditional-INIS and Greedy-INIS. The effectiveness and flexibility of the proposed methods are further illustrated by simulation studies and real data applications. PMID:25309009
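A caricature of marginal screening in R (ranking covariates by the fit of a marginal nonparametric regression; the actual NIS measure conditions these fits on the exposure variable and uses data-driven thresholds):

    set.seed(4)
    n <- 200; p <- 50
    X <- matrix(rnorm(n * p), n, p)
    y <- 2 * sin(X[, 3]) + X[, 7]^2 + rnorm(n)   # only covariates 3 and 7 matter

    utility <- sapply(seq_len(p), function(j) {
      fit <- loess(y ~ X[, j])
      1 - sum(residuals(fit)^2) / sum((y - mean(y))^2)   # marginal R^2
    })
    head(order(utility, decreasing = TRUE))      # 3 and 7 should rank high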
Tatarinova, Tatiana; Neely, Michael; Bartroff, Jay; van Guilder, Michael; Yamada, Walter; Bayard, David; Jelliffe, Roger; Leary, Robert; Chubatiuk, Alyona; Schumitzky, Alan
2013-04-01
Population pharmacokinetic (PK) modeling methods can be statistically classified as either parametric or nonparametric (NP). Each classification can be divided into maximum likelihood (ML) or Bayesian (B) approaches. In this paper we discuss the nonparametric case using both maximum likelihood and Bayesian approaches. We present two nonparametric methods for estimating the unknown joint population distribution of model parameter values in a pharmacokinetic/pharmacodynamic (PK/PD) dataset. The first method is the NP Adaptive Grid (NPAG). The second is the NP Bayesian (NPB) algorithm with a stick-breaking process to construct a Dirichlet prior. Our objective is to compare the performance of these two methods using a simulated PK/PD dataset. Our results showed excellent performance of NPAG and NPB in a realistically simulated PK study. This simulation allowed us to have benchmarks in the form of the true population parameters to compare with the estimates produced by the two methods, while incorporating challenges like unbalanced sample times and sample numbers as well as the ability to include the covariate of patient weight. We conclude that both NPML and NPB can be used in realistic PK/PD population analysis problems. The advantages of one versus the other are discussed in the paper. NPAG and NPB are implemented in R and freely available for download within the Pmetrics package from www.lapk.org.
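The flavor of grid-based nonparametric ML can be shown with a toy EM in R (a fixed grid and a one-dimensional normal model; NPAG itself refines the grid adaptively and works with full PK models):

    set.seed(5)
    y    <- c(rnorm(60, -2, 0.5), rnorm(40, 1, 0.5))  # two hidden subpopulations
    grid <- seq(-4, 4, by = 0.1)                      # candidate support points
    w    <- rep(1 / length(grid), length(grid))       # initial weights

    for (it in 1:200) {
      like <- outer(y, grid, function(yy, th) dnorm(yy, mean = th, sd = 0.5))
      post <- sweep(like, 2, w, "*")                  # prior-weighted likelihoods
      post <- post / rowSums(post)                    # responsibilities
      w    <- colMeans(post)                          # EM update of the weights
    }
    grid[w > 0.01]   # support concentrates near the true means -2 and 1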
Variable Selection for Nonparametric Quantile Regression via Smoothing Spline ANOVA
Lin, Chen-Yen; Bondell, Howard; Zhang, Hao Helen; Zou, Hui
2014-01-01
Quantile regression provides a more thorough view of the effect of covariates on a response. Nonparametric quantile regression has become a viable alternative to avoid restrictive parametric assumptions. The problem of variable selection for quantile regression is challenging, since important variables can influence various quantiles in different ways. We tackle the problem via regularization in the context of smoothing spline ANOVA models. The proposed sparse nonparametric quantile regression (SNQR) can identify important variables and provide flexible estimates for quantiles. Our numerical study suggests the promising performance of the new procedure in variable selection and function estimation. Supplementary materials for this article are available online. PMID:24554792
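For contrast with the parametric case, linear quantile fits at several levels take one call in R's 'quantreg' package (shown on its bundled engel data; the SNQR method itself is a penalized smoothing-spline ANOVA model with no single base-R equivalent):

    library(quantreg)
    data(engel)                    # income and food expenditure
    fit <- rq(foodexp ~ income, tau = c(0.25, 0.5, 0.75), data = engel)
    coef(fit)                      # slopes differ across quantiles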
Statistical sirens: The allure of nonparametrics
Johnson, Douglas H.
1995-01-01
Although nonparametric statistical methods have a role to play in the analysis of data, their virtues are often overstated and their deficiencies overlooked. A recent Special Feature in Ecology advocated nonparametric methods because of the erroneously stated advantage that they require no assumptions regarding the distribution underlying the observations. The present paper points out some often-ignored features of nonparametric tests comparing two means and advocates parameter estimation as a preferred alternative to hypothesis testing in many situations.
Multiatlas Segmentation as Nonparametric Regression
Awate, Suyash P.; Whitaker, Ross T.
2015-01-01
This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator’s convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
Covariate Imbalance and Precision in Measuring Treatment Effects
ERIC Educational Resources Information Center
Liu, Xiaofeng Steven
2011-01-01
Covariate adjustment can increase the precision of estimates by removing unexplained variance from the error in randomized experiments, although chance covariate imbalance tends to counteract the improvement in precision. The author develops an easy measure to examine chance covariate imbalance in randomization by standardizing the average…
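A common imbalance measure of this kind is the standardized mean difference of a baseline covariate between arms; a minimal R sketch (the article's exact standardization may differ):

    std_diff <- function(x, arm) {
      m <- tapply(x, arm, mean)
      v <- tapply(x, arm, var)
      unname((m[1] - m[2]) / sqrt((v[1] + v[2]) / 2))
    }
    set.seed(7)
    std_diff(rnorm(100), rep(0:1, 50))   # near 0 under good balance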
Using analysis of covariance (ANCOVA) with fallible covariates.
Culpepper, Steven Andrew; Aguinis, Herman
2011-06-01
Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but use ANCOVA anyway (and, most likely, report misleading results); (b) attempt to employ 1 of several measurement error models with the understanding that no research has examined their relative performance and with the added practical difficulty that several of these models are not available in commonly used statistical software; or (c) not use ANCOVA at all. First, we discuss analytic evidence to explain why using ANCOVA with fallible covariates produces bias and a systematic inflation of Type I error rates that may lead to the incorrect conclusion that treatment effects exist. Second, to provide a solution for this problem, we conduct 2 Monte Carlo studies to compare 4 existing approaches for adjusting treatment effects in the presence of covariate measurement error: errors-in-variables (EIV; Warren, White, & Fuller, 1974), Lord's (1960) method, Raaijmakers and Pieters's (1987) method (R&P), and structural equation modeling methods proposed by Sörbom (1978) and Hayduk (1996). Results show that EIV models are superior in terms of parameter accuracy, statistical power, and keeping Type I error close to the nominal value. Finally, we offer a program written in R that performs all needed computations for implementing EIV models so that ANCOVA can be used to obtain accurate results even when covariates are measured with error.
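A minimal sketch of one reliability-based errors-in-variables correction in R (a simple disattenuation variant with a known reliability; the EIV and SEM estimators compared in the paper are more general):

    eiv_ancova <- function(y, x, arm, rel) {
      xr    <- resid(lm(x ~ factor(arm)))      # covariate centered within arms
      yr    <- resid(lm(y ~ factor(arm)))
      b_eiv <- (cov(xr, yr) / var(xr)) / rel   # disattenuated pooled slope
      dy    <- diff(tapply(y, arm, mean))      # raw outcome difference
      dx    <- diff(tapply(x, arm, mean))      # chance covariate imbalance
      unname(dy - b_eiv * dx)                  # adjusted treatment effect
    }
    set.seed(8)
    arm <- rep(0:1, each = 100)
    tru <- rnorm(200)
    x   <- tru + rnorm(200, sd = 0.5)          # fallible covariate
    y   <- 0.5 * arm + tru + rnorm(200)
    eiv_ancova(y, x, arm, rel = 1 / 1.25)      # reliability = var(tru)/var(x)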
1982-01-01
Cornfield et al. (1959) and Dross (1966) adjust only for the unmeasured covariate u, whereas we adjust for measured covariates in addition to the unmeasured covariate u; second, Cornfield et al. (1959) and Dross (1966) only judge whether the effect of the treatment could be zero having adjusted
Nonparametric weighted feature extraction for noise whitening least squares
NASA Astrophysics Data System (ADS)
Ren, Hsuan; Chi, Wan-Wei; Pan, Yen-Nan
2007-09-01
The Least Squares (LS) approach is one of the most widely used algorithms for target detection in remote sensing images. It has been proven mathematically that Noise-Whitened Least Squares (NWLS) can outperform the original version by making the noise distribution independent and identically distributed (i.i.d.). To obtain good results, however, the estimation of the noise covariance matrix is very important and remains a great challenge. Many estimation methods have been proposed in the past. The first type of method assumes that the signal between neighboring pixels should be similar, so that the difference between neighboring pixels, or the high-frequency signal, can be used to represent noise; examples include spatial- and frequency-domain high-pass filters and neighborhood-pixel subtraction. A more practical method is based on training samples and calculates the covariance matrix between each training sample and its class mean as the noise distribution, which is the within-class scatter matrix of Fisher's Linear Discriminant Analysis. However, it is usually not easy to collect enough training samples to yield a full-rank covariance matrix. In this paper, we adopt Nonparametric Weighted Feature Extraction (NWFE) to overcome the rank problem; it is also suitable for modeling non-Gaussian noise. We also compare the results on a SPOT-5 image scene.
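The neighbor-difference idea is easy to sketch in R (synthetic multiband cube; differencing adjacent pixels cancels the smooth signal, so half the covariance of the differences estimates the band-to-band noise covariance):

    set.seed(9)
    rows <- 100; cols <- 100; bands <- 4
    signal <- outer(sin(seq(0, 3, length = rows)), cos(seq(0, 3, length = cols)))
    cube   <- array(rep(signal, bands), c(rows, cols, bands)) +
              array(rnorm(rows * cols * bands, sd = 0.1), c(rows, cols, bands))

    d <- cube[-1, , ] - cube[-rows, , ]      # vertical neighbor differences
    D <- matrix(d, ncol = bands)             # pixels x bands
    noise_cov <- cov(D) / 2                  # diagonal close to 0.1^2 = 0.01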
Bayesian inference for longitudinal data with non-parametric treatment effects.
Müller, Peter; Quintana, Fernando A; Rosner, Gary L; Maitland, Michael L
2014-04-01
We consider inference for longitudinal data based on mixed-effects models with a non-parametric Bayesian prior on the treatment effect. The proposed non-parametric Bayesian prior is a random partition model with a regression on patient-specific covariates. The main feature and motivation for the proposed model is the use of covariates with a mix of different data formats and possibly high-order interactions in the regression. The regression is not explicitly parameterized. It is implied by the random clustering of subjects. The motivating application is a study of the effect of an anticancer drug on a patient's blood pressure. The study involves blood pressure measurements taken periodically over several 24-h periods for 54 patients. The 24-h periods for each patient include a pretreatment period and several occasions after the start of therapy.
Choosing covariates in the analysis of clinical trials.
Beach, M L; Meier, P
1989-12-01
Much of the literature on clinical trials emphasizes the importance of adjusting the results for any covariates (baseline variables) for which randomization fails to produce nearly exact balance, but the literature is very nearly devoid of recipes for assessing the consequences of such adjustments. Several years ago, Paul Canner presented an approximate expression for the effect of a covariate adjustment, and he considered its use in the selection of covariates. With the aid of Canner's equation, using both formal analysis and simulation, the impact of covariate adjustment is further explored. Unless tight control over the analysis plans is established in advance, covariate adjustment can lead to seriously misleading inferences. Illustrations from the clinical trials literature are provided.
ERIC Educational Resources Information Center
Bolt, Daniel; Roussos, Louis; Stout, William
Several nonparametric dimensionality assessment tools have demonstrated the usefulness of item pair conditional covariances as building blocks for investigating multidimensional test structure. Recently, J. Zhang and W. Stout (1999) have related the structural properties of conditional covariances in a generalized compensatory framework to a test…
Nonparametric Methods in Molecular Biology
Wittkowski, Knut M.; Song, Tingting
2010-01-01
In 2003, the completion of the Human Genome Project[1] together with advances in computational resources[2] were expected to launch an era where the genetic and genomic contributions to many common diseases would be found. In the years following, however, researchers became increasingly frustrated as most reported ‘findings’ could not be replicated in independent studies[3]. To improve the signal/noise ratio, it was suggested to increase the number of cases to be included to tens of thousands[4], a requirement that would dramatically restrict the scope of personalized medicine. Similarly, there was little success in elucidating the gene–gene interactions involved in complex diseases or even in developing criteria for assessing their phenotypes. As a partial solution to these enigmata, we here introduce a class of statistical methods as the ‘missing link’ between advances in genetics and informatics. As a first step, we provide a unifying view of a plethora of non-parametric tests developed mainly in the 1940s, all of which can be expressed as u-statistics. Then, we will extend this approach to reflect categorical and ordinal relationships between variables, resulting in a flexible and powerful approach to deal with the impact of (1) multi-allelic genetic loci, (2) poly-locus genetic regions, and (3) oligo-genetic and oligo-genomic collaborative interactions on complex phenotypes. PMID:20652502
Non-Parametric Collision Probability for Low-Velocity Encounters
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell
2007-01-01
An implicit, but not necessarily obvious, assumption in all of the current techniques for assessing satellite collision probability is that the relative position uncertainty is perfectly correlated in time. If there is any mis-modeling of the dynamics in the propagation of the relative position error covariance matrix, time-wise de-correlation of the uncertainty will increase the probability of collision over a given time interval. The paper gives some examples that illustrate this point. This paper argues that, for the present, Monte Carlo analysis is the best available tool for handling low-velocity encounters, and suggests some techniques for addressing the issues just described. One proposal is for the use of a non-parametric technique that is widely used in actuarial and medical studies. The other suggestion is that accurate process noise models be used in the Monte Carlo trials to which the non-parametric estimate is applied. A further contribution of this paper is a description of how the time-wise decorrelation of uncertainty increases the probability of collision.
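The effect described can be illustrated by a small Monte Carlo sketch in R (a two-dimensional toy with invented numbers, not a real conjunction geometry):

    set.seed(10)
    trials <- 5000; steps <- 50; radius <- 0.1
    hit <- replicate(trials, {
      rel   <- rnorm(2, mean = c(1, 0), sd = 0.2)    # sampled relative position
      close <- FALSE
      for (k in 1:steps) {
        rel   <- rel + rnorm(2, sd = 0.02)           # process noise de-correlates
        close <- close || sqrt(sum(rel^2)) < radius  # the uncertainty over time
      }
      close
    })
    mean(hit)   # empirical probability of ever entering the hard-body radius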
Evaluation of the Covariance Matrix of Estimated Resonance Parameters
NASA Astrophysics Data System (ADS)
Becker, B.; Capote, R.; Kopecky, S.; Massimi, C.; Schillebeeckx, P.; Sirakov, I.; Volev, K.
2014-04-01
In the resonance region nuclear resonance parameters are mostly obtained by a least square adjustment of a model to experimental data. Derived parameters can be mutually correlated through the adjustment procedure as well as through common experimental or model uncertainties. In this contribution we investigate four different methods to propagate the additional covariance caused by experimental or model uncertainties into the evaluation of the covariance matrix of the estimated parameters: (1) including the additional covariance into the experimental covariance matrix based on calculated or theoretical estimates of the data; (2) including the uncertainty affected parameter in the adjustment procedure; (3) evaluation of the full covariance matrix by Monte Carlo sampling of the common parameter; and (4) retroactively including the additional covariance by using the marginalization procedure of Habert et al.
Elementary Estimates: An Introduction to Nonparametrics.
ERIC Educational Resources Information Center
Noether, Gottfried E.
1985-01-01
The paper presents a unified approach to some of the more popular nonparametric methods in current use, providing the reader with new insights by exhibiting relationships to relevant population parameters. (Author/LMO)
NASA Technical Reports Server (NTRS)
Hepner, T. E.; Meyers, J. F. (Inventor)
1985-01-01
A laser velocimeter covariance processor which calculates the auto covariance and cross covariance functions for a turbulent flow field based on Poisson sampled measurements in time from a laser velocimeter is described. The device will process a block of data that is up to 4096 data points in length and return a 512 point covariance function with 48-bit resolution along with a 512 point histogram of the interarrival times which is used to normalize the covariance function. The device is designed to interface and be controlled by a minicomputer from which the data is received and the results returned. A typical 4096 point computation takes approximately 1.5 seconds to receive the data, compute the covariance function, and return the results to the computer.
NON-PARAMETRIC ESTIMATION UNDER STRONG DEPENDENCE
Zhao, Zhibiao; Zhang, Yiyun; Li, Runze
2014-01-01
We study non-parametric regression function estimation for models with strong dependence. Compared with short-range dependent models, long-range dependent models often result in slower convergence rates. We propose a simple differencing-sequence based non-parametric estimator that achieves the same convergence rate as if the data were independent. Simulation studies show that the proposed method has good finite sample performance. PMID:25018572
Galilean covariant harmonic oscillator
NASA Technical Reports Server (NTRS)
Horzela, Andrzej; Kapuscik, Edward
1993-01-01
A Galilean covariant approach to the classical mechanics of a single particle is described. Within the proposed formalism, all non-covariant force laws are rejected; acting forces are instead defined covariantly by differential equations. Such an approach leads beyond standard classical mechanics and gives an example of non-Newtonian mechanics. It is shown that the exactly solvable linear system of differential equations defining forces contains the Galilean covariant description of the harmonic oscillator as a particular case. Additionally, it is demonstrated that in Galilean covariant classical mechanics the validity of Newton's second law of dynamics implies Hooke's law, and vice versa. It is shown that the kinetic and total energies transform differently with respect to Galilean transformations.
Why preferring parametric forecasting to nonparametric methods?
Jabot, Franck
2015-05-07
A recent series of papers by Charles T. Perretti and collaborators has shown that nonparametric forecasting methods can outperform parametric methods in noisy nonlinear systems. Such a situation can arise for two main reasons: the instability of parametric inference procedures in chaotic systems, which can lead to biased parameter estimates, and the discrepancy between the real system dynamics and the modeled one, a problem that Perretti and collaborators call "the true model myth". Should ecologists go on using the demanding parametric machinery when trying to forecast the dynamics of complex ecosystems? Or should they rely on the elegant nonparametric approach that appears so promising? It is argued here that ecological forecasting based on parametric models presents two key comparative advantages over nonparametric approaches. First, the likelihood of parametric forecasting failure can be diagnosed thanks to simple Bayesian model checking procedures. Second, when parametric forecasting is diagnosed to be reliable, forecasting uncertainty can be estimated on virtual data generated with the parametric model fitted to the data. In contrast, nonparametric techniques provide forecasts with unknown reliability. This argumentation is illustrated with the simple theta-logistic model that was previously used by Perretti and collaborators to make their point. It should convince ecologists to stick to standard parametric approaches, until methods have been developed to assess the reliability of nonparametric forecasting.
Nonparametric Bayesian methods for benchmark dose estimation.
Guha, Nilabja; Roy, Anindya; Kopylev, Leonid; Fox, John; Spassova, Maria; White, Paul
2013-09-01
The article proposes and investigates the performance of two Bayesian nonparametric estimation procedures in the context of benchmark dose estimation in toxicological animal experiments. The methodology is illustrated using several existing animal dose-response data sets and is compared with traditional parametric methods available in standard benchmark dose estimation software (BMDS), as well as with a published model-averaging approach and a frequentist nonparametric approach. These comparisons, together with simulation studies, suggest that the nonparametric methods provide considerable flexibility in terms of model fit and can be a very useful tool in benchmark dose estimation studies, especially when standard parametric models fail to fit the data adequately.
Pipe performance analysis with nonparametric regression
NASA Astrophysics Data System (ADS)
Liu, Zheng; Hu, Yafei; Wu, Wei
2011-04-01
Asbestos cement (AC) water mains were installed extensively in North America, Europe, and Australia during the 1920s-1980s and have been subject to a high breakage rate in recent years in some utilities. It is essential to understand how influential factors contribute to the degradation and failure of AC pipes. The historical failure data collected from twenty utilities are used in this study to explore the correlation between pipe condition and its working environment. In this paper, we applied four nonparametric regression methods to model the relationship between pipe failure, represented by average break rates, and influential variables including pipe age and internal and external working environmental parameters. Nonparametric regression models do not take a predetermined form; instead, they derive the needed information from the data. The feasibility of using a nonparametric regression model for the condition assessment of AC pipes is investigated.
A Review of DIMPACK Version 1.0: Conditional Covariance-Based Test Dimensionality Analysis Package
ERIC Educational Resources Information Center
Deng, Nina; Han, Kyung T.; Hambleton, Ronald K.
2013-01-01
DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional covariance approach is reviewed. This software was originally distributed by Assessment Systems Corporation and now can be freely accessed online. The software consists of Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC, which…
ERIC Educational Resources Information Center
Headrick, Todd C.; Vineyard, George
The Type I error and power properties of the parametric F test and three nonparametric competitors were compared in terms of 3 x 4 factorial analysis of covariance layout. The focus of the study was on the test for interaction either in the presence or absence of main effects. A variety of conditional distributions, sample sizes, levels of variate…
A Review of DIMPACK Version 1.0: Conditional Covariance-Based Test Dimensionality Analysis Package
ERIC Educational Resources Information Center
Deng, Nina; Han, Kyung T.; Hambleton, Ronald K.
2013-01-01
DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional covariance approach is reviewed. This software was originally distributed by Assessment Systems Corporation and now can be freely accessed online. The software consists of Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC, which…
Covariant mutually unbiased bases
NASA Astrophysics Data System (ADS)
Carmeli, Claudio; Schultz, Jussi; Toigo, Alessandro
2016-06-01
The connection between maximal sets of mutually unbiased bases (MUBs) in a prime-power dimensional Hilbert space and finite phase-space geometries is well known. In this article, we classify MUBs according to their degree of covariance with respect to the natural symmetries of a finite phase-space, which are the group of its affine symplectic transformations. We prove that there exist maximal sets of MUBs that are covariant with respect to the full group only in odd prime-power dimensional spaces, and in this case, their equivalence class is actually unique. Despite this limitation, we show that in dimension 2^r covariance can still be achieved by restricting to proper subgroups of the symplectic group, which constitute the finite analogues of the oscillator group. For these subgroups, we explicitly construct the unitary operators yielding the covariance.
Covariant Noncommutative Field Theory
Estrada-Jimenez, S.; Garcia-Compean, H.; Obregon, O.; Ramirez, C.
2008-07-02
The covariant approach to noncommutative field and gauge theories is revisited. In the process the formalism is applied to field theories invariant under diffeomorphisms. Local differentiable forms are defined in this context. The Lagrangian and Hamiltonian formalism is consistently introduced.
Multivariate spatial nonparametric modelling via kernel processes mixing
Fuentes, Montserrat; Reich, Brian
2013-01-01
In this paper we develop a nonparametric multivariate spatial model that avoids specifying a Gaussian distribution for spatial random effects. Our nonparametric model extends the stick-breaking (SB) prior of Sethuraman (1994), which is frequently used in Bayesian modelling to capture uncertainty in the parametric form of an outcome. The stick-breaking prior is extended here to the spatial setting by assigning each location a different, unknown distribution, and smoothing the distributions in space with a series of space-dependent kernel functions that have a space-varying bandwidth parameter. This results in a flexible non-stationary spatial model, as different kernel functions lead to different relationships between the distributions at nearby locations. This approach is the first to allow both the probabilities and the point mass values of the SB prior to depend on space. Thus, there is no need for replications and we obtain a continuous process in the limit. We extend the model to the multivariate setting by having for each process a different kernel function, but sharing the location of the kernel knots across the different processes. The resulting covariance for the multivariate process is in general nonstationary and nonseparable. The modelling framework proposed here is also computationally efficient because it avoids inverting large matrices and calculating determinants, which often hinders the spatial analysis of large data sets. We study the theoretical properties of the proposed multivariate spatial process. The methods are illustrated using simulated examples and an air pollution application to model components of fine particulate matter. PMID:24347994
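The plain (non-spatial) stick-breaking construction underlying such priors takes a few lines in R (Sethuraman's representation; the paper's space-dependent kernels and bandwidths are omitted):

    set.seed(11)
    alpha <- 1; K <- 25
    v <- rbeta(K, 1, alpha)                  # stick-breaking fractions
    w <- v * cumprod(c(1, 1 - v[-K]))        # weights: w_k = v_k * prod(1 - v_j)
    atoms <- rnorm(K)                        # point masses of the random measure
    draws <- sample(atoms, 1000, replace = TRUE, prob = w)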
Nonparametric analysis of high wind speed data
NASA Astrophysics Data System (ADS)
Francisco-Fernández, Mario; Quintela-del-Río, Alejandro
2013-01-01
In this paper, nonparametric curve estimation methods are applied to analyze time series of wind speeds, focusing on the extreme events exceeding a chosen threshold. Classical parametric statistical approaches in this context consist of fitting a generalized Pareto distribution (GPD) to the tail of the empirical cumulative distribution, using maximum likelihood or the method of moments to estimate the parameters of this distribution. Additionally, confidence intervals are usually computed to assess the uncertainty of the estimates. Nonparametric methods to directly estimate some quantities of interest, such as the probability of exceedance, the quantiles or return levels, or the return periods, are proposed. Moreover, bootstrap techniques are used to develop pointwise and simultaneous confidence intervals for these functions. The proposed models are applied to wind speed data from the Gulf Coast of the US, comparing the results with those of the GPD approach by means of a split-sample test. Results show that nonparametric methods are competitive with respect to the standard GPD approximations. The study is completed by generating synthetic data sets and comparing the behavior of the parametric and nonparametric estimates in this framework.
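The core nonparametric quantities are one-liners in R; a sketch with artificial data (the paper's data, thresholds, and simultaneous bands are not reproduced here):

    set.seed(12)
    wind <- rweibull(2000, shape = 2, scale = 8)   # stand-in for wind speeds
    u    <- 20                                     # high threshold of interest

    p_hat <- mean(wind > u)                        # empirical exceedance probability
    boot  <- replicate(2000, mean(sample(wind, replace = TRUE) > u))
    quantile(boot, c(0.025, 0.975))                # pointwise bootstrap interval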
A Comparison of Parametric versus Nonparametric Statistics.
ERIC Educational Resources Information Center
Royeen, Charlotte Brasic
In order to examine the possible effects of violation of assumptions using parametric procedures, this study is an exploratory investigation into the use of parametric versus nonparametric procedures using a multiple case study design. The case study investigation guidelines outlined by Yin served as the methodology. The following univariate…
How Are Teachers Teaching? A Nonparametric Approach
ERIC Educational Resources Information Center
De Witte, Kristof; Van Klaveren, Chris
2014-01-01
This paper examines which configuration of teaching activities maximizes student performance. For this purpose a nonparametric efficiency model is formulated that accounts for (1) self-selection of students and teachers in better schools and (2) complementary teaching activities. The analysis distinguishes both individual teaching (i.e., a…
A Bayesian Nonparametric Approach to Test Equating
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2009-01-01
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…
Covariant Bardeen perturbation formalism
NASA Astrophysics Data System (ADS)
Vitenti, S. D. P.; Falciano, F. T.; Pinto-Neto, N.
2014-05-01
In a previous work we obtained a set of necessary conditions for the linear approximation in cosmology. Here we discuss the relations of this approach with the so-called covariant perturbations. It is often argued in the literature that one of the main advantages of the covariant approach to describe cosmological perturbations is that the Bardeen formalism is coordinate dependent. In this paper we will reformulate the Bardeen approach in a completely covariant manner. For that, we introduce the notion of pure and mixed tensors, which yields an adequate language to treat both perturbative approaches in a common framework. We then stress that in the referred covariant approach, one necessarily introduces an additional hypersurface choice to the problem. Using our mixed and pure tensors approach, we are able to construct a one-to-one map relating the usual gauge dependence of the Bardeen formalism with the hypersurface dependence inherent to the covariant approach. Finally, through the use of this map, we define full nonlinear tensors that at first order correspond to the three known gauge invariant variables Φ, Ψ and Ξ, which are simultaneously foliation and gauge invariant. We then stress that the use of the proposed mixed tensors allows one to construct simultaneously gauge and hypersurface invariant variables at any order.
NASA Astrophysics Data System (ADS)
Frasinski, Leszek J.
2016-08-01
Recent technological advances in the generation of intense femtosecond pulses have made covariance mapping an attractive analytical technique. The laser pulses available are so intense that often thousands of ionisation and Coulomb explosion events will occur within each pulse. To understand the physics of these processes the photoelectrons and photoions need to be correlated, and covariance mapping is well suited for operating at the high counting rates of these laser sources. Partial covariance is particularly useful in experiments with x-ray free electron lasers, because it is capable of suppressing pulse fluctuation effects. A variety of covariance mapping methods is described: simple, partial (single- and multi-parameter), sliced, contingent and multi-dimensional. The relationship to coincidence techniques is discussed. Covariance mapping has been used in many areas of science and technology: inner-shell excitation and Auger decay, multiphoton and multielectron ionisation, time-of-flight and angle-resolved spectrometry, infrared spectroscopy, nuclear magnetic resonance imaging, stimulated Raman scattering, directional gamma ray sensing, welding diagnostics and brain connectivity studies (connectomics). This review gives practical advice for implementing the technique and interpreting the results, including its limitations and instrumental constraints. It also summarises recent theoretical studies, highlights unsolved problems and outlines a personal view on the most promising research directions.
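At its simplest, a covariance map is the sample covariance of shot-resolved spectra; an R sketch with synthetic counts (real maps add the partial-covariance corrections discussed above):

    set.seed(13)
    shots <- 1000; channels <- 64
    X <- matrix(rpois(shots * channels, lambda = 2), shots, channels)
    X[, 10] <- X[, 40] + rpois(shots, 1)   # correlated pair, mimicking fragments
                                           # produced in the same explosion event
    cmap <- cov(X)                         # the covariance map <xy> - <x><y>

    off <- cmap; diag(off) <- 0
    which(off == max(off), arr.ind = TRUE) # the (10, 40) island stands out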
Nonparametric Bayes Factors Based On Empirical Likelihood Ratios
Vexler, Albert; Deng, Wei; Wilding, Gregory E.
2012-01-01
Bayes methodology provides posterior distribution functions based on parametric likelihoods adjusted for prior distributions. A distribution-free alternative to the parametric likelihood is the use of empirical likelihood (EL) techniques, well known in the context of nonparametric testing of statistical hypotheses. Empirical likelihoods have been shown to exhibit many of the properties of conventional parametric likelihoods. In this article, we propose and examine Bayes factor (BF) methods that are derived via the EL ratio approach. Following Kass & Wasserman [10], we consider Bayes-factor-type decision rules in the context of standard statistical testing techniques. We show that the asymptotic properties of the proposed procedure are similar to the classical BF's asymptotic operating characteristics. Although we focus on hypothesis testing, the proposed approach also yields confidence interval estimators of unknown parameters. Monte Carlo simulations were conducted to evaluate the theoretical results as well as to demonstrate the power of the proposed test. PMID:23180904
Covariance Applications with Kiwi
NASA Astrophysics Data System (ADS)
Mattoon, C. M.; Brown, D.; Elliott, J. B.
2012-05-01
The Computational Nuclear Physics group at Lawrence Livermore National Laboratory (LLNL) is developing a new tool, named 'Kiwi', that is intended as an interface between the covariance data increasingly available in major nuclear reaction libraries (including ENDF and ENDL) and large-scale Uncertainty Quantification (UQ) studies. Kiwi is designed to integrate smoothly into large UQ studies, using the covariance matrix to generate multiple variations of nuclear data. The code has been tested using critical assemblies as a test case, and is being integrated into LLNL's quality assurance and benchmarking for nuclear data.
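The sampling step such a tool performs can be sketched in R (a three-group toy covariance with invented numbers, drawn via the Cholesky factor; Kiwi's actual formats and libraries are not modeled here):

    set.seed(14)
    sigma <- c(1.00, 0.80, 0.60)                     # central cross sections
    cors  <- matrix(c(1, .5, .2,
                      .5, 1, .5,
                      .2, .5, 1), 3, 3)              # assumed correlations
    sds   <- 0.05 * sigma                            # 5% uncertainties
    covm  <- cors * outer(sds, sds)                  # covariance matrix

    R <- chol(covm)                                  # upper-triangular factor
    draws <- t(sigma + t(R) %*% matrix(rnorm(3 * 1000), 3, 1000))
    cov(draws)                                       # ~ covm, as intended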
Comparing Smoothing Techniques for Fitting the Nonlinear Effect of Covariate in Cox Models.
Roshani, Daem; Ghaderi, Ebrahim
2016-02-01
The Cox model is a popular model in survival analysis that assumes a linear effect of the covariate on the log hazard function. However, continuous covariates can affect the hazard through more complicated nonlinear functional forms; Cox models with continuous covariates are therefore prone to misspecification when the correct functional form is not fitted. In this study, a smooth nonlinear covariate effect was approximated by different spline functions. We applied three flexible nonparametric smoothing techniques for nonlinear covariate effects in the Cox model: penalized splines, restricted cubic splines, and natural splines. The Akaike information criterion (AIC) and degrees of freedom were used for smoothing parameter selection in the penalized splines model. The ability of the nonparametric methods to recover the true functional form of linear, quadratic, and nonlinear functions was evaluated using different simulated sample sizes. Data analysis was carried out using R 2.11.0 software, and the significance level was set at 0.05. Based on AIC, the penalized spline method had consistently lower mean square error than the others for smoothing parameter selection. The same result was obtained with real data. The penalized spline smoothing method, with AIC for smoothing parameter selection, was more accurate in evaluating the relation between a covariate and the log hazard function than the other methods.
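The penalized-spline variant is directly available in R's 'survival' package, for example on its bundled lung data (a generic illustration, not the study's data or df search):

    library(survival)
    fit <- coxph(Surv(time, status) ~ pspline(age, df = 4) + sex, data = lung)
    termplot(fit, term = 1, se = TRUE)   # estimated nonlinear effect of age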
Incorporating covariates in skewed functional data models.
Li, Meng; Staicu, Ana-Maria; Bondell, Howard D
2015-07-01
We introduce a class of covariate-adjusted skewed functional models (cSFM) designed for functional data exhibiting location-dependent marginal distributions. We propose a semi-parametric copula model for the pointwise marginal distributions, which are allowed to depend on covariates, and the functional dependence, which is assumed covariate invariant. The proposed cSFM framework provides a unifying platform for pointwise quantile estimation and trajectory prediction. We consider a computationally feasible procedure that handles densely as well as sparsely observed functional data. The methods are examined numerically using simulations and are applied to a new tractography study of multiple sclerosis. Furthermore, the methodology is implemented in the R package cSFM, which is publicly available on CRAN.
AFCI-2.0 Library of Neutron Cross Section Covariances
Herman, M.; Herman,M.; Oblozinsky,P.; Mattoon,C.; Pigni,M.; Hoblit,S.; Mughabghab,S.F.; Sonzogni,A.; Talou,P.; Chadwick,M.B.; Hale.G.M.; Kahler,A.C.; Kawano,T.; Little,R.C.; Young,P.G.
2011-06-26
A neutron cross section covariance library has been under development through a BNL-LANL collaborative effort over the last three years. The primary purpose of the library is to provide covariances for the Advanced Fuel Cycle Initiative (AFCI) data adjustment project, which is focusing on the needs of fast advanced burner reactors. The covariances refer to central values given in the 2006 release of the U.S. neutron evaluated library ENDF/B-VII. The preliminary version (AFCI-2.0beta) was completed in October 2010 and made available to the users for comments. In the final 2.0 release, covariances for a few materials were updated; in particular, new LANL evaluations for ²³⁸,²⁴⁰Pu and ²⁴¹Am were adopted. BNL was responsible for covariances for structural materials and fission products, management of the library, and coordination of the work, while LANL was in charge of covariances for light nuclei and for actinides.
Generalized Linear Covariance Analysis
NASA Technical Reports Server (NTRS)
Carpenter, J. Russell; Markley, F. Landis
2008-01-01
We review and extend in two directions the results of prior work on generalized covariance analysis methods. This prior work allowed for partitioning of the state space into "solve-for" and "consider" parameters, allowed for differences between the formal values and the true values of the measurement noise, process noise, and a priori solve-for and consider covariances, and explicitly partitioned the errors into subspaces containing only the influence of the measurement noise, process noise, and a priori solve-for and consider covariances. In this work, we explicitly add sensitivity analysis to this prior work, and relax an implicit assumption that the batch estimator's anchor time occurs prior to the definitive span. We also apply the method to an integrated orbit and attitude problem, in which gyro and accelerometer errors, though not estimated, influence the orbit determination performance. We illustrate our results using two graphical presentations, which we call the "variance sandpile" and the "sensitivity mosaic," and we compare the linear covariance results to confidence intervals associated with ensemble statistics from a Monte Carlo analysis.
Covariant canonical superstrings
Aratyn, H.; Ingermanson, R.; Niemi, A.J.
1987-12-01
A covariant canonical formulation of generic superstrings is presented. The (super)geometry emerges dynamically and supergravity transformations are identified with particular canonical transformations. By construction these transformations are off-shell closed, and the necessary auxiliary fields can be identified with canonical momenta.
Generalized Linear Covariance Analysis
NASA Technical Reports Server (NTRS)
Carpenter, James R.; Markley, F. Landis
2014-01-01
This talk presents a comprehensive approach to filter modeling for generalized covariance analysis of both batch least-squares and sequential estimators. We review and extend in two directions the results of prior work that allowed for partitioning of the state space into "solve-for" and "consider" parameters, accounted for differences between the formal values and the true values of the measurement noise, process noise, and a priori solve-for and consider covariances, and explicitly partitioned the errors into subspaces containing only the influence of the measurement noise, process noise, and solve-for and consider covariances. In this work, we explicitly add sensitivity analysis to this prior work, and relax an implicit assumption that the batch estimator's epoch time occurs prior to the definitive span. We also apply the method to an integrated orbit and attitude problem, in which gyro and accelerometer errors, though not estimated, influence the orbit determination performance. We illustrate our results using two graphical presentations, which we call the "variance sandpile" and the "sensitivity mosaic," and we compare the linear covariance results to confidence intervals associated with ensemble statistics from a Monte Carlo analysis.
A Comparison of Bias Correction Adjustments for the DETECT Procedure
ERIC Educational Resources Information Center
Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei
2011-01-01
DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, "D[subscript max]," denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite [theta][subscript TT]…
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the right- and left-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE) that is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
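The self-consistency algorithm mentioned above has a compact fixed-point form: with J_ij indicating whether support point x_j lies inside subject i's truncation interval, the NPMLE weights satisfy f_j = n_j / sum_i (J_ij / F_i), where F_i = sum_j J_ij f_j. A bare-bones Python sketch on simulated doubly truncated data (illustrative only, and a sketch of the estimator itself, not of the paper's covariance formula):

    import numpy as np

    rng = np.random.default_rng(0)
    # Simulate doubly truncated data: (l, x, r) is observed only when l <= x <= r.
    l = rng.uniform(0.0, 0.5, 1000)
    r = l + rng.uniform(0.3, 0.8, 1000)
    x = rng.beta(2, 2, 1000)
    keep = (l <= x) & (x <= r)
    l, x, r = l[keep], x[keep], r[keep]

    xs, counts = np.unique(x, return_counts=True)
    J = ((l[:, None] <= xs[None, :]) & (xs[None, :] <= r[:, None])).astype(float)

    f = np.full(xs.size, 1.0 / xs.size)        # start from the uniform distribution
    for _ in range(1000):
        F = J @ f                              # P(l_i <= X <= r_i) under current f
        f_new = counts / (J.T @ (1.0 / F))     # self-consistency update
        f_new /= f_new.sum()
        if np.abs(f_new - f).max() < 1e-10:
            f = f_new
            break
        f = f_new

    cdf = np.cumsum(f)                         # NPMLE of the distribution function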
An Evaluation of Parametric and Nonparametric Models of Fish Population Response.
Haas, Timothy C.; Peterson, James T.; Lee, Danny C.
1999-11-01
Predicting the distribution or status of animal populations at large scales often requires the use of broad-scale information describing landforms, climate, vegetation, etc. These data, however, often consist of mixtures of continuous and categorical covariates and nonmultiplicative interactions among covariates, complicating statistical analyses. Using data from the interior Columbia River Basin, USA, we compared four methods for predicting the distribution of seven salmonid taxa using landscape information. Subwatersheds (mean size, 7800 ha) were characterized using a set of 12 covariates describing physiography, vegetation, and current land-use. The techniques included generalized logit modeling, classification trees, a nearest neighbor technique, and a modular neural network. We evaluated model performance using out-of-sample prediction accuracy via leave-one-out cross-validation and introduced a computer-intensive Monte Carlo hypothesis testing approach for examining the statistical significance of landscape covariates with the non-parametric methods. We found the modular neural network and the nearest-neighbor techniques to be the most accurate, but they were difficult to summarize in ways that provided ecological insight. The modular neural network also required the most extensive computer resources for model fitting and hypothesis testing. The generalized logit models were readily interpretable, but were the least accurate, possibly due to nonlinear relationships and nonmultiplicative interactions among covariates. Substantial overlap among the statistically significant (P<0.05) covariates for each method suggested that each is capable of detecting similar relationships between responses and covariates. Consequently, we believe that employing one or more methods may provide greater biological insight without sacrificing prediction accuracy.
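The model comparison by leave-one-out cross-validation carries over directly to modern libraries. The sketch below contrasts a logit-type model with a nearest-neighbour classifier on synthetic presence/absence data (hypothetical covariates, not the Columbia River Basin dataset; scikit-learn assumed available).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import LeaveOneOut, cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in for subwatershed covariates and taxon presence/absence.
    X, y = make_classification(n_samples=150, n_features=12, n_informative=6,
                               random_state=0)

    models = {
        "logit": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
        "k-NN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    }
    for name, model in models.items():
        acc = cross_val_score(model, X, y, cv=LeaveOneOut()).mean()
        print(f"{name}: leave-one-out accuracy = {acc:.3f}")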
Bayesian Nonparametric Estimation for Dynamic Treatment Regimes with Sequential Transition Times.
Xu, Yanxun; Müller, Peter; Wahed, Abdus S; Thall, Peter F
2016-01-01
We analyze a dataset arising from a clinical trial involving multi-stage chemotherapy regimes for acute leukemia. The trial design was a 2 × 2 factorial for frontline therapies only. Motivated by the idea that subsequent salvage treatments affect survival time, we model therapy as a dynamic treatment regime (DTR), that is, an alternating sequence of adaptive treatments or other actions and transition times between disease states. These sequences may vary substantially between patients, depending on how the regime plays out. To evaluate the regimes, mean overall survival time is expressed as a weighted average of the means of all possible sums of successive transitions times. We assume a Bayesian nonparametric survival regression model for each transition time, with a dependent Dirichlet process prior and Gaussian process base measure (DDP-GP). Posterior simulation is implemented by Markov chain Monte Carlo (MCMC) sampling. We provide general guidelines for constructing a prior using empirical Bayes methods. The proposed approach is compared with inverse probability of treatment weighting, including a doubly robust augmented version of this approach, for both single-stage and multi-stage regimes with treatment assignment depending on baseline covariates. The simulations show that the proposed nonparametric Bayesian approach can substantially improve inference compared to existing methods. An R program for implementing the DDP-GP-based Bayesian nonparametric analysis is freely available at https://www.ma.utexas.edu/users/yxu/.
A complete graphical criterion for the adjustment formula in mediation analysis.
Shpitser, Ilya; VanderWeele, Tyler J
2011-03-04
Various assumptions have been used in the literature to identify natural direct and indirect effects in mediation analysis. These effects are of interest because they allow for effect decomposition of a total effect into a direct and indirect effect even in the presence of interactions or non-linear models. In this paper, we consider the relation and interpretation of various identification assumptions in terms of causal diagrams interpreted as a set of non-parametric structural equations. We show that for such causal diagrams, two sets of assumptions for identification that have been described in the literature are in fact equivalent in the sense that if either set of assumptions holds for all models inducing a particular causal diagram, then the other set of assumptions will also hold for all models inducing that diagram. We moreover build on prior work concerning a complete graphical identification criterion for covariate adjustment for total effects to provide a complete graphical criterion for using covariate adjustment to identify natural direct and indirect effects. Finally, we show that this criterion is equivalent to the two sets of independence assumptions used previously for mediation analysis.
Covariant approximation averaging
NASA Astrophysics Data System (ADS)
Shintani, Eigo; Arthur, Rudy; Blum, Thomas; Izubuchi, Taku; Jung, Chulwoo; Lehner, Christoph
2015-06-01
We present a new class of statistical error reduction techniques for Monte Carlo simulations. Using covariant symmetries, we show that correlation functions can be constructed from inexpensive approximations without introducing any systematic bias in the final result. We introduce a new class of covariant approximation averaging techniques, known as all-mode averaging (AMA), in which the approximation takes account of contributions of all eigenmodes through the inverse of the Dirac operator computed from the conjugate gradient method with a relaxed stopping condition. In this paper we compare the performance and computational cost of our new method with traditional methods using correlation functions and masses of the pion, nucleon, and vector meson in Nf = 2+1 lattice QCD using domain-wall fermions. This comparison indicates that AMA significantly reduces statistical errors in Monte Carlo calculations over conventional methods for the same cost.
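Stripped of the lattice machinery, the statistical trick in covariant approximation averaging is an unbiased two-level estimator: average a cheap approximation over many source points and correct it with the exact-minus-approximate difference on a small subset. A NumPy caricature with random numbers standing in for correlator measurements (all values illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    n_src, n_exact = 1024, 32        # cheap evaluations everywhere, exact on a subset

    exact = rng.normal(loc=1.0, scale=0.5, size=n_src)         # "true" observable
    approx = exact + 0.2 + rng.normal(scale=0.05, size=n_src)  # biased cheap proxy

    sub = rng.choice(n_src, size=n_exact, replace=False)
    # Improved estimator: the correction term removes the bias of the proxy,
    # and its small variance keeps the total statistical error low.
    o_ama = approx.mean() + (exact[sub] - approx[sub]).mean()

    print("exact-on-subset only:", exact[sub].mean())
    print("AMA-style estimate  :", o_ama)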
A New Approach for Nuclear Data Covariance and Sensitivity Generation
Leal, L.C.; Larson, N.M.; Derrien, H.; Kawano, T.; Chadwick, M.B.
2005-05-24
Covariance data are required to correctly assess uncertainties in design parameters in nuclear applications. The error estimation of calculated quantities relies on the nuclear data uncertainty information available in the basic nuclear data libraries, such as the U.S. Evaluated Nuclear Data File, ENDF/B. The uncertainty files in the ENDF/B library are obtained from the analysis of experimental data and are stored as variance and covariance data. The computer code SAMMY is used in the analysis of the experimental data in the resolved and unresolved resonance energy regions. The data fitting of cross sections is based on generalized least-squares formalism (Bayes' theory) together with the resonance formalism described by R-matrix theory. Two approaches are used in SAMMY for the generation of resonance-parameter covariance data. In the evaluation process SAMMY generates a set of resonance parameters that fit the data, and, in addition, it also provides the resonance-parameter covariances. For existing resonance-parameter evaluations where no resonance-parameter covariance data are available, the alternative is to use an approach called the 'retroactive' resonance-parameter covariance generation. In the high-energy region the methodology for generating covariance data consists of least-squares fitting and model parameter adjustment. The least-squares fitting method calculates covariances directly from experimental data. The parameter adjustment method employs a nuclear model calculation such as the optical model and the Hauser-Feshbach model, and estimates a covariance for the nuclear model parameters. In this paper we describe the application of the retroactive method and the parameter adjustment method to generate covariance data for the gadolinium isotopes.
Using Analysis of Covariance (ANCOVA) with Fallible Covariates
ERIC Educational Resources Information Center
Culpepper, Steven Andrew; Aguinis, Herman
2011-01-01
Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but…
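The essence of the fallible-covariate problem is classical attenuation: regressing on an error-contaminated covariate shrinks its slope by the reliability ratio var(T)/var(X). A small simulation makes the bias visible (plain NumPy; all numbers illustrative).

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    t = rng.normal(size=n)                    # true covariate score
    x = t + rng.normal(scale=0.7, size=n)     # fallible observed covariate
    y = 2.0 * t + rng.normal(size=n)          # outcome driven by the true score

    reliability = t.var() / x.var()           # var(T) / var(X)
    slope_obs = np.cov(x, y)[0, 1] / x.var()  # OLS slope on the fallible covariate

    print(f"reliability    = {reliability:.3f}")
    print(f"observed slope = {slope_obs:.3f}  "
          f"(true slope 2.0 times reliability = {2.0 * reliability:.3f})")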
Covariant deformed oscillator algebras
NASA Technical Reports Server (NTRS)
Quesne, Christiane
1995-01-01
The general form and associativity conditions of deformed oscillator algebras are reviewed. It is shown how the latter can be fulfilled in terms of a solution of the Yang-Baxter equation when this solution has three distinct eigenvalues and satisfies a Birman-Wenzl-Murakami condition. As an example, an SU_q(n) x SU_q(m)-covariant q-bosonic algebra is discussed in some detail.
Khondker, Zakaria S; Zhu, Hongtu; Chu, Haitao; Lin, Weili; Ibrahim, Joseph G.
2012-01-01
Estimation of sparse covariance matrices and their inverse subject to positive definiteness constraints has drawn a lot of attention in recent years. The abundance of high-dimensional data, where the sample size (n) is less than the dimension (d), requires shrinkage estimation methods since the maximum likelihood estimator is not positive definite in this case. Furthermore, when n is larger than d but not sufficiently large, shrinkage estimation is more stable than maximum likelihood as it reduces the condition number of the precision matrix. Frequentist methods have utilized penalized likelihood methods, whereas Bayesian approaches rely on matrix decompositions or Wishart priors for shrinkage. In this paper we propose a new method, called the Bayesian Covariance Lasso (BCLASSO), for the shrinkage estimation of a precision (covariance) matrix. We consider a class of priors for the precision matrix that leads to the popular frequentist penalties as special cases, develop a Bayes estimator for the precision matrix, and propose an efficient sampling scheme that does not precalculate boundaries for positive definiteness. The proposed method is permutation invariant and performs shrinkage and estimation simultaneously for non-full rank data. Simulations show that the proposed BCLASSO performs similarly to frequentist methods for non-full rank data. PMID:24551316
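BCLASSO itself requires the paper's sampler, but its frequentist special case, the L1-penalized (graphical lasso) precision estimate, is available off the shelf and illustrates the shrinkage idea for non-full-rank data. A scikit-learn sketch (synthetic data with n < d; not an implementation of BCLASSO):

    import numpy as np
    from scipy.linalg import toeplitz
    from sklearn.covariance import GraphicalLasso

    rng = np.random.default_rng(0)
    d, n = 30, 20                            # dimension exceeds the sample size
    Sigma = toeplitz(0.6 ** np.arange(d))    # true AR(1)-type covariance
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)

    # The penalized precision estimate stays positive definite even though the
    # sample covariance matrix is singular here.
    gl = GraphicalLasso(alpha=0.2, max_iter=200).fit(X)
    print("smallest eigenvalue:", np.linalg.eigvalsh(gl.precision_).min())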
Sharpening bounds on principal effects with covariates.
Long, Dustin M; Hudgens, Michael G
2013-12-01
Estimation of treatment effects in randomized studies is often hampered by possible selection bias induced by conditioning on or adjusting for a variable measured post-randomization. One approach to obviate such selection bias is to consider inference about treatment effects within principal strata, that is, principal effects. A challenge with this approach is that without strong assumptions principal effects are not identifiable from the observable data. In settings where such assumptions are dubious, identifiable large sample bounds may be the preferred target of inference. In practice these bounds may be wide and not particularly informative. In this work we consider whether bounds on principal effects can be improved by adjusting for a categorical baseline covariate. Adjusted bounds are considered which are shown to never be wider than the unadjusted bounds. Necessary and sufficient conditions are given for which the adjusted bounds will be sharper (i.e., narrower) than the unadjusted bounds. The methods are illustrated using data from a recent, large study of interventions to prevent mother-to-child transmission of HIV through breastfeeding. Using a baseline covariate indicating low birth weight, the estimated adjusted bounds for the principal effect of interest are 63% narrower than the estimated unadjusted bounds. © 2013, The International Biometric Society.
Sample size determination for the confidence interval of linear contrast in analysis of covariance.
Liu, Xiaofeng Steven
2013-03-11
This article provides a way to determine sample size for the confidence interval of the linear contrast of treatment means in analysis of covariance (ANCOVA) without prior knowledge of the actual covariate means and covariate sum of squares, which are modeled as a t statistic. Using the t statistic, one can calculate the appropriate sample size to achieve the desired probability of obtaining a specified width in the confidence interval of the covariate-adjusted linear contrast.
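Setting aside the article's refinement (treating the unknown covariate means and covariate sum of squares through a t statistic, which is its actual contribution), the baseline calculation it improves on simply inverts the confidence-interval half-width formula, iterating because the t quantile depends on n. A sketch under those simplifying assumptions, with one covariate and illustrative numbers:

    import numpy as np
    from scipy.stats import t

    def n_per_group(contrast, sigma, width, alpha=0.05, n_covariates=1):
        """Smallest per-group n so the CI for the contrast is no wider than `width`."""
        c2 = float(np.sum(np.square(contrast)))
        k = len(contrast)
        for n in range(3, 10_000):
            df = k * n - k - n_covariates                # ANCOVA error df
            half = t.ppf(1 - alpha / 2, df) * sigma * np.sqrt(c2 / n)
            if 2 * half <= width:
                return n
        raise ValueError("requested width not attainable")

    # Three groups; compare group 1 with the average of groups 2 and 3.
    print(n_per_group(contrast=[1.0, -0.5, -0.5], sigma=1.0, width=0.8))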
Estimating the extreme low-temperature event using nonparametric methods
NASA Astrophysics Data System (ADS)
D'Silva, Anisha
This thesis presents a new method of estimating the one-in-N low temperature threshold using a nonparametric statistical method, kernel density estimation, applied to daily average wind-adjusted temperatures. We apply our One-in-N Algorithm to local gas distribution companies (LDCs), as they have to forecast the daily natural gas needs of their consumers. In winter, demand for natural gas is high. Extreme low temperature events are not directly related to an LDC's gas demand forecasting, but knowledge of extreme low temperatures is important to ensure that an LDC has enough capacity to meet customer demands when extreme low temperatures are experienced. We present a detailed explanation of our One-in-N Algorithm and compare it to methods using the generalized extreme value distribution, the normal distribution, and the variance-weighted composite distribution. We show that our One-in-N Algorithm estimates the one-in-N low temperature threshold more accurately than these methods according to the root mean square error (RMSE) measure at a 5% level of significance. The One-in-N Algorithm is tested by counting the number of times the daily average wind-adjusted temperature is less than or equal to the one-in-N low temperature threshold.
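The kernel-density step of such a procedure is short in code: smooth the observed temperatures with a KDE and invert its CDF at the target exceedance probability. The SciPy sketch below uses simulated temperatures and an illustrative one-in-30 event, standing in for the thesis's wind-adjusted data and full One-in-N Algorithm.

    import numpy as np
    from scipy.optimize import brentq
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(0)
    # Stand-in for daily average wind-adjusted winter temperatures (deg C).
    temps = rng.normal(loc=-5.0, scale=7.0, size=3000)

    kde = gaussian_kde(temps)
    N = 30                                            # one-in-30 event
    cdf = lambda u: kde.integrate_box_1d(-np.inf, u)  # smoothed P(T <= u)
    t_star = brentq(lambda u: cdf(u) - 1.0 / N, temps.min() - 20.0, temps.max())
    print(f"one-in-{N} low-temperature threshold: {t_star:.1f} C")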
Inadequacy of internal covariance estimation for super-sample covariance
NASA Astrophysics Data System (ADS)
Lacasa, Fabien; Kunz, Martin
2017-08-01
We give an analytical interpretation of how subsample-based internal covariance estimators lead to biased estimates of the covariance, due to underestimating the super-sample covariance (SSC). This includes the jackknife and bootstrap methods as estimators for the full survey area, and subsampling as an estimator of the covariance of subsamples. The limitations of the jackknife covariance have been previously presented in the literature because it is effectively a rescaling of the covariance of the subsample area. However we point out that subsampling is also biased, but for a different reason: the subsamples are not independent, and the corresponding lack of power results in SSC underprediction. We develop the formalism in the case of cluster counts that allows the bias of each covariance estimator to be exactly predicted. We find significant effects for a small-scale area or when a low number of subsamples is used, with auto-redshift biases ranging from 0.4% to 15% for subsampling and from 5% to 75% for jackknife covariance estimates. The cross-redshift covariance is even more affected; biases range from 8% to 25% for subsampling and from 50% to 90% for jackknife. Owing to the redshift evolution of the probe, the covariances cannot be debiased by a simple rescaling factor, and an exact debiasing has the same requirements as the full SSC prediction. These results thus disfavour the use of internal covariance estimators on data itself or a single simulation, leaving analytical prediction and simulations suites as possible SSC predictors.
Lottery spending: a non-parametric analysis.
Garibaldi, Skip; Frisoli, Kayla; Ke, Li; Lim, Melody
2015-01-01
We analyze the spending of individuals in the United States on lottery tickets in an average month, as reported in surveys. We view these surveys as sampling from an unknown distribution, and we use non-parametric methods to compare properties of this distribution for various demographic groups, as well as claims that some properties of this distribution are constant across surveys. We find that the observed higher spending by Hispanic lottery players can be attributed to differences in education levels, and we dispute previous claims that the top 10% of lottery players consistently account for 50% of lottery sales.
Bayesian Nonparametric Inference – Why and How
Müller, Peter; Mitra, Riten
2013-01-01
We review inference under models with nonparametric Bayesian (BNP) priors. The discussion follows a set of examples for some common inference problems. The examples are chosen to highlight problems that are challenging for standard parametric inference. We discuss inference for density estimation, clustering, regression and for mixed effects models with random effects distributions. While we focus on arguing for the need for the flexibility of BNP models, we also review some of the more commonly used BNP models, thus hopefully answering a bit of both questions, why and how to use BNP. PMID:24368932
A nonparametric and diversified portfolio model
NASA Astrophysics Data System (ADS)
Shirazi, Yasaman Izadparast; Sabiruzzaman, Md.; Hamzah, Nor Aishah
2014-07-01
Traditional portfolio models, like mean-variance (MV) suffer from estimation error and lack of diversity. Alternatives, like mean-entropy (ME) or mean-variance-entropy (MVE) portfolio models focus independently on the issue of either a proper risk measure or the diversity. In this paper, we propose an asset allocation model that compromise between risk of historical data and future uncertainty. In the new model, entropy is presented as a nonparametric risk measure as well as an index of diversity. Our empirical evaluation with a variety of performance measures shows that this model has better out-of-sample performances and lower portfolio turnover than its competitors.
Asymptotic Theory for Nonparametric Confidence Intervals.
1982-07-01
Technical Report No. 63, by Peter W. Glynn. [Abstract not recoverable from the scanned source.]
Optimal covariant quantum networks
NASA Astrophysics Data System (ADS)
Chiribella, Giulio; D'Ariano, Giacomo Mauro; Perinotti, Paolo
2009-04-01
A sequential network of quantum operations is efficiently described by its quantum comb [1], a non-negative operator with suitable normalization constraints. Here we analyze the case of networks enjoying symmetry with respect to the action of a given group of physical transformations, introducing the notion of covariant combs and testers, and proving the basic structure theorems for these objects. As an application, we discuss the optimal alignment of reference frames (without pre-established common references) with multiple rounds of quantum communication, showing that i) allowing an arbitrary amount of classical communication does not improve the alignment, and ii) a single round of quantum communication is sufficient.
Covariant magnetic connection hypersurfaces
NASA Astrophysics Data System (ADS)
Pegoraro, F.
2016-04-01
In the single fluid, non-relativistic, ideal magnetohydrodynamic (MHD) plasma description, magnetic field lines play a fundamental role by defining dynamically preserved 'magnetic connections' between plasma elements. Here we show how the concept of magnetic connection needs to be generalized in the case of a relativistic MHD description where we require covariance under arbitrary Lorentz transformations. This is performed by defining 2-D magnetic connection hypersurfaces in the 4-D Minkowski space. This generalization accounts for the loss of simultaneity between spatially separated events in different frames and is expected to provide a powerful insight into the 4-D geometry of electromagnetic fields.
Earth Observing System Covariance Realism
NASA Technical Reports Server (NTRS)
Zaidi, Waqar H.; Hejduk, Matthew D.
2016-01-01
The purpose of covariance realism is to properly size a primary object's covariance in order to add validity to the calculation of the probability of collision. The covariance realism technique in this paper consists of three parts: collection/calculation of definitive state estimates through orbit determination, calculation of covariance realism test statistics at each covariance propagation point, and proper assessment of those test statistics. An empirical cumulative distribution function (ECDF) Goodness-of-Fit (GOF) method is employed to determine if a covariance is properly sized by comparing the empirical distribution of Mahalanobis distance calculations to the hypothesized parent 3-DoF chi-squared distribution. To realistically size a covariance for collision probability calculations, this study uses a state noise compensation algorithm that adds process noise to the definitive epoch covariance to account for uncertainty in the force model. Process noise is added until the GOF tests pass a group significance level threshold. The results of this study indicate that when outliers attributed to persistently high or extreme levels of solar activity are removed, the aforementioned covariance realism compensation method produces a tuned covariance with up to 80 to 90% of the covariance propagation timespan passing the GOF tests (against a 60% minimum passing threshold), a quite satisfactory and useful result.
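The key test statistic here is easy to reproduce: if the covariance is realistic, squared Mahalanobis distances of 3-D position errors follow a chi-squared distribution with 3 degrees of freedom, and an ECDF-based GOF test will flag a mis-sized covariance. A synthetic SciPy sketch (the claimed covariance is deliberately undersized):

    import numpy as np
    from scipy.stats import kstest

    rng = np.random.default_rng(0)
    P = np.diag([4.0, 1.0, 0.25])      # claimed position covariance (km^2)
    err = rng.multivariate_normal(np.zeros(3), 1.5 * P, size=2000)  # true errors larger

    Pinv = np.linalg.inv(P)
    d2 = np.einsum("ij,jk,ik->i", err, Pinv, err)  # squared Mahalanobis distances

    # GOF against the hypothesized 3-DoF chi-squared parent distribution.
    stat, pval = kstest(d2, "chi2", args=(3,))
    print(f"KS statistic = {stat:.3f}, p-value = {pval:.2e}")  # tiny p => P unrealistic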
Nonparametric Bayes Stochastically Ordered Latent Class Models
Yang, Hongxia; O’Brien, Sean; Dunson, David B.
2012-01-01
Latent class models (LCMs) are used increasingly for addressing a broad variety of problems, including sparse modeling of multivariate and longitudinal data, model-based clustering, and flexible inferences on predictor effects. Typical frequentist LCMs require estimation of a single finite number of classes, which does not increase with the sample size, and have a well-known sensitivity to parametric assumptions on the distributions within a class. Bayesian nonparametric methods have been developed to allow an infinite number of classes in the general population, with the number represented in a sample increasing with sample size. In this article, we propose a new nonparametric Bayes model that allows predictors to flexibly impact the allocation to latent classes, while limiting sensitivity to parametric assumptions by allowing class-specific distributions to be unknown subject to a stochastic ordering constraint. An efficient MCMC algorithm is developed for posterior computation. The methods are validated using simulation studies and applied to the problem of ranking medical procedures in terms of the distribution of patient morbidity. PMID:22505787
Bayesian Nonparametric Clustering for Positive Definite Matrices.
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2016-05-01
Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to the Euclidean geometry, but rather belong to a curved Riemannian manifold, existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms.
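The dissimilarity named above, the log-determinant divergence between SPD matrices, is a two-line function; a minimal sketch (conventions vary slightly across the literature):

    import numpy as np

    def logdet_divergence(X, Y):
        """D(X, Y) = tr(X Y^-1) - log det(X Y^-1) - d, zero iff X == Y."""
        d = X.shape[0]
        M = X @ np.linalg.inv(Y)
        _, logdet = np.linalg.slogdet(M)
        return np.trace(M) - logdet - d

    A = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
    print(logdet_divergence(A, np.eye(2)), logdet_divergence(A, A))  # second is 0.0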
Nonparametric Scene Parsing via Label Transfer.
Liu, Ce; Yuen, Jenny; Torralba, Antonio
2011-12-01
While there has been a lot of recent work on object recognition and image understanding, the focus has been on carefully establishing mathematical models for images, scenes, and objects. In this paper, we propose a novel, nonparametric approach for object recognition and scene parsing using a new technology we name label transfer. For an input image, our system first retrieves its nearest neighbors from a large database containing fully annotated images. Then, the system establishes dense correspondences between the input image and each of the nearest neighbors using the dense SIFT flow algorithm [28], which aligns two images based on local image structures. Finally, based on the dense scene correspondences obtained from SIFT flow, our system warps the existing annotations and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on challenging databases. Compared to existing object recognition approaches that require training classifiers or appearance models for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.
Decision boundary feature selection for non-parametric classifier
NASA Technical Reports Server (NTRS)
Lee, Chulhee; Landgrebe, David A.
1991-01-01
Feature selection has been one of the most important topics in pattern recognition. Although many authors have studied feature selection for parametric classifiers, few algorithms are available for feature selection for nonparametric classifiers. In this paper we propose a new feature selection algorithm based on decision boundaries for nonparametric classifiers. We first note that feature selection for pattern recognition is equivalent to retaining 'discriminantly informative features', and a discriminantly informative feature is related to the decision boundary. A procedure to extract discriminantly informative features based on a decision boundary for nonparametric classification is proposed. Experiments show that the proposed algorithm finds effective features for the nonparametric classifier with Parzen density estimation.
SCALE-6 Sensitivity/Uncertainty Methods and Covariance Data
NASA Astrophysics Data System (ADS)
Williams, M. L.; Rearden, B. T.
2008-12-01
Computational methods and data used for sensitivity and uncertainty analysis within the SCALE nuclear analysis code system are presented. The methodology used to calculate sensitivity coefficients and similarity coefficients and to perform nuclear data adjustment is discussed. A description is provided of the SCALE-6 covariance library based on ENDF/B-VII and other nuclear data evaluations, supplemented by "low-fidelity" approximate covariances.
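The propagation step behind such uncertainty analyses is the "sandwich rule": the relative variance of a computed response equals s^T C s, where s holds the sensitivity coefficients and C is the relative covariance matrix of the nuclear data. A generic NumPy illustration with toy numbers (not SCALE output):

    import numpy as np

    # Relative sensitivities of a response (e.g. k-eff) to three cross sections.
    s = np.array([0.45, -0.20, 0.10])

    # Relative covariance matrix of those cross sections (toy values).
    C = np.array([[0.0025, 0.0005, 0.0000],
                  [0.0005, 0.0100, 0.0010],
                  [0.0000, 0.0010, 0.0040]])

    rel_var = s @ C @ s               # sandwich rule: s^T C s
    print(f"relative standard deviation of the response: {np.sqrt(rel_var):.4%}")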
Cadarso-Suárez, Carmen; Roca-Pardiñas, Javier; Figueiras, Adolfo; González-Manteiga, Wenceslao
2005-04-30
The generalized additive model (GAM) is a powerful and widely used tool that allows researchers to fit, non-parametrically, the effect of continuous predictors on a transformation of the mean response variable. Such a transformation is given by a so-called link function, and in GAMs this link function is assumed to be known. Nevertheless, if an incorrect choice is made for the link, the resulting GAM is misspecified and the results obtained may be misleading. In this paper, we propose a modified version of the local scoring algorithm that allows for the non-parametric estimation of the link function, by using local linear kernel smoothers. To better understand the effect that each covariate produces on the outcome, results are expressed in terms of non-parametric odds ratio (OR) curves. Bootstrap techniques were used to correct the bias in the OR estimation and to construct point-wise confidence intervals. A simulation study was carried out to assess the behaviour of the resulting estimates. The proposed methodology was illustrated using data from the AIDS Register of Galicia (NW Spain), with a view to assessing the effect of the CD4 lymphocyte count on the probability of being AIDS-diagnosed via Tuberculosis (TB). This application shows how the link's flexibility makes it possible to obtain OR curve estimates that are less sensitive to the presence of outliers and unusual values that are often present in the extremes of the covariate distributions.
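Holding the link fixed at logit (estimating it is the paper's actual contribution), the OR-curve idea can be imitated with a spline logistic fit: fit a smooth effect of the covariate, then express it as odds ratios against a reference value. A sketch with statsmodels and patsy on simulated data (all names and values illustrative):

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(0)
    n = 2000
    x = rng.uniform(0.0, 10.0, n)                   # e.g. a rescaled lymphocyte count
    eta = -1.0 + np.sin(x / 2.0)                    # nonlinear true effect
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))
    df = pd.DataFrame({"x": x, "y": y})

    # Logistic GAM surrogate: natural cubic spline basis inside a GLM.
    fit = smf.glm("y ~ cr(x, df=5)", data=df, family=sm.families.Binomial()).fit()

    # OR curve relative to a reference covariate value x0 = 5.
    grid = pd.DataFrame({"x": np.linspace(0.2, 9.8, 50)})
    p = fit.predict(grid)
    p0 = fit.predict(pd.DataFrame({"x": [5.0]})).iloc[0]
    or_curve = (p / (1.0 - p)) / (p0 / (1.0 - p0))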
A Bayesian Nonparametric Meta-Analysis Model
ERIC Educational Resources Information Center
Karabatsos, George; Talbott, Elizabeth; Walker, Stephen G.
2015-01-01
In a meta-analysis, it is important to specify a model that adequately describes the effect-size distribution of the underlying population of studies. The conventional normal fixed-effect and normal random-effects models assume a normal effect-size population distribution, conditionally on parameters and covariates. For estimating the mean overall…
Low-Fidelity Covariances: Neutron Cross Section Covariance Estimates for 387 Materials
The Low-fidelity Covariance Project (Low-Fi) was funded in FY07-08 by DOE's Nuclear Criticality Safety Program (NCSP). The project was a collaboration among ANL, BNL, LANL, and ORNL. The motivation for the Low-Fi project stemmed from an imbalance in supply and demand of covariance data. The interest in, and demand for, covariance data has been in a continual uptrend over the past few years. Requirements to understand application-dependent uncertainties in simulated quantities of interest have led to the development of sensitivity/uncertainty and data adjustment software such as TSUNAMI [1] at Oak Ridge. To take full advantage of the capabilities of TSUNAMI requires general availability of covariance data. However, the supply of covariance data has not been able to keep up with the demand. This fact is highlighted by the observation that the recent release of the much-heralded ENDF/B-VII.0 included covariance data for only 26 of the 393 neutron evaluations (which is, in fact, considerably less covariance data than was included in the final ENDF/B-VI release). [Copied from R.C. Little et al., "Low-Fidelity Covariance Project", Nuclear Data Sheets 109 (2008) 2828-2833] The Low-Fi covariance data are now available at the National Nuclear Data Center. They are separate from ENDF/B-VII.0 and the NNDC warns that this information is not approved by CSEWG. NNDC describes the contents of this collection as: "Covariance data are provided for radiative capture (or (n,ch.p.) for light nuclei), elastic scattering (or total for some actinides), inelastic scattering, (n,2n) reactions, fission and nubars over the energy range from 10^-5 eV to 20 MeV. The library contains 387 files including almost all (383 out of 393) materials of the ENDF/B-VII.0. Absent are data for ^7Li, ^232Th, ^233,235,238U and ^239Pu as well as ^223,224,225,226Ra, while ^natZn is replaced by ^64,66,67,68,70Zn."
Stardust Navigation Covariance Analysis
NASA Technical Reports Server (NTRS)
Menon, Premkumar R.
2000-01-01
The Stardust spacecraft was launched on February 7, 1999 aboard a Boeing Delta-II rocket. Mission participants include the National Aeronautics and Space Administration (NASA), the Jet Propulsion Laboratory (JPL), Lockheed Martin Astronautics (LMA) and the University of Washington. The primary objective of the mission is to collect in-situ samples of the coma of comet Wild-2 and return those samples to the Earth for analysis. Mission design and operational navigation for Stardust is performed by the Jet Propulsion Laboratory (JPL). This paper will describe the extensive JPL effort in support of the Stardust pre-launch analysis of the orbit determination component of the mission covariance study. A description of the mission and its trajectory will be provided first, followed by a discussion of the covariance procedure and models. Predicted accuracies will be examined as they relate to navigation delivery requirements for specific critical events during the mission. Stardust was launched into a heliocentric trajectory in early 1999. It will perform an Earth Gravity Assist (EGA) on January 15, 2001 to acquire an orbit for the eventual rendezvous with comet Wild-2. The spacecraft will fly through the coma (atmosphere) on the dayside of Wild-2 on January 2, 2004. At that time samples will be obtained using an aerogel collector. After the comet encounter Stardust will return to Earth when the Sample Return Capsule (SRC) will separate and land at the Utah Test Site (UTTR) on January 15, 2006. The spacecraft will however be deflected off into a heliocentric orbit. The mission is divided into three phases for the covariance analysis. They are 1) Launch to EGA, 2) EGA to Wild-2 encounter and 3) Wild-2 encounter to Earth reentry. Orbit determination assumptions for each phase are provided. These include estimated and consider parameters and their associated a-priori uncertainties. Major perturbations to the trajectory include 19 deterministic and statistical maneuvers
Deriving covariant holographic entanglement
NASA Astrophysics Data System (ADS)
Dong, Xi; Lewkowycz, Aitor; Rangamani, Mukund
2016-11-01
We provide a gravitational argument in favour of the covariant holographic entanglement entropy proposal. In general time-dependent states, the proposal asserts that the entanglement entropy of a region in the boundary field theory is given by a quarter of the area of a bulk extremal surface in Planck units. The main element of our discussion is an implementation of an appropriate Schwinger-Keldysh contour to obtain the reduced density matrix (and its powers) of a given region, as is relevant for the replica construction. We map this contour into the bulk gravitational theory, and argue that the saddle point solutions of these replica geometries lead to a consistent prescription for computing the field theory Rényi entropies. In the limiting case where the replica index is taken to unity, a local analysis suffices to show that these saddles lead to the extremal surfaces of interest. We also comment on various properties of holographic entanglement that follow from this construction.
Chryssomalakos, Chryssomalis; Stephens, Christopher R
2007-01-01
We present a covariant form for the dynamics of a canonical GA of arbitrary cardinality, showing how each genetic operator can be uniquely represented by a mathematical object, a tensor, that transforms simply under a general linear coordinate transformation. For mutation and recombination these tensors can be written as tensor products of the analogous tensors for one-bit strings, thus giving a greatly simplified formulation of the dynamics. We analyze the three most well-known coordinate systems (string, Walsh, and Building Block), discussing their relative advantages and disadvantages with respect to the different operators, showing how one may transform from one to the other, and that the associated coordinate transformation matrices can be written as a tensor product of the corresponding one-bit matrices. We also show that in the Building Block basis the dynamical equations for all Building Blocks can be generated from the equation for the most fine-grained block (string) by a certain projection ("zapping").
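The tensor-product statement is concrete enough to check numerically: the string-to-Walsh change of basis for an l-bit genome is the l-fold Kronecker product of the one-bit 2x2 Walsh matrix. A minimal NumPy sketch (normalization conventions differ across the GA literature; this uses the unnormalized form):

    import numpy as np
    from functools import reduce

    W1 = np.array([[1.0, 1.0],
                   [1.0, -1.0]])      # one-bit Walsh matrix

    def walsh(n_bits):
        """n-bit Walsh transform as a Kronecker (tensor) product of W1 factors."""
        return reduce(np.kron, [W1] * n_bits)

    rng = np.random.default_rng(0)
    p = rng.dirichlet(np.ones(8))     # population vector over 3-bit strings
    p_walsh = walsh(3) @ p            # the same state in Walsh coordinates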
NASA Astrophysics Data System (ADS)
Saltas, Ippocratis D.; Vitagliano, Vincenzo
2017-05-01
We derive the 1-loop effective action of the cubic Galileon coupled to quantum-gravitational fluctuations in a background and gauge-independent manner, employing the covariant framework of DeWitt and Vilkovisky. Although the bare action respects shift symmetry, the coupling to gravity induces an effective mass to the scalar, of the order of the cosmological constant, as a direct result of the nonflat field-space metric, the latter ensuring the field-reparametrization invariance of the formalism. Within a gauge-invariant regularization scheme, we discover novel, gravitationally induced non-Galileon higher-derivative interactions in the effective action. These terms, previously unnoticed within standard, noncovariant frameworks, are not Planck suppressed. Unless tuned to be subdominant, their presence could have important implications for the classical and quantum phenomenology of the theory.
Nonparametric ROC Based Evaluation for Survival Outcomes
Song, Xiao; Zhou, Xiao-Hua; Ma, Shuangge
2013-01-01
For censored survival outcomes, it can be of great interest to evaluate the predictive power of individual markers or their functions. Compared with alternative evaluation approaches, the time-dependent ROC (receiver operating characteristics) based approaches rely on much weaker assumptions, can be more robust, and hence are preferred. In this article, we examine evaluation of markers' predictive power using the time-dependent ROC curve and a concordance measure which can be viewed as a weighted area under the time-dependent AUC (area under the ROC curve) profile. This study significantly advances from existing time-dependent ROC studies by developing nonparametric estimators of the summary indexes and, more importantly, rigorously establishing their asymptotic properties. It reinforces the statistical foundation of the time-dependent ROC based evaluation approaches for censored survival outcomes. Numerical studies, including simulations and application to an HIV clinical trial, demonstrate the satisfactory finite-sample performance of the proposed approaches. PMID:22987578
Nonparametric dark energy reconstruction from supernova data.
Holsclaw, Tracy; Alam, Ujjaini; Sansó, Bruno; Lee, Herbert; Heitmann, Katrin; Habib, Salman; Higdon, David
2010-12-10
Understanding the origin of the accelerated expansion of the Universe poses one of the greatest challenges in physics today. Lacking a compelling fundamental theory to test, observational efforts are targeted at a better characterization of the underlying cause. If a new form of mass-energy, dark energy, is driving the acceleration, the redshift evolution of the equation of state parameter w(z) will hold essential clues as to its origin. To best exploit data from observations it is necessary to develop a robust and accurate reconstruction approach, with controlled errors, for w(z). We introduce a new, nonparametric method for solving the associated statistical inverse problem based on Gaussian process modeling and Markov chain Monte Carlo sampling. Applying this method to recent supernova measurements, we reconstruct the continuous history of w out to redshift z=1.5.
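The full method embeds the GP in a cosmological inverse problem with MCMC; as a much-reduced illustration of the Gaussian-process ingredient alone, one can smooth noisy mock w(z) values with scikit-learn (data and kernel choices are purely illustrative, not the paper's pipeline):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(0)
    z = np.sort(rng.uniform(0.0, 1.5, 40))[:, None]      # mock redshifts
    w_true = -1.0 + 0.3 * z.ravel()                      # mild evolution in w(z)
    w_obs = w_true + rng.normal(scale=0.1, size=z.size)  # noisy "measurements"

    kernel = 1.0 * RBF(length_scale=0.5) + WhiteKernel(noise_level=0.01)
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(z, w_obs)

    z_grid = np.linspace(0.0, 1.5, 100)[:, None]
    w_mean, w_sd = gp.predict(z_grid, return_std=True)   # reconstruction with errors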
Nonparametric spirometry reference values for Hispanic Americans.
Glenn, Nancy L; Brown, Vanessa M
2011-02-01
Recent literature cites ethnic origin as a major factor in developing pulmonary function reference values. Extensive studies established reference values for European and African Americans, but not for Hispanic Americans. The Third National Health and Nutrition Examination Survey defines Hispanic as individuals of Spanish-speaking cultures. While no group was excluded from the target population, sample size requirements only allowed inclusion of individuals who identified themselves as Mexican Americans. This research constructs nonparametric reference value confidence intervals for Hispanic American pulmonary function. The method is applicable to all ethnicities. We use empirical likelihood confidence intervals to establish normal ranges for reference values. Its major advantage: it is model free, but shares asymptotic properties of model-based methods. Statistical comparisons indicate that empirical likelihood interval lengths are comparable to normal theory intervals. Power and efficiency studies agree with previously published theoretical results.
Nonparametric k-nearest-neighbor entropy estimator.
Lombardi, Damiano; Pant, Sanjay
2016-01-01
A nonparametric k-nearest-neighbor-based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering nonuniform probability densities in the region of k-nearest neighbors around each sample point. It aims to improve the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-functional relationships leading to high correlation between components of the random variable are present; and third, when the marginal variances of random variable components vary significantly with respect to each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested for a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with a classical estimator, and a significant improvement is demonstrated.
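For reference, the classical Kozachenko-Leonenko estimator that this work improves on fits in a dozen lines: the entropy is estimated from the distances to each point's k-th nearest neighbour. This sketch is the baseline method only, not the proposed nonuniformity correction.

    import numpy as np
    from scipy.spatial import cKDTree
    from scipy.special import digamma, gammaln

    def kl_entropy(x, k=3):
        """Classical Kozachenko-Leonenko k-NN entropy estimate (in nats)."""
        x = np.atleast_2d(x)
        n, d = x.shape
        dist, _ = cKDTree(x).query(x, k=k + 1)  # k+1: the query point itself is returned
        eps = dist[:, -1]                       # distance to the k-th neighbour
        log_vd = (d / 2.0) * np.log(np.pi) - gammaln(d / 2.0 + 1.0)  # unit-ball volume
        return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(eps))

    rng = np.random.default_rng(0)
    sample = rng.normal(size=(5000, 2))
    print(kl_entropy(sample), np.log(2 * np.pi * np.e))  # vs. true entropy of N(0, I_2)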
Kernel bandwidth estimation for nonparametric modeling.
Bors, Adrian G; Nasios, Nikolaos
2009-12-01
Kernel density estimation is a nonparametric procedure for probability density modeling, which has found several applications in various fields. The smoothness and modeling ability of the functional approximation are controlled by the kernel bandwidth. In this paper, we describe a Bayesian estimation method for finding the bandwidth from a given data set. The proposed bandwidth estimation method is applied in three different computational-intelligence methods that rely on kernel density estimation: 1) scale space; 2) mean shift; and 3) quantum clustering. The third method is a novel approach that relies on the principles of quantum mechanics. This method is based on the analogy between data samples and quantum particles and uses the Schrödinger potential as a cost function. The proposed methodology is used for blind-source separation of modulated signals and for terrain segmentation based on topography information.
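The paper's selector is Bayesian; a common non-Bayesian point of comparison is likelihood cross-validation over a bandwidth grid, which scikit-learn expresses in a few lines (illustrative data; not the paper's method):

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(-2.0, 0.5, 300),
                        rng.normal(1.0, 1.0, 700)])[:, None]

    # KernelDensity.score is the log-likelihood, so GridSearchCV picks the
    # bandwidth that maximizes held-out likelihood.
    search = GridSearchCV(KernelDensity(kernel="gaussian"),
                          {"bandwidth": np.linspace(0.05, 1.0, 20)}, cv=5)
    search.fit(x)
    print("cross-validated bandwidth:", search.best_params_["bandwidth"])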
A nonparametric method for penetrance function estimation.
Alarcon, F; Bonaïti-Pellié, C; Harari-Kermadec, H
2009-01-01
In diseases caused by a deleterious gene mutation, knowledge of age-specific cumulative risks is necessary for medical management of mutation carriers. When pedigrees are ascertained through at least one affected individual, ascertainment bias can be corrected by using a parametric method such as the Proband's phenotype Exclusion Likelihood, or PEL, that uses a survival analysis approach based on the Weibull model. This paper proposes a nonparametric method for penetrance function estimation that corrects for ascertainment on at least one affected: the Index Discarding EuclideAn Likelihood or IDEAL. IDEAL is compared with PEL, using family samples simulated from a Weibull distribution and under alternative models. We show that, under Weibull assumption and asymptotic conditions, IDEAL and PEL both provide unbiased risk estimates. However, when the true risk function deviates from a Weibull distribution, we show that the PEL might provide biased estimates while IDEAL remains unbiased.
Nonparametric inference of network structure and dynamics
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion, that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
An Empirical Study of Eight Nonparametric Tests in Hierarchical Regression.
ERIC Educational Resources Information Center
Harwell, Michael; Serlin, Ronald C.
When normality does not hold, nonparametric tests represent an important data-analytic alternative to parametric tests. However, the use of nonparametric tests in educational research has been limited by the absence of easily performed tests for complex experimental designs and analyses, such as factorial designs and multiple regression analyses,…
General Galilei Covariant Gaussian Maps
NASA Astrophysics Data System (ADS)
Gasbarri, Giulio; Toroš, Marko; Bassi, Angelo
2017-09-01
We characterize general non-Markovian Gaussian maps which are covariant under Galilean transformations. In particular, we consider translational and Galilean covariant maps and show that they reduce to the known Holevo result in the Markovian limit. We apply the results to discuss measures of macroscopicity based on classicalization maps, specifically addressing dissipation, Galilean covariance and non-Markovianity. We further suggest a possible generalization of the macroscopicity measure defined by Nimmrichter and Hornberger [Phys. Rev. Lett. 110, 16 (2013)].
Covariance Manipulation for Conjunction Assessment
NASA Technical Reports Server (NTRS)
Hejduk, M. D.
2016-01-01
The use of the probability of collision (Pc) has brought sophistication to conjunction assessment (CA). It is made possible by the JSpOC precision catalogue, which provides covariances, and Pc has essentially replaced miss distance as the basic CA parameter. The embrace of Pc has elevated methods to 'manipulate' the covariance in order to enable or improve CA calculations. Two such methods are examined here: compensation for absent or unreliable covariances through 'Maximum Pc' calculation constructs, and projection (not propagation) of epoch covariances forward in time to try to enable better risk assessments. Two questions are answered about each: the situations to which such approaches are properly applicable, and the amount of utility that such methods offer.
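For context, the standard 2-D Pc computation that these manipulations feed integrates the combined positional covariance, projected into the conjunction plane, over the hard-body circle. A direct numerical version with synthetic geometry (an illustration, not an operational tool):

    import numpy as np
    from scipy.stats import multivariate_normal

    # Combined covariance projected into the conjunction plane (km^2),
    # projected miss vector (km), and combined hard-body radius (km).
    C = np.array([[0.040, 0.006],
                  [0.006, 0.010]])
    miss = np.array([0.30, 0.05])
    R_hb = 0.020

    # Integrate the bivariate normal density over the hard-body disk.
    g = np.linspace(-R_hb, R_hb, 201)
    xx, yy = np.meshgrid(g, g)
    inside = xx**2 + yy**2 <= R_hb**2
    pdf = multivariate_normal(mean=miss, cov=C).pdf(np.dstack([xx, yy]))
    pc = pdf[inside].sum() * (g[1] - g[0]) ** 2
    print(f"Pc = {pc:.3e}")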
Adaptive Neural Network Nonparametric Identifier With Normalized Learning Laws.
Chairez, Isaac
2016-04-05
This paper addresses the design of a normalized convergent learning law for neural networks (NNs) with continuous dynamics. The NN is used here to obtain a nonparametric model for uncertain systems described by a set of ordinary differential equations. The sources of uncertainty are the presence of external perturbations and poor knowledge of the nonlinear function describing the system dynamics. A new adaptive algorithm based on normalized algorithms was used to adjust the weights of the NN. The adaptive algorithm was derived by means of a nonstandard logarithmic Lyapunov function (LLF). Two identifiers were designed using two variations of LLFs, leading to a normalized learning law for the first identifier and a variable-gain normalized learning law for the second. In the case of the second identifier, the inclusion of normalized learning laws reduces the size of the convergence region obtained as the solution of the practical stability analysis. On the other hand, the velocity of convergence of the learning laws depends inversely on the norm of the errors. This avoids peaking transient behavior in the time evolution of the weights and accelerates the convergence of the identification error. A numerical example demonstrates the improvements achieved by the algorithm introduced in this paper compared with classical schemes using non-normalized continuous learning methods. A comparison of the identification performance achieved by the non-normalized identifier and the ones developed in this paper shows the benefits of the proposed learning law.
Correcting eddy-covariance flux underestimates over a grassland.
Twine, T. E.; Kustas, W. P.; Norman, J. M.; Cook, D. R.; Houser, P. R.; Meyers, T. P.; Prueger, J. H.; Starks, P. J.; Wesely, M. L.
2000-06-08
Independent measurements of the major energy balance flux components are often not consistent with the principle of conservation of energy. This is referred to as a lack of closure of the surface energy balance. Most results in the literature have shown the sum of sensible and latent heat fluxes measured by eddy covariance to be less than the difference between net radiation and soil heat fluxes. This under-measurement of sensible and latent heat fluxes by eddy-covariance instruments has occurred in numerous field experiments and across instruments from many different manufacturers. Four eddy-covariance systems consisting of the same models of instruments were set up side-by-side during the Southern Great Plains 1997 Hydrology Experiment, and all systems under-measured fluxes by similar amounts. One of these eddy-covariance systems was collocated with three other types of eddy-covariance systems at different sites; all of these systems under-measured the sensible and latent heat fluxes. The net radiometers and soil heat flux plates used in conjunction with the eddy-covariance systems were calibrated independently, and measurements of net radiation and soil heat flux showed little scatter across sites. The 10% absolute uncertainty in available-energy measurements was considerably smaller than the systematic closure problem in the surface energy budget, which varied from 10 to 30%. When available-energy measurement errors are known and modest, eddy-covariance measurements of sensible and latent heat fluxes should be adjusted for closure. Although the preferred method of energy balance closure is to maintain the Bowen ratio, the method for obtaining closure appears to be less important than assuring that eddy-covariance measurements are consistent with conservation of energy. Based on numerous measurements over a sorghum canopy, carbon dioxide fluxes measured by eddy covariance are underestimated by the same factor as eddy-covariance evaporation.
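A minimal Python sketch of the Bowen-ratio closure adjustment this abstract recommends (our illustration with hypothetical flux values, not the authors' code):

```python
import numpy as np

def bowen_ratio_closure(Rn, G, H, LE):
    """Force energy-balance closure while preserving the Bowen ratio H/LE:
    scale the measured turbulent fluxes by a common factor so that
    H_adj + LE_adj = Rn - G."""
    closure_factor = (Rn - G) / (H + LE)   # > 1 when fluxes are under-measured
    return H * closure_factor, LE * closure_factor

# Hypothetical half-hour fluxes (W m-2) with a 20% closure gap.
H_adj, LE_adj = bowen_ratio_closure(Rn=500.0, G=50.0, H=150.0, LE=210.0)
print(H_adj, LE_adj, H_adj + LE_adj)   # 187.5 262.5 450.0 = Rn - G
```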
Covariant electromagnetic field lines
NASA Astrophysics Data System (ADS)
Hadad, Y.; Cohen, E.; Kaminer, I.; Elitzur, A. C.
2017-08-01
Faraday introduced electric field lines as a powerful tool for understanding the electric force, and these field lines are still used today in classrooms and textbooks teaching the basics of electromagnetism within the electrostatic limit. However, despite attempts at generalizing this concept beyond the electrostatic limit, such a fully relativistic field line theory still appears to be missing. In this work, we propose such a theory and define covariant electromagnetic field lines that naturally extend electric field lines to relativistic systems and general electromagnetic fields. We derive a closed-form formula for the field lines curvature in the vicinity of a charge, and show that it is related to the world line of the charge. This demonstrates how the kinematics of a charge can be derived from the geometry of the electromagnetic field lines. Such a theory may also provide new tools in modeling and analyzing electromagnetic phenomena, and may entail new insights regarding long-standing problems such as radiation-reaction and self-force. In particular, the electromagnetic field lines curvature has the attractive property of being non-singular everywhere, thus eliminating all self-field singularities without using renormalization techniques.
Covariant harmonic oscillators: 1973 revisited
NASA Technical Reports Server (NTRS)
Noz, M. E.
1993-01-01
Using the relativistic harmonic oscillator, a physical basis is given to the phenomenological wave function of Yukawa which is covariant and normalizable. It is shown that this wave function can be interpreted in terms of the unitary irreducible representations of the Poincare group. The transformation properties of these covariant wave functions are also demonstrated.
Covariance hypotheses for LANDSAT data
NASA Technical Reports Server (NTRS)
Decell, H. P.; Peters, C.
1983-01-01
Two covariance hypotheses are considered for LANDSAT data acquired by sampling fields, one an autoregressive covariance structure and the other the hypothesis of exchangeability. A minimum entropy approximation of the first structure by the second is derived and shown to have desirable properties for incorporation into a mixture density estimation procedure. Results of a rough test of the exchangeability hypothesis are presented.
Using machine learning to assess covariate balance in matching studies.
Linden, Ariel; Yarnold, Paul R
2016-12-01
In order to assess the effectiveness of matching approaches in observational studies, investigators typically present summary statistics for each observed pre-intervention covariate, with the objective of showing that matching reduces the difference in means (or proportions) between groups to as close to zero as possible. In this paper, we introduce a new approach to distinguish between study groups based on their distributions of the covariates, using a machine-learning algorithm called optimal discriminant analysis (ODA). Assessing covariate balance using ODA as compared with the conventional method has several key advantages: the ability to ascertain how individuals self-select based on optimal (maximum-accuracy) cut-points on the covariates; the application to any variable metric and number of groups; its insensitivity to skewed data or outliers; and the use of accuracy measures that can be widely applied to all analyses. Moreover, ODA accepts analytic weights, thereby extending the assessment of covariate balance to any study design where weights are used for covariate adjustment. By comparing the two approaches using empirical data, we are able to demonstrate that using measures of classification accuracy as balance diagnostics produces results highly consistent with those obtained via the conventional approach (in our matched-pairs example, ODA revealed a weak statistically significant relationship not detected by the conventional approach). Thus, investigators should consider ODA as a robust complement, or perhaps alternative, to the conventional approach for assessing covariate balance in matching studies. © 2016 John Wiley & Sons, Ltd.
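The maximum-accuracy cut-point idea behind ODA can be sketched in a few lines for a single covariate (a simplified illustration; the published ODA method adds permutation p-values, analytic weights, and multi-group support):

```python
import numpy as np

def max_accuracy_cutpoint(x, group):
    """Exhaustively search cut-points on covariate x for the one that
    classifies group membership (0/1) with maximum accuracy."""
    best_cut, best_acc = None, 0.0
    for cut in np.unique(x):
        pred = (x > cut).astype(int)
        for p in (pred, 1 - pred):            # allow either direction
            acc = np.mean(p == group)
            if acc > best_acc:
                best_cut, best_acc = cut, acc
    return best_cut, best_acc

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1, 200), rng.normal(0.1, 1, 200)])
g = np.repeat([0, 1], 200)
cut, acc = max_accuracy_cutpoint(x, g)
print(cut, acc)   # accuracy near 0.5 indicates the covariate is balanced
```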
Enveloping Spectral Surfaces: Covariate Dependent Spectral Analysis of Categorical Time Series
Krafty, Robert T.; Xiong, Shuangyan; Stoffer, David S.; Buysse, Daniel J.; Hall, Martica
2014-01-01
Motivated by problems in Sleep Medicine and Circadian Biology, we present a method for the analysis of cross-sectional categorical time series collected from multiple subjects where the effect of static continuous-valued covariates is of interest. Toward this goal, we extend the spectral envelope methodology for the frequency domain analysis of a single categorical process to cross-sectional categorical processes that are possibly covariate dependent. The analysis introduces an enveloping spectral surface for describing the association between the frequency domain properties of qualitative time series and covariates. The resulting surface offers an intuitively interpretable measure of association between covariates and a qualitative time series by finding the maximum possible conditional power at a given frequency from scalings of the qualitative time series conditional on the covariates. The optimal scalings that maximize the power provide scientific insight by identifying the aspects of the qualitative series which have the most pronounced periodic features at a given frequency conditional on the value of the covariates. To facilitate the assessment of the dependence of the enveloping spectral surface on the covariates, we include a theory for analyzing the partial derivatives of the surface. Our approach is entirely nonparametric, and we present estimation and asymptotics in the setting of local polynomial smoothing. PMID:24790257
"Nonparametric Local Smoothing" is not image registration.
Rohlfing, Torsten; Avants, Brian
2012-11-01
Image registration is one of the most important and universally useful computational tasks in biomedical image analysis. A recent article by Xing & Qiu (IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10):2081-2092, 2011) is based on an inappropriately narrow conceptualization of the image registration problem as the task of making two images look alike, which disregards whether the established spatial correspondence is plausible. The authors propose a new algorithm, Nonparametric Local Smoothing (NLS) for image registration, but use image similarities alone as a measure of registration performance, although these measures do not relate reliably to the realism of the correspondence map. Using data obtained from its authors, we show experimentally that the method proposed by Xing & Qiu is not an effective registration algorithm. While it optimizes image similarity, it does not compute accurate, interpretable transformations. Even judged by image similarity alone, the proposed method is consistently outperformed by a simple pixel permutation algorithm, which is known by design not to compute valid registrations. This study has demonstrated that the NLS algorithm proposed recently for image registration, and published in one of the most respected journals in computer science, is not, in fact, an effective registration method at all. Our results also emphasize the general need to apply registration evaluation criteria that are sensitive to whether correspondences are accurate and mappings between images are physically interpretable. These goals cannot be achieved by simply reporting image similarities.
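The pixel-permutation baseline the authors use as a control is easy to reproduce in spirit: rearranging pixels by intensity rank maximizes similarity while computing no spatial transformation at all (our own minimal re-implementation, not the authors' code):

```python
import numpy as np

def pixel_permutation_match(fixed, moving):
    """Rearrange the pixels of `moving` so that its intensity rank order
    matches that of `fixed`: similarity is (near-)maximized, yet no
    spatially meaningful transformation has been computed."""
    out = np.empty_like(moving)
    out.flat[np.argsort(fixed, axis=None)] = np.sort(moving, axis=None)
    return out

rng = np.random.default_rng(1)
fixed, moving = rng.random((64, 64)), rng.random((64, 64))
matched = pixel_permutation_match(fixed, moving)
# Intensity correlation jumps to ~1 although no registration happened.
print(np.corrcoef(fixed.ravel(), matched.ravel())[0, 1])
```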
Nonparametric methods in actigraphy: An update
Gonçalves, Bruno S.B.; Cavalcanti, Paula R.A.; Tavares, Gracilene R.; Campos, Tania F.; Araujo, John F.
2014-01-01
Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability (IS) and intradaily variability (IV), used to describe the rest-activity rhythm. Simulated data and actigraphy data from humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by varying the time intervals of analysis, and for each variable we calculated the average value across intervals (IVm and ISm). Simulated data showed that (1) synchronization analysis depends on sample size, and (2) fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using the IVm variable, while this difference was not detected by IV60. Rhythmic synchronization of activity and rest was significantly higher in young adults than in adults with Parkinson's disease when using the ISm variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep–wake cycle fragmentation and synchronization. PMID:26483921
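For reference, the classical forms of the two variables studied here can be sketched as follows (standard hourly-bin definitions; the IVm/ISm variants in the paper additionally average over multiple bin widths):

```python
import numpy as np

def interdaily_stability(x, period=24):
    """IS: variance of the mean daily profile over total variance (0 to 1);
    x is assumed to be an hourly activity series, `period` bins = one day."""
    x = np.asarray(x, float)
    profile = np.array([x[h::period].mean() for h in range(period)])
    return (x.size * np.sum((profile - x.mean()) ** 2)) / (
        period * np.sum((x - x.mean()) ** 2))

def intradaily_variability(x):
    """IV: first-difference variance over total variance
    (near 0 for a smooth rhythm, near 2 for white noise)."""
    x = np.asarray(x, float)
    return (x.size * np.sum(np.diff(x) ** 2)) / (
        (x.size - 1) * np.sum((x - x.mean()) ** 2))

t = np.arange(24 * 14)   # two weeks of hourly epochs
x = 1 + np.sin(2 * np.pi * t / 24) + np.random.default_rng(2).normal(0, 0.3, t.size)
print(interdaily_stability(x), intradaily_variability(x))
```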
Aggregate nonparametric safety analysis of traffic zones.
Siddiqui, Chowdhury; Abdel-Aty, Mohamed; Huang, Helai
2012-03-01
Exploring the significant variables related to specific types of crashes is vitally important in the planning stage of a transportation network. This paper aims to identify and examine important variables associated with total crashes and severe crashes per traffic analysis zone (TAZ) in four counties of the state of Florida by applying nonparametric statistical techniques such as data mining and random forest. The intention of investigating these factors in such aggregate level analysis is to incorporate proactive safety measures in transportation planning. Total and severe crashes per TAZ were modeled to provide predictive decision trees. The variables which carried higher weight of importance for total crashes per TAZ were - total number of intersections per TAZ, airport trip productions, light truck productions, and total roadway segment length with 35 mph posted speed limit. The other significant variables identified for total crashes were total roadway length with 15 mph posted speed limit, total roadway length with 65 mph posted speed limit, and non-home based work productions. For severe crashes, total number of intersections per TAZ, light truck productions, total roadway length with 35 mph posted speed limit, and total roadway length with 65 mph posted speed limit were among the significant variables. These variables were further verified and supported by the random forest results.
NONPARAMETRIC BAYESIAN ESTIMATION OF PERIODIC LIGHT CURVES
Wang Yuyang; Khardon, Roni; Protopapas, Pavlos
2012-09-01
Many astronomical phenomena exhibit patterns that have periodic behavior. An important step when analyzing data from such processes is the problem of identifying the period: estimating the period of a periodic function based on noisy observations made at irregularly spaced time points. This problem is still a difficult challenge despite extensive study in different disciplines. This paper makes several contributions toward solving this problem. First, we present a nonparametric Bayesian model for period finding, based on Gaussian Processes (GPs), that does not make assumptions on the shape of the periodic function. As our experiments demonstrate, the new model leads to significantly better results in period estimation especially when the light curve does not exhibit sinusoidal shape. Second, we develop a new algorithm for parameter optimization for GP which is useful when the likelihood function is very sensitive to the parameters with numerous local minima, as in the case of period estimation. The algorithm combines gradient optimization with grid search and incorporates several mechanisms to overcome the high computational complexity of GP. Third, we develop a novel approach for using domain knowledge, in the form of a probabilistic generative model, and incorporate it into the period estimation algorithm. Experimental results validate our approach showing significant improvement over existing methods.
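The grid-search ingredient of such a GP period finder can be sketched with scikit-learn's periodic kernel (a toy version; the paper adds gradient refinement, mechanisms for the sensitive likelihood surface, and a generative-model prior):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ExpSineSquared, WhiteKernel

def gp_period_scan(t, y, periods):
    """Score candidate periods by the GP marginal likelihood with a
    periodic kernel, which assumes nothing about the curve's shape."""
    scores = []
    for p in periods:
        kernel = ExpSineSquared(length_scale=1.0, periodicity=p) \
                 + WhiteKernel(noise_level=0.1)
        gp = GaussianProcessRegressor(kernel=kernel, optimizer=None)
        gp.fit(t[:, None], y)
        scores.append(gp.log_marginal_likelihood())
    return periods[int(np.argmax(scores))]

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0, 20, 80))                    # irregular sampling
y = np.sign(np.sin(2 * np.pi * t / 3.3)) + rng.normal(0, 0.2, t.size)
print(gp_period_scan(t, y, np.linspace(2.0, 5.0, 61)))  # close to 3.3
```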
Nonparametric Detection of Geometric Structures Over Networks
NASA Astrophysics Data System (ADS)
Zou, Shaofeng; Liang, Yingbin; Poor, H. Vincent
2017-10-01
Nonparametric detection of existence of an anomalous structure over a network is investigated. Nodes corresponding to the anomalous structure (if one exists) receive samples generated by a distribution q, which is different from a distribution p generating samples for other nodes. If an anomalous structure does not exist, all nodes receive samples generated by p. It is assumed that the distributions p and q are arbitrary and unknown. The goal is to design statistically consistent tests with probability of errors converging to zero as the network size becomes asymptotically large. Kernel-based tests are proposed based on maximum mean discrepancy that measures the distance between mean embeddings of distributions into a reproducing kernel Hilbert space. Detection of an anomalous interval over a line network is first studied. Sufficient conditions on minimum and maximum sizes of candidate anomalous intervals are characterized in order to guarantee the proposed test to be consistent. It is also shown that certain necessary conditions must hold to guarantee any test to be universally consistent. Comparison of sufficient and necessary conditions yields that the proposed test is order-level optimal and nearly optimal respectively in terms of minimum and maximum sizes of candidate anomalous intervals. Generalization of the results to other networks is further developed. Numerical results are provided to demonstrate the performance of the proposed tests.
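The kernel statistic underlying these tests is the maximum mean discrepancy; a minimal unbiased estimator looks like this (our sketch, showing only the distance itself; the paper's contribution is the network-scan tests built on top of it):

```python
import numpy as np

def mmd2_unbiased(X, Y, gamma=0.5):
    """Unbiased estimate of the squared maximum mean discrepancy with an
    RBF kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    def k(A, B):
        d2 = (np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :]
              - 2.0 * A @ B.T)
        return np.exp(-gamma * d2)
    m, n = len(X), len(Y)
    Kxx, Kyy, Kxy = k(X, X), k(Y, Y), k(X, Y)
    return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
            + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
            - 2.0 * Kxy.mean())

rng = np.random.default_rng(4)
p1, p2 = rng.normal(0, 1, (200, 1)), rng.normal(0, 1, (200, 1))
q = rng.normal(1, 1, (200, 1))
print(mmd2_unbiased(p1, p2))   # ~0: same distribution
print(mmd2_unbiased(p1, q))    # clearly positive: different distributions
```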
NASA Astrophysics Data System (ADS)
Kaiser, Olga; Martius, Olivia; Horenko, Illia
2017-04-01
Regression-based Generalized Pareto Distribution (GPD) models are often used to describe the dynamics of hydrological threshold excesses, relying on the explicit availability of all relevant covariates. In real applications, however, the complete set of relevant covariates might not be available. In this context, it has been shown that, under weak assumptions, the influence of systematically missing covariates can be reflected by nonstationary and nonhomogeneous dynamics. We present a data-driven, semiparametric and adaptive approach for spatio-temporal regression-based clustering of threshold excesses in the presence of systematically missing covariates. The nonstationary and nonhomogeneous behavior of threshold excesses is described by a set of local stationary GPD models, whose parameters are expressed as regression models, and by a nonparametric spatio-temporal hidden switching process. By exploiting the nonparametric Finite Element time-series analysis Methodology (FEM) with Bounded Variation of the model parameters (BV) to resolve the spatio-temporal switching process, the approach goes beyond the strong a priori assumptions made in standard latent class models such as Mixture Models and Hidden Markov Models. Additionally, the presented FEM-BV-GPD approach provides a pragmatic description of the corresponding spatial dependence structure by grouping together all locations that exhibit similar behavior of the switching process. The performance of the framework is demonstrated on daily accumulated precipitation series over 17 locations in Switzerland from 1981 to 2013, showing that the introduced approach allows for a better description of the historical data.
Hua, Wen-Yu; Ghosh, Debashis
2015-09-01
Associating genetic markers with a multidimensional phenotype is an important yet challenging problem. In this work, we establish the equivalence between two popular methods: kernel-machine regression (KMR) and kernel distance covariance (KDC). KMR is a semiparametric regression framework that models covariate effects parametrically and genetic markers nonparametrically, while KDC represents a class of methods that includes distance covariance (DC) and the Hilbert-Schmidt independence criterion (HSIC), which are nonparametric tests of independence. We show that the equivalence between the score test of KMR and the KDC statistic under certain conditions can lead to a novel generalization of the KDC test that incorporates covariates. Our contributions are threefold: (1) establishing the equivalence between KMR and KDC; (2) showing that the principles of KMR can be applied to the interpretation of KDC; and (3) developing a broader class of KDC statistics, whose members are statistics corresponding to different kernel combinations. Finally, we perform simulation studies and an analysis of real data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. The ADNI analysis suggests that SNPs of FLJ16124 exhibit pairwise interaction effects that are strongly correlated with changes in brain region volumes. © 2015, The International Biometric Society.
Covariance Manipulation for Conjunction Assessment
NASA Technical Reports Server (NTRS)
Hejduk, M. D.
2016-01-01
The manipulation of space object covariances to try to provide additional or improved information to conjunction risk assessment is not an uncommon practice. Types of manipulation include fabricating a covariance when it is missing or unreliable to force the probability of collision (Pc) to a maximum value ('PcMax'), scaling a covariance to try to improve its realism or see the effect of covariance volatility on the calculated Pc, and constructing the equivalent of an epoch covariance at a convenient future point in the event ('covariance forecasting'). In bringing these methods to bear for Conjunction Assessment (CA) operations, however, some do not remain fully consistent with best practices for conducting risk management, some seem to be of relatively low utility, and some require additional information before they can contribute fully to risk analysis. This study describes some basic principles of modern risk management (following the Kaplan construct) and then examines the PcMax and covariance forecasting paradigms for alignment with these principles; it then further examines the expected utility of these methods in the modern CA framework. Both paradigms are found to be not without utility, but only in situations that are somewhat carefully circumscribed.
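The 'Maximum Pc' construct can be illustrated with a toy two-dimensional encounter-plane calculation: scale a covariance and search for the worst-case collision probability (hypothetical numbers and a Monte Carlo shortcut; operational tools use the full three-dimensional geometry):

```python
import numpy as np

def collision_probability(miss, cov, radius, n=100_000, seed=0):
    """Monte Carlo probability that the relative position, distributed as
    N(miss, cov) in the encounter plane, falls within the combined
    hard-body radius."""
    rng = np.random.default_rng(seed)
    pts = rng.multivariate_normal(miss, cov, size=n)
    return np.mean(np.linalg.norm(pts, axis=1) < radius)

miss = np.array([500.0, 200.0])           # metres, encounter plane
cov = np.diag([200.0**2, 100.0**2])       # hypothetical epoch covariance
# 'Maximum Pc' idea: scan covariance scale factors for the worst-case Pc.
scales = np.linspace(0.5, 20.0, 40)
pcs = [collision_probability(miss, s * cov, radius=20.0) for s in scales]
print(scales[int(np.argmax(pcs))], max(pcs))
```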
Covariance Models for Hydrological Applications
NASA Astrophysics Data System (ADS)
Hristopulos, Dionissios
2014-05-01
This methodological contribution aims to present some new covariance models with applications in the stochastic analysis of hydrological processes. More specifically, we present explicit expressions for radially symmetric, non-differentiable, Spartan covariance functions in one, two, and three dimensions. The Spartan covariance parameters include a characteristic length, an amplitude coefficient, and a rigidity coefficient which determines the shape of the covariance function. Different expressions are obtained depending on the value of the rigidity coefficient and the dimensionality. If the value of the rigidity coefficient is much larger than one, the Spartan covariance function exhibits multiscaling. Spartan covariance models are more flexible than the classical geostatatistical models (e.g., spherical, exponential). Their non-differentiability makes them suitable for modelling the properties of geological media. We also present a family of radially symmetric, infinitely differentiable Bessel-Lommel covariance functions which are valid in any dimension. These models involve combinations of Bessel and Lommel functions. They provide a generalization of the J-Bessel covariance function, and they can be used to model smooth processes with an oscillatory decay of correlations. We discuss the dependence of the integral range of the Spartan and Bessel-Lommel covariance functions on the parameters. We point out that the dependence is not uniquely specified by the characteristic length, unlike the classical geostatistical models. Finally, we define and discuss the use of the generalized spectrum for characterizing different correlation length scales; the spectrum is defined in terms of an exponent α. We show that the spectrum values obtained for exponent values less than one can be used to discriminate between mean-square continuous but non-differentiable random fields. References [1] D. T. Hristopulos and S. Elogne, 2007. Analytic properties and covariance functions of
Non-parametric approach to the study of phenotypic stability.
Ferreira, D F; Fernandes, S B; Bruzi, A T; Ramalho, M A P
2016-02-19
The aim of this study was to undertake the theoretical derivation of non-parametric methods, which use linear regressions based on rank order, for stability analyses. These methods are extensions of different parametric methods used for stability analyses, and the results were compared with a standard non-parametric method. Intensive computational methods (e.g., bootstrap and permutation) were applied, and data from the plant-breeding program of the Biology Department of UFLA (Minas Gerais, Brazil) were used to illustrate and compare the tests. The non-parametric stability methods were effective for the evaluation of phenotypic stability. In the presence of variance heterogeneity, the non-parametric methods exhibited greater power of discrimination when determining the phenotypic stability of genotypes.
An Evolutionary Search Algorithm for Covariate Models in Population Pharmacokinetic Analysis.
Yamashita, Fumiyoshi; Fujita, Atsuto; Sasa, Yukako; Higuchi, Yuriko; Tsuda, Masahiro; Hashida, Mitsuru
2017-09-01
Building a covariate model is a crucial task in population pharmacokinetics. This study develops a novel method for automated covariate modeling based on gene expression programming (GEP), which not only enables covariate selection, but also the construction of nonpolynomial relationships between pharmacokinetic parameters and covariates. To apply GEP to the extended nonlinear least squares analysis, the parameter consolidation and initial parameter value estimation algorithms were further developed and implemented. The entire program was coded in Java. The performance of the developed covariate model was evaluated for the population pharmacokinetic data of tobramycin. In comparison with the established covariate model, goodness-of-fit of the measured data was greatly improved by using only 2 additional adjustable parameters. Ten test runs yielded the same solution. In conclusion, the systematic exploration method is a potentially powerful tool for prescreening covariate models in population pharmacokinetic analysis. Copyright © 2017 American Pharmacists Association®. Published by Elsevier Inc. All rights reserved.
Nonparametric Bayesian Modeling for Automated Database Schema Matching
Ferragut, Erik M; Laska, Jason A
2015-01-01
The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
Effects of mergers on non-parametric morphologies
NASA Astrophysics Data System (ADS)
Bignone, Lucas A.; Tissera, Patricia B.; Sillero, Emanuel; Pedrosa, Susana E.; Pellizza, Leonardo J.; Lambas, Diego G.
2017-06-01
We study the effects of mergers on non-parametric morphologies of galaxies. We compute the Gini index, M20, asymmetry and concentration statistics for z = 0 galaxies in the Illustris simulation and compare non-parametric morphologies of major mergers, minor merges, close pairs, distant pairs and unperturbed galaxies. We determine the effectiveness of observational methods based on these statistics to select merging galaxies.
Network Reconstruction Using Nonparametric Additive ODE Models
Henderson, James; Michailidis, George
2014-01-01
Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactocaccus Lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative
Covariance specification and estimation to improve top-down Green House Gas emission estimates
NASA Astrophysics Data System (ADS)
Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Whetstone, J. R.
2015-12-01
accuracy, we perform a sensitivity study to further tune covariance parameters. Finally, we introduce a shrinkage based sample covariance estimation technique for both prior and mismatch covariances. This technique allows us to achieve similar accuracy nonparametrically in a more efficient and automated way.
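The entry does not name the shrinkage estimator it introduces; as one standard example of the idea, Ledoit-Wolf shrinkage blends the noisy sample covariance with a structured target using a data-driven weight (our illustration only):

```python
import numpy as np
from sklearn.covariance import LedoitWolf

rng = np.random.default_rng(5)
X = rng.standard_normal((30, 50))        # n = 30 samples, p = 50 variables
sample_cov = np.cov(X, rowvar=False)     # singular: p > n
lw = LedoitWolf().fit(X)                 # shrinks toward a scaled identity
print(lw.shrinkage_)                     # data-driven shrinkage intensity
print(np.linalg.cond(sample_cov), np.linalg.cond(lw.covariance_))
```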
Covariation neglect among novice investors.
Hedesström, Ted Martin; Svedsäter, Henrik; Gärling, Tommy
2006-09-01
In 4 experiments, undergraduates made hypothetical investment choices. In Experiment 1, participants paid more attention to the volatility of individual assets than to the volatility of aggregated portfolios. The results of Experiment 2 show that most participants diversified even when this increased risk because of covariation between the returns of individual assets. In Experiment 3, nearly half of those who seemingly attempted to minimize risk diversified even when this increased risk. These results indicate that novice investors neglect covariation when diversifying across investment alternatives. Experiment 4 established that naive diversification follows from motivation to minimize risk and showed that covariation neglect was not significantly reduced by informing participants about how covariation affects portfolio risk but was reduced by making participants systematically calculate aggregate returns for diversified portfolios. In order to counteract naive diversification, novice investors need to be better informed about the rationale underlying recommendations to diversify.
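A worked two-asset example shows how ignoring covariation makes naive 1/n diversification backfire (hypothetical variances and correlation):

```python
import numpy as np

var_a, var_b, rho = 1.0, 4.0, 0.9        # hypothetical variances/correlation
cov_ab = rho * np.sqrt(var_a * var_b)
cov = np.array([[var_a, cov_ab], [cov_ab, var_b]])
w = np.array([0.5, 0.5])                 # naive 1/n diversification
print(w @ cov @ w)   # 2.15: riskier than holding the low-risk asset alone (1.0)
```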
Hawking radiation and covariant anomalies
Banerjee, Rabin; Kulkarni, Shailesh
2008-01-15
Generalizing the method of Wilczek and collaborators we provide a derivation of Hawking radiation from charged black holes using only covariant gauge and gravitational anomalies. The reliability and universality of the anomaly cancellation approach to Hawking radiation is also discussed.
Relative-Error-Covariance Algorithms
NASA Technical Reports Server (NTRS)
Bierman, Gerald J.; Wolff, Peter J.
1991-01-01
Two algorithms compute the error covariance of the difference between optimal estimates of the state of a discrete linear system, where the estimates are based on data acquired during overlapping or disjoint intervals. This provides a quantitative measure of the mutual consistency or inconsistency of the state estimates. The relative-error-covariance concept is applied to determine the degree of correlation between trajectories calculated from two overlapping sets of measurements and to construct a real-time test of the consistency of state estimates based upon recently acquired data.
[Clinical research XIX. From clinical judgment to analysis of covariance].
Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Rivas-Ruiz, Rodolfo; Talavera, Juan O
2014-01-01
The analysis of covariance (ANCOVA) is based on the general linear model. This technique involves a regression model, often multiple, in which the outcome is a continuous variable, the independent variables are qualitative or are introduced into the model as dummy (dichotomous) variables, and the factors for which adjustment is required (covariates) can be on any measurement level (i.e., nominal, ordinal or continuous). The maneuvers can be entered into the model as 1) fixed effects or 2) random effects; the difference between the two depends on the type of inference we want to draw about the effects. ANCOVA separates the effect of the independent variables from that of the covariates, i.e., it corrects the dependent variable by eliminating the influence of the covariates, given that these variables change in conjunction with the maneuvers or treatments and thereby affect the outcome variable. ANCOVA should be performed only if three assumptions are met: 1) the relationship between the covariate and the outcome is linear, 2) there is homogeneity of slopes, and 3) the covariate and the independent variable are independent of each other.
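A minimal ANCOVA sketch in the sense described here, using statsmodels (simulated data; all names and numbers are ours):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 120
df = pd.DataFrame({
    "treatment": np.repeat(["A", "B"], n // 2),
    "baseline": rng.normal(50, 10, n),              # the covariate
})
df["outcome"] = (0.8 * df["baseline"] + 5 * (df["treatment"] == "B")
                 + rng.normal(0, 5, n))

# ANCOVA as a linear model: treatment as a dummy, baseline as covariate.
fit = smf.ols("outcome ~ C(treatment) + baseline", data=df).fit()
print(fit.params)   # adjusted treatment effect, ~5 here
# Homogeneity-of-slopes check: the interaction term should be negligible.
print(smf.ols("outcome ~ C(treatment) * baseline", data=df).fit().pvalues)
```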
Essays in applied macroeconomics: Asymmetric price adjustment, exchange rate and treatment effect
NASA Astrophysics Data System (ADS)
Gu, Jingping
This dissertation consists of three essays. Chapter II examines the possible asymmetric response of gasoline prices to crude oil price changes using an error correction model with GARCH errors. Recent papers have looked at this issue. Some of these papers estimate a form of error correction model, but none of them accounts for autoregressive conditional heteroskedasticity in estimation and testing for asymmetry, and none of them takes the response of crude oil price into consideration. We find that time-varying volatility of gasoline price disturbances is an important feature of the data, and when we allow for asymmetric GARCH errors and investigate the system-wide impulse response function, we find evidence of asymmetric adjustment to crude oil price changes in weekly retail gasoline prices. Chapter III discusses the relationship between fiscal deficit and exchange rate. Economic theory predicts that fiscal deficits can significantly affect real exchange rate movements, but existing empirical evidence reports only a weak impact of fiscal deficits on exchange rates. Based on US dollar-based real exchange rates in G5 countries and a flexible varying coefficient model, we show that the previously documented weak relationship between fiscal deficits and exchange rates may be the result of additive specifications, and that the relationship is stronger if we allow fiscal deficits to impact real exchange rates non-additively as well as nonlinearly. We find that the speed of exchange rate adjustment toward equilibrium depends on the state of the fiscal deficit; a fiscal contraction in the US can lead to less persistence in the deviation of exchange rates from fundamentals, and faster mean reversion to the equilibrium. Chapter IV proposes a kernel method to deal with the nonparametric regression model with only discrete covariates as regressors. This new approach is based on the recently developed least squares cross-validation kernel smoothing method. It can not only automatically smooth
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
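A crude sketch of the multiple-imputation idea for a right-censored covariate (a hot-deck stand-in for the paper's Kaplan-Meier/Cox-based conditional draws, so it is illustrative only; the pooling step is standard Rubin's rules):

```python
import numpy as np
import statsmodels.api as sm

def mi_logistic(y, x_obs, censored, m=20, seed=0):
    """Each right-censored value (known only to exceed x_obs[i]) is replaced
    by a random fully observed value above it; logistic fits over m such
    completed datasets are pooled with Rubin's rules."""
    rng = np.random.default_rng(seed)
    donors = x_obs[~censored]
    betas, variances = [], []
    for _ in range(m):
        x = x_obs.copy()
        for i in np.where(censored)[0]:
            pool = donors[donors > x_obs[i]]
            if pool.size:
                x[i] = rng.choice(pool)
        fit = sm.Logit(y, sm.add_constant(x)).fit(disp=0)
        betas.append(fit.params[1])
        variances.append(fit.bse[1] ** 2)
    qbar = np.mean(betas)
    total_var = np.mean(variances) + (1 + 1 / m) * np.var(betas, ddof=1)
    return qbar, np.sqrt(total_var)     # pooled estimate and standard error

rng = np.random.default_rng(1)
x_true = rng.exponential(1.0, 300)
cens = rng.random(300) < 0.3
x_obs = np.where(cens, x_true * rng.random(300), x_true)  # censoring times
y = (rng.random(300) < 1 / (1 + np.exp(-(x_true - 1)))).astype(int)
print(mi_logistic(y, x_obs, cens))
```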
Multilevel covariance regression with correlated random effects in the mean and variance structure.
Quintero, Adrian; Lesaffre, Emmanuel
2017-09-01
Multivariate regression methods generally assume a constant covariance matrix for the observations. In case a heteroscedastic model is needed, the parametric and nonparametric covariance regression approaches can be restrictive in the literature. We propose a multilevel regression model for the mean and covariance structure, including random intercepts in both components and allowing for correlation between them. The implied conditional covariance function can be different across clusters as a result of the random effect in the variance structure. In addition, allowing for correlation between the random intercepts in the mean and covariance makes the model convenient for skewedly distributed responses. Furthermore, it permits us to analyse directly the relation between the mean response level and the variability in each cluster. Parameter estimation is carried out via Gibbs sampling. We compare the performance of our model to other covariance modelling approaches in a simulation study. Finally, the proposed model is applied to the RN4CAST dataset to identify the variables that impact burnout of nurses in Belgium. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Brain reserve and cognitive decline: a non-parametric systematic review.
Valenzuela, Michael J; Sachdev, Perminder
2006-08-01
A previous companion paper to this report (Valenzuela and Sachdev, Psychological Medicine 2006, 36, 441-454) suggests a link between behavioural brain reserve and incident dementia; however, the issues of covariate control and ascertainment bias were not directly addressed. Our aim was to quantitatively review an independent set of longitudinal studies of cognitive change in order to clarify these factors. Cohort studies of the effects of education, occupation, and mental activities on cognitive decline were of interest. Abstracts were identified in MEDLINE (1966-September 2004), CURRENT CONTENTS (to September 2004), PsychINFO (1984-September 2004), Cochrane Library Databases and reference lists from relevant articles. Eighteen studies met inclusion criteria. Key information was extracted by both reviewers onto a standard template with a high level of agreement. Cognitive decline studies were integrated using a non-parametric method after converting outcome data onto a common effect size metric. Higher behavioural brain reserve was related to decreased longitudinal cognitive decline after control for covariates in source studies (phi=1.70, p<0.001). This effect was robust to correction for both multiple predictors and multiple outcome measures and was the result of integrating data derived from more than 47000 individuals. This study affirms that the link between behavioural brain reserve and incident dementia is most likely due to fundamentally different cognitive trajectories rather than confound factors.
Chan, Kwun Chuen Gary; Yam, Sheung Chi Phillip; Zhang, Zheng
2015-01-01
The estimation of average treatment effects based on observational data is extremely important in practice and has been studied by generations of statisticians under different frameworks. Existing globally efficient estimators require non-parametric estimation of a propensity score function, an outcome regression function or both, but their performance can be poor in practical sample sizes. Without explicitly estimating either function, we consider a wide class of calibration weights constructed to attain an exact three-way balance of the moments of observed covariates among the treated, the control, and the combined group. The wide class includes exponential tilting, empirical likelihood and generalized regression as important special cases, and extends survey calibration estimators to different statistical problems and with important distinctions. Global semiparametric efficiency for the estimation of average treatment effects is established for this general class of calibration estimators. The results show that efficiency can be achieved by solely balancing the covariate distributions without resorting to direct estimation of the propensity score or outcome regression function. We also propose a consistent estimator for the efficient asymptotic variance, which does not involve additional functional estimation of either the propensity score or the outcome regression functions. The proposed variance estimator outperforms existing estimators that require a direct approximation of the efficient influence function. PMID:27346982
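The exponential-tilting special case, restricted to matching control-group means to treated-group means, can be sketched via its convex dual (our simplification; the paper balances three groups and proves global efficiency):

```python
import numpy as np
from scipy.optimize import minimize

def exponential_tilting_weights(X_control, target_means):
    """Calibration weights on the control group whose weighted covariate
    means exactly match `target_means` (e.g. the treated-group means)."""
    Xc = X_control - target_means                 # center at the target
    def dual(lam):                                # convex dual objective
        return np.log(np.mean(np.exp(Xc @ lam)))
    lam = minimize(dual, np.zeros(Xc.shape[1]), method="BFGS").x
    w = np.exp(Xc @ lam)
    return w / w.sum()

rng = np.random.default_rng(7)
X_treat = rng.normal(0.5, 1, (150, 2))
X_ctrl = rng.normal(0.0, 1, (300, 2))
w = exponential_tilting_weights(X_ctrl, X_treat.mean(0))
print(w @ X_ctrl, X_treat.mean(0))    # weighted control means match treated
```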
Aghamousa, Amir; Shafieloo, Arman; Arjunwadkar, Mihir; Souradeep, Tarun
2015-02-01
Estimation of the angular power spectrum is one of the important steps in Cosmic Microwave Background (CMB) data analysis. Here, we present a nonparametric estimate of the temperature angular power spectrum for the Planck 2013 CMB data. The method implemented in this work is model-independent, and allows the data, rather than the model, to dictate the fit. Since one of the main targets of our analysis is to test the consistency of the ΛCDM model with Planck 2013 data, we use the nuisance parameters associated with the best-fit ΛCDM angular power spectrum to remove foreground contributions from the data at multipoles ℓ ≥50. We thus obtain a combined angular power spectrum data set together with the full covariance matrix, appropriately weighted over frequency channels. Our subsequent nonparametric analysis resolves six peaks (and five dips) up to ℓ ∼1850 in the temperature angular power spectrum. We present uncertainties in the peak/dip locations and heights at the 95% confidence level. We further show how these reflect the harmonicity of acoustic peaks, and can be used for acoustic scale estimation. Based on this nonparametric formalism, we found the best-fit ΛCDM model to be at 36% confidence distance from the center of the nonparametric confidence set—this is considerably larger than the confidence distance (9%) derived earlier from a similar analysis of the WMAP 7-year data. Another interesting result of our analysis is that at low multipoles, the Planck data do not suggest any upturn, contrary to the expectation based on the integrated Sachs-Wolfe contribution in the best-fit ΛCDM cosmology.
Covariate-free and Covariate-dependent Reliability.
Bentler, Peter M
2016-12-01
Classical test theory reliability coefficients are said to be population specific. Reliability generalization, a meta-analysis method, is the main procedure for evaluating the stability of reliability coefficients across populations. A new approach is developed to evaluate the degree of invariance of reliability coefficients to population characteristics. Factor or common variance of a reliability measure is partitioned into parts that are, and are not, influenced by control variables, resulting in a partition of reliability into a covariate-dependent and a covariate-free part. The approach can be implemented in a single sample and can be applied to a variety of reliability coefficients.
Levy Matrices and Financial Covariances
NASA Astrophysics Data System (ADS)
Burda, Zdzislaw; Jurkiewicz, Jerzy; Nowak, Maciej A.; Papp, Gabor; Zahed, Ismail
2003-10-01
In a given market, financial covariances capture the intra-stock correlations and can be used to address statistically the bulk nature of the market as a complex system. We provide a statistical analysis of three SP500 covariances with evidence for raw tail distributions. We study the stability of these tails against reshuffling for the SP500 data and show that the covariance with the strongest tails is robust, with a spectral density in remarkable agreement with random Lévy matrix theory. We study the inverse participation ratio for the three covariances. The strong localization observed at both ends of the spectral density is analogous to the localization exhibited in the random Lévy matrix ensemble. We discuss two competitive mechanisms responsible for the occurrence of an extensive and delocalized eigenvalue at the edge of the spectrum: (a) the Lévy character of the entries of the correlation matrix and (b) a sort of off-diagonal order induced by underlying inter-stock correlations. (b) can be destroyed by reshuffling, while (a) cannot. We show that the stocks with the largest scattering are the least susceptible to correlations, and likely candidates for the localized states. We introduce a simple model for price fluctuations which captures behavior of the SP500 covariances. It may be of importance for assets diversification.
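The spectral diagnostics used here, the eigenvalue spectrum and the inverse participation ratio, are straightforward to compute; a toy version on simulated heavy-tailed returns (not SP500 data):

```python
import numpy as np

def ipr(eigenvectors):
    """Inverse participation ratio per eigenvector: ~1/N when delocalized
    over all N stocks, O(1) when localized on a few."""
    return np.sum(eigenvectors ** 4, axis=0)

rng = np.random.default_rng(8)
returns = rng.standard_t(df=3, size=(1000, 100))     # heavy-tailed toy returns
C = np.corrcoef(returns, rowvar=False)
evals, evecs = np.linalg.eigh(C)                     # ascending eigenvalues
print(evals[-1], ipr(evecs)[-1])   # largest eigenvalue and its IPR
print(ipr(evecs)[:3])              # localization at the lower spectrum edge
```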
Nonparametric statistical testing of EEG- and MEG-data.
Maris, Eric; Oostenveld, Robert
2007-08-15
In this paper, we show how ElectroEncephaloGraphic (EEG) and MagnetoEncephaloGraphic (MEG) data can be analyzed statistically using nonparametric techniques. Nonparametric statistical tests offer complete freedom to the user with respect to the test statistic by means of which the experimental conditions are compared. This freedom provides a straightforward way to solve the multiple comparisons problem (MCP) and it allows to incorporate biophysically motivated constraints in the test statistic, which may drastically increase the sensitivity of the statistical test. The paper is written for two audiences: (1) empirical neuroscientists looking for the most appropriate data analysis method, and (2) methodologists interested in the theoretical concepts behind nonparametric statistical tests. For the empirical neuroscientist, a large part of the paper is written in a tutorial-like fashion, enabling neuroscientists to construct their own statistical test, maximizing the sensitivity to the expected effect. And for the methodologist, it is explained why the nonparametric test is formally correct. This means that we formulate a null hypothesis (identical probability distribution in the different experimental conditions) and show that the nonparametric test controls the false alarm rate under this null hypothesis.
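The simplest test in this family uses the maximum |t| over all sensors and time points as the single test statistic, which controls the family-wise error rate by construction (a minimal sketch; the paper's cluster-based statistics are a more sensitive variant of the same permutation logic):

```python
import numpy as np

def permutation_test_max_t(cond_a, cond_b, n_perm=2000, seed=0):
    """Permutation test over trials x features data. Using the maximum |t|
    across all features as the test statistic solves the multiple
    comparisons problem by construction."""
    rng = np.random.default_rng(seed)
    data = np.concatenate([cond_a, cond_b])       # trials x features
    n_a = len(cond_a)

    def max_abs_t(d):
        a, b = d[:n_a], d[n_a:]
        t = (a.mean(0) - b.mean(0)) / np.sqrt(
            a.var(0, ddof=1) / len(a) + b.var(0, ddof=1) / len(b))
        return np.max(np.abs(t))

    observed = max_abs_t(data)
    null = np.array([max_abs_t(rng.permutation(data)) for _ in range(n_perm)])
    return np.mean(null >= observed)              # permutation p-value

rng = np.random.default_rng(9)
a = rng.normal(0, 1, (30, 64))                    # 30 trials, 64 channels
b = rng.normal(0, 1, (30, 64)); b[:, 10] += 1.0   # an effect in one channel
print(permutation_test_max_t(a, b))
```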
A class of covariate-dependent spatiotemporal covariance functions.
Reich, Brian J; Eidsvik, Jo; Guindani, Michele; Nail, Amy J; Schmidt, Alexandra M
2011-12-01
In geostatistics, it is common to model spatially distributed phenomena through an underlying stationary and isotropic spatial process. However, these assumptions are often untenable in practice because of the influence of local effects in the correlation structure. Therefore, it has been of prolonged interest in the literature to provide flexible and effective ways to model non-stationarity in the spatial effects. Arguably, due to the local nature of the problem, we might envision that the correlation structure would be highly dependent on local characteristics of the domain of study, namely the latitude, longitude and altitude of the observation sites, as well as other locally defined covariate information. In this work, we provide a flexible and computationally feasible way for allowing the correlation structure of the underlying processes to depend on local covariate information. We discuss the properties of the induced covariance functions and discuss methods to assess its dependence on local covariate information by means of a simulation study and the analysis of data observed at ozone-monitoring stations in the Southeast United States.
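One classical construction that lets a local covariate drive the correlation structure is Gibbs' nonstationary kernel with an input-dependent length scale (shown for illustration; it is not the model proposed in the paper):

```python
import numpy as np

def gibbs_covariance(s, length_scales, sigma2=1.0):
    """Valid nonstationary 1-D covariance with input-dependent length
    scale l(s); here l is driven by a locally observed covariate."""
    s = np.asarray(s, float)
    l = np.asarray(length_scales, float)
    li, lj = l[:, None], l[None, :]
    d2 = (s[:, None] - s[None, :]) ** 2
    return sigma2 * np.sqrt(2 * li * lj / (li**2 + lj**2)) \
           * np.exp(-d2 / (li**2 + lj**2))

sites = np.linspace(0, 10, 6)
altitude = np.linspace(0, 1, 6)                       # a local covariate
K = gibbs_covariance(sites, 0.5 + 2.0 * altitude)
print(np.round(K, 2))   # correlations decay faster where altitude is low
```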
A relativistically covariant random walk
NASA Astrophysics Data System (ADS)
Almaguer, J.; Larralde, H.
2007-08-01
In this work we present and analyze an extremely simple relativistically covariant random walk model. In our approach, the probability density and the flow of probability arise naturally as the components of a four-vector and they are related to one another via a tensorial constitutive equation. We show that the system can be described in terms of an underlying invariant space time random walk parameterized by the number of sojourns. Finally, we obtain explicit expressions for the moments of the covariant random walk as well as for the underlying invariant random walk.
Evaluation of Tungsten Nuclear Reaction Data with Covariances
Trkov, A.; Capote, R.; Kodeli, I.; Leal, Luiz C.
2008-12-01
As a follow-up of the work presented at the ND-2007 conference in Nice, additional fast reactor benchmarks were analyzed. Adjustment to the cross sections in the keV region was necessary. Evaluated neutron cross section data files for 180,182,183,184,186W isotopes were produced. Covariances were generated for all isotopes except 180W. In the resonance range the retro-active method was used. Above the resolved resonance range the covariance prior was generated by the Monte Carlo technique from nuclear model calculations with the Empire-II code. Experimental data were taken into account through the GANDR system using the generalized least-squares technique. Introducing experimental data results in relatively small changes in the cross sections, but greatly constrains the uncertainties. The covariance files are currently undergoing testing.
Predicting Market Impact Costs Using Nonparametric Machine Learning Models
Park, Saerom; Lee, Jaewook; Son, Youngdoo
2016-01-01
Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural networks, Gaussian processes, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data for the US stock market from the Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art parametric benchmark, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance. PMID:26926235
Simulation study for biresponses nonparametric regression model using MARS
NASA Astrophysics Data System (ADS)
Ampulembang, Ayub Parlin; Otok, Bambang Widjanarko; Rumiati, Agnes Tuti; Budiasih
2016-02-01
In statistical modeling, especially regression analysis, we often find a relationship pattern between two responses and several predictors in which the two responses are correlated with each other. When the assumed form of the pattern is unknown, the regression curve can be estimated using biresponses nonparametric regression. One method often used in nonparametric regression with a single response is Multivariate Adaptive Regression Splines (MARS). This paper aims to assess the ability of MARS to estimate a biresponses nonparametric regression model through a simulation study with different sample sizes (n) and error variances (σ2). We use R-square and MSE as the goodness-of-fit criteria. Results show that a smaller error variance gives better estimation than a bigger one, yielding higher R-square and smaller MSE values, whereas varying the sample size has little effect on the accuracy of the model, because the values of R-square and MSE tend to be the same across different sample sizes.
Mathematical models for nonparametric inferences from line transect data
Burnham, K.P.; Anderson, D.R.
1976-01-01
A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right angle or sighting distances. The probability of observing a point given its right angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown there are nonparametric approaches to density estimation using the observed right angle distances. The model is then generalized to include sighting distances (r). Let f(y|r) be the conditional distribution of right angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires that we know the transformation of r given by f(0|r).
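Under g(0) = 1, the standard nonparametric line-transect estimator is D = n * f(0) / (2L), with f the density of right-angle distances and L the transect length; a minimal sketch using a boundary-reflected kernel estimate of f(0) (the bandwidth rule and example numbers are ours):

```python
import numpy as np

def line_transect_density(y, L, bandwidth=None):
    """Nonparametric line-transect density estimate D = n * f(0) / (2L),
    with f(0) estimated by a Gaussian kernel reflected about zero."""
    y = np.asarray(y, float)
    n = y.size
    h = bandwidth or 1.06 * y.std(ddof=1) * n ** -0.2   # Silverman's rule
    f0 = 2.0 * np.mean(np.exp(-0.5 * (y / h) ** 2)) / (h * np.sqrt(2 * np.pi))
    return n * f0 / (2.0 * L)

rng = np.random.default_rng(11)
y = np.abs(rng.normal(0, 30, 80))   # 80 detections, half-normal fall-off (m)
print(line_transect_density(y, L=5000.0))   # animals per square metre
```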
Randomised P-values and nonparametric procedures in multiple testing
Habiger, Joshua D.; Peña, Edsel A.
2014-01-01
The validity of many multiple hypothesis testing procedures for false discovery rate (FDR) control relies on the assumption that P-value statistics are uniformly distributed under the null hypotheses. However, this assumption fails if the test statistics have discrete distributions or if the distributional model for the observables is misspecified. A stochastic process framework is introduced that, with the aid of a uniform variate, admits P-value statistics to satisfy the uniformity condition even when test statistics have discrete distributions. This allows nonparametric tests to be used to generate P-value statistics satisfying the uniformity condition. The resulting multiple testing procedures are therefore endowed with robustness properties. Simulation studies suggest that nonparametric randomised test P-values allow for these FDR methods to perform better when the model for the observables is nonparametric or misspecified. PMID:25419090
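The uniformity-restoring device can be sketched directly: for a discrete statistic T observed at t, the randomized P-value P(T > t) + U * P(T = t) is exactly Uniform(0,1) under the null (our illustration on a binomial null):

```python
import numpy as np
from scipy.stats import binom

def randomized_p_value(t_obs, support, pmf, rng):
    """Randomized P-value for a discrete statistic T observed at t_obs:
    P(T > t_obs) + U * P(T = t_obs) is exactly Uniform(0,1) under H0."""
    return (pmf[support > t_obs].sum()
            + rng.random() * pmf[support == t_obs].sum())

support = np.arange(11)
pmf = binom.pmf(support, 10, 0.5)      # Binomial(10, 0.5) null, upper tail
rng = np.random.default_rng(10)
draws = [randomized_p_value(rng.binomial(10, 0.5), support, pmf, rng)
         for _ in range(5000)]
print(np.mean(np.asarray(draws) <= 0.05))   # ~0.05: uniform under the null
```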
Graph embedded nonparametric mutual information for supervised dimensionality reduction.
Bouzas, Dimitrios; Arvanitopoulos, Nikolaos; Tefas, Anastasios
2015-05-01
In this paper, we propose a novel algorithm for dimensionality reduction that uses as its criterion the mutual information (MI) between the transformed data and their corresponding class labels. MI is a powerful criterion that can be used as a proxy for the Bayes error rate. Furthermore, recent quadratic nonparametric implementations of MI are computationally efficient and do not require any prior assumptions about the class densities. We show that quadratic nonparametric MI can be formulated as a kernel objective in the graph embedding framework. Moreover, we propose its linear equivalent as a novel linear dimensionality reduction algorithm. The derived methods are compared against state-of-the-art dimensionality reduction algorithms with various classifiers and on various benchmark and real-life datasets. The experimental results show that nonparametric MI as an optimization objective for dimensionality reduction gives comparable and, in most cases, better results than other dimensionality reduction methods.
AFCI-2.0 Neutron Cross Section Covariance Library
Herman, M.; Oblozinsky, P.; Mattoon, C.M.; Pigni, M.; Hoblit, S.; Mughabghab, S.F.; Sonzogni, A.; Talou, P.; Chadwick, M.B.; Hale, G.M.; Kahler, A.C.; Kawano, T.; Little, R.C.; Yount, P.G.
2011-03-01
The cross section covariance library has been under development through a BNL-LANL collaborative effort over the last three years. The project builds on two covariance libraries developed earlier, with considerable input from BNL and LANL. In 2006, an international effort under WPEC Subgroup 26 produced the BOLNA covariance library by putting together data, often preliminary, from various sources for the most important materials for nuclear reactor technology. This was followed in 2007 by a collaborative effort of four US national laboratories to produce covariances, often of modest quality (hence the name low-fidelity), for a virtually complete set of materials included in ENDF/B-VII.0. The present project focuses on covariances of 4-5 major reaction channels for 110 materials of importance for power reactors. The work started under the Global Nuclear Energy Partnership (GNEP) in 2008, which changed to the Advanced Fuel Cycle Initiative (AFCI) in 2009. With the 2011 release the name has changed to the Covariance Multigroup Matrix for Advanced Reactor Applications (COMMARA), version 2.0. The primary purpose of the library is to provide covariances for the AFCI data adjustment project, which focuses on the needs of fast advanced burner reactors. BNL's responsibility was defined as developing covariances for structural materials and fission products, managing the library, and coordinating the work; LANL's responsibility was defined as covariances for light nuclei and actinides. The COMMARA-2.0 covariance library was developed by the BNL-LANL collaboration for Advanced Fuel Cycle Initiative applications over a period of three years, 2008-2010. It contains covariances for 110 materials relevant to fast reactor R&D. The library is to be used together with the central values of ENDF/B-VII.0, the latest official release of the US files of evaluated neutron cross sections. The COMMARA-2.0 library contains neutron cross section covariances for 12 light nuclei (coolants and moderators), 78 structural…
Nonparametric estimation of a convex bathtub-shaped hazard function.
Jankowski, Hanna K; Wellner, Jon A
2009-11-01
In this paper, we study the nonparametric maximum likelihood estimator (MLE) of a convex hazard function. We show that the MLE is consistent and converges at a local rate of n^(2/5) at points x_0 where the true hazard function is positive and strictly convex. Moreover, we establish the pointwise asymptotic distribution theory of our estimator under these same assumptions. One notable feature of the nonparametric MLE studied here is that no arbitrary choice of tuning parameter (or complicated data-adaptive selection of the tuning parameter) is required. PMID:20383267
Condition Number Regularized Covariance Estimation.
Won, Joong-Ho; Lim, Johan; Kim, Seung-Jean; Rajaratnam, Bala
2013-06-01
Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p, small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumption on either the covariance matrix or its inverse is imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required. PMID:23730197
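A hedged sketch of the idea: sample eigenvalues are truncated to an interval [tau, kappa*tau] so the condition number is capped at kappa, with tau chosen here by a crude grid search over the Gaussian negative log-likelihood; this simplifies the exact solution path derived in the paper.

```python
import numpy as np

def cond_reg_cov(X, kappa=10.0, grid=200):
    S = np.cov(X, rowvar=False)
    l, V = np.linalg.eigh(S)
    l = np.clip(l, 1e-12, None)                 # guard zero eigenvalues
    def nll(tau):                               # Gaussian NLL in eigenvalues
        lam = np.clip(l, tau, kappa * tau)
        return np.sum(np.log(lam) + l / lam)
    tau = min(np.geomspace(l.min(), l.max(), grid), key=nll)
    lam = np.clip(l, tau, kappa * tau)          # truncate to [tau, kappa*tau]
    return (V * lam) @ V.T

rng = np.random.default_rng(4)
X = rng.normal(size=(25, 40))                   # "large p, small n"
Sig = cond_reg_cov(X, kappa=10.0)
print("condition number:", np.linalg.cond(Sig))  # capped near kappa
```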
Covariation Neglect among Novice Investors
ERIC Educational Resources Information Center
Hedesstrom, Ted Martin; Svedsater, Henrik; Garling, Tommy
2006-01-01
In 4 experiments, undergraduates made hypothetical investment choices. In Experiment 1, participants paid more attention to the volatility of individual assets than to the volatility of aggregated portfolios. The results of Experiment 2 show that most participants diversified even when this increased risk because of covariation between the returns…
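The technical point behind the second experiment can be seen in a two-asset calculation: with high return covariation, splitting money between a safe and a volatile asset raises portfolio risk above holding the safe asset alone. The volatilities and correlations below are made-up numbers.

```python
import numpy as np

sd = np.array([0.20, 0.30])              # asset volatilities (made up)
w = np.array([0.5, 0.5])                 # naive 50/50 "diversification"
for rho in (0.0, 0.9):
    C = np.array([[1, rho], [rho, 1]]) * np.outer(sd, sd)  # covariance
    print(f"rho={rho}: portfolio sd = {np.sqrt(w @ C @ w):.3f}"
          f" vs asset 1 alone = {sd[0]:.2f}")
# rho=0 lowers risk below 0.20; rho=0.9 raises it above 0.20.
```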
Covariance Modifications to Subspace Bases
Harris, D B
2008-11-19
Adaptive signal processing algorithms that rely upon representations of signal and noise subspaces often require updates to those representations when new data become available. Subspace representations frequently are estimated from available data with singular value decompositions (SVDs). Subspace updates require modifications to these decompositions. Updates can be performed inexpensively provided they are low-rank. A substantial literature on SVD updates exists, frequently focusing on rank-1 updates (see e.g. [Karasalo, 1986; Comon and Golub, 1990; Badeau, 2004]). In these methods, data matrices are modified by addition or deletion of a row or column, or data covariance matrices are modified by addition of the outer product of a new vector. A recent paper by Brand [2006] provides a general and efficient method for arbitrary-rank updates to an SVD. The purpose of this note is to describe a closely related method for applications where right singular vectors are not required. This note also describes SVD updates for a particular scenario of interest in seismic array signal processing: updating the wideband subspace representation used in seismic subspace detectors [Harris, 2006]. These subspace detectors generalize waveform correlation algorithms to detect signals that lie in a subspace of waveforms of dimension d ≥ 1. They are potentially of interest because they extend the range of waveform variation over which these sensitive detectors apply. Subspace detectors operate by projecting waveform data from a detection window onto a subspace specified by a collection of orthonormal waveform basis vectors (referred to as the template). Subspace templates are constructed from a suite of normalized, aligned master event waveforms that may be acquired by a single sensor, a three-component sensor, an array of such sensors, or a sensor network. The template design process entails constructing a data matrix whose columns contain the…
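A hedged sketch of the detection statistic described here: an orthonormal template U is taken from the SVD of aligned master waveforms, and each sliding window x is scored by the fraction of its energy captured by the subspace, c = ||U'x||^2 / ||x||^2; the waveforms are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(5)
win = 200
base = rng.normal(size=win)                       # common event waveform
masters = base[:, None] + 0.3 * rng.normal(size=(win, 5))  # aligned masters
U = np.linalg.svd(masters, full_matrices=False)[0][:, :2]  # rank-2 template

stream = rng.normal(size=5000)                    # background noise
stream[3000:3000 + win] += 2 * base               # buried event

stat = np.array([
    np.sum((U.T @ stream[i:i + win]) ** 2) / np.sum(stream[i:i + win] ** 2)
    for i in range(len(stream) - win)
])
print("peak statistic at sample", int(stat.argmax()))  # near 3000
```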
Covariant Formulations of Superstring Theories.
NASA Astrophysics Data System (ADS)
Mikovic, Aleksandar Radomir
1990-01-01
Chapter 1 contains a brief introduction to the subject of string theory and tries to motivate the study of superstrings and covariant formulations. Chapter 2 describes the Green-Schwarz formulation of the superstrings. The Hamiltonian and BRST structure of the theory is analysed in the case of the superparticle. Implications for the superstring case are discussed. Chapter 3 describes Siegel's formulation of the superstring, which contains only the first-class constraints. It is shown that the physical spectrum coincides with that of the Green-Schwarz formulation. In chapter 4 we analyse the BRST structure of Siegel's formulation. We show that the BRST charge has the wrong cohomology, and propose a modification, called the first ilk, which gives the right cohomology. We also propose another superparticle model, called the second ilk, which has infinitely many coordinates and constraints. We construct the complete BRST charge for it, and show that it gives the correct cohomology. In chapter 5 we analyse the properties of the covariant vertex operators and the corresponding S-matrix elements by using Siegel's formulation. We conclude that knowledge of the ghosts is necessary, even at the tree level, in order to obtain the correct S-matrix. In chapter 6 we attempt to calculate the superstring loops in a covariant gauge. We calculate the vacuum-to-vacuum amplitude, which is also the cosmological constant. We show that it vanishes to all loop orders, under the assumption that the free covariant gauge-fixed action exists. In chapter 7 we present our conclusions and briefly discuss the random lattice approach to string theory as a possible way of resolving the problem of covariant quantization and the nonperturbative definition of the superstrings.
A Nonparametric Statistical Method That Improves Physician Cost of Care Analysis
Metfessel, Brent A; Greene, Robert A
2012-01-01
Objective: To develop a compositing method that demonstrates improved performance compared with commonly used tests for statistical analysis of physician cost of care data. Data Source: Commercial preferred provider organization (PPO) claims data for internists from a large metropolitan area. Study Design: We created a nonparametric composite performance metric that maintains risk adjustment using the Wilcoxon rank-sum (WRS) test. We compared the resulting algorithm to the parametric observed-to-expected ratio, with and without a statistical test, for stability of physician cost ratings among different outlier trimming methods and across two partially overlapping time periods. Principal Findings: The WRS algorithm showed significantly greater within-physician stability among several typical outlier trimming and capping methods. The algorithm also showed significantly greater within-physician stability when the same physicians were analyzed across time periods. Conclusions: The nonparametric algorithm described is a more robust and more stable methodology for evaluating physician cost of care than commonly used observed-to-expected ratio techniques. Use of such an algorithm can improve physician cost assessment for important current applications such as public reporting, pay for performance, and tiered benefit design. PMID:22524195
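A hedged sketch of the compositing idea: each physician's episode costs are compared with the pooled costs of peers via the Wilcoxon rank-sum (Mann-Whitney) test; the cost distributions and the 10-physician panel are fabricated for illustration.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(6)
costs = {f"dr{i}": rng.lognormal(7 + 0.25 * (i == 0), 0.6, 80)
         for i in range(10)}              # dr0 is systematically costlier

for doc, own in costs.items():
    peers = np.concatenate([v for k, v in costs.items() if k != doc])
    p = mannwhitneyu(own, peers, alternative="greater").pvalue
    print(doc, f"one-sided WRS p = {p:.3f}")
```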
A Simulation Comparison of Parametric and Nonparametric Dimensionality Detection Procedures
ERIC Educational Resources Information Center
Mroch, Andrew A.; Bolt, Daniel M.
2006-01-01
Recently, nonparametric methods have been proposed that provide a dimensionally based description of test structure for tests with dichotomous items. Because such methods are based on different notions of dimensionality than are assumed when using a psychometric model, it remains unclear whether these procedures might lead to a different…
Estimation of Spatial Dynamic Nonparametric Durbin Models with Fixed Effects
ERIC Educational Resources Information Center
Qian, Minghui; Hu, Ridong; Chen, Jianwei
2016-01-01
Spatial panel data models have been widely studied and applied in both scientific and social science disciplines, especially in the analysis of spatial influence. In this paper, we consider the spatial dynamic nonparametric Durbin model (SDNDM) with fixed effects, which takes the nonlinear factors into account based on the spatial dynamic panel…
A Unifying Framework for Teaching Nonparametric Statistical Tests
ERIC Educational Resources Information Center
Bargagliotti, Anna E.; Orrison, Michael E.
2014-01-01
Increased importance is being placed on statistics at both the K-12 and undergraduate level. Research divulging effective methods to teach specific statistical concepts is still widely sought after. In this paper, we focus on best practices for teaching topics in nonparametric statistics at the undergraduate level. To motivate the work, we…
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Regionalizing nonparametric models of precipitation amounts on different temporal scales
NASA Astrophysics Data System (ADS)
Mosthaf, Tobias; Bárdossy, András
2017-05-01
Parametric distribution functions are commonly used to model precipitation amounts corresponding to different durations. The precipitation amounts themselves are crucial for stochastic rainfall generators and weather generators. Nonparametric kernel density estimates (KDEs) offer a more flexible way to model precipitation amounts. As their name indicates, however, these models do not have parameters that can easily be regionalized to run rainfall generators at ungauged as well as gauged locations. To overcome this deficiency, we present a new interpolation scheme for nonparametric models and evaluate it for different temporal resolutions ranging from hourly to monthly. During the evaluation, the nonparametric methods are compared to commonly used parametric models such as the two-parameter gamma and the mixed-exponential distribution. As water volume is considered an essential quantity for applications like flood modeling, a Lorenz-curve-based criterion is also introduced. To add value to the estimation at sub-daily resolutions, we incorporated the plentiful daily measurements into the interpolation scheme and evaluated this idea. The study region is the federal state of Baden-Württemberg in the southwest of Germany, with more than 500 rain gauges. The validation results show that the newly proposed nonparametric interpolation scheme provides reasonable results and that the incorporation of daily values in the regionalization of sub-daily models is very beneficial.
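To make the parametric/nonparametric contrast concrete, the sketch below fits a two-parameter gamma model and a kernel density estimate (on log amounts, with the Jacobian back-transform) to synthetic wet-day amounts; it illustrates the model families compared, not the paper's regionalization scheme.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
amounts = rng.gamma(0.7, 8.0, 500)        # wet-day amounts in mm (synthetic)

a, loc, scale = stats.gamma.fit(amounts, floc=0)   # parametric model
kde = stats.gaussian_kde(np.log(amounts))          # nonparametric, log scale

x = np.array([0.5, 2.0, 8.0, 20.0, 40.0])
print("gamma pdf:", np.round(stats.gamma.pdf(x, a, loc, scale), 4))
print("KDE pdf:  ", np.round(kde(np.log(x)) / x, 4))  # Jacobian 1/x
```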
Joint Entropy Minimization for Learning in Nonparametric Framework
2006-06-09
Three Classes of Nonparametric Differential Step Functioning Effect Estimators
ERIC Educational Resources Information Center
Penfield, Randall D.
2008-01-01
The examination of measurement invariance in polytomous items is complicated by the possibility that the magnitude and sign of lack of invariance may vary across the steps underlying the set of polytomous response options, a concept referred to as differential step functioning (DSF). This article describes three classes of nonparametric DSF effect…
Nonparametric Item Response Curve Estimation with Correction for Measurement Error
ERIC Educational Resources Information Center
Guo, Hongwen; Sinharay, Sandip
2011-01-01
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
A New Nonparametric Levene Test for Equal Variances
ERIC Educational Resources Information Center
Nordstokke, David W.; Zumbo, Bruno D.
2010-01-01
Tests of the equality of variances are sometimes used on their own to compare variability across groups of experimental or non-experimental conditions but they are most often used alongside other methods to support assumptions made about variances. A new nonparametric test of equality of variances is described and compared to current "gold…
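A hedged sketch of a rank-based Levene procedure in the spirit of the test described: pool and rank all observations, then apply the classical Levene test to the ranks; whether this matches the authors' exact formulation is an assumption, and the skewed data are synthetic.

```python
import numpy as np
from scipy.stats import rankdata, levene

rng = np.random.default_rng(8)
g1 = rng.exponential(1.0, 40)             # skewed, smaller spread
g2 = rng.exponential(2.5, 40)             # skewed, larger spread

ranks = rankdata(np.concatenate([g1, g2]))  # pooled rank transform
print("Levene on raw scores:", levene(g1, g2).pvalue)
print("Levene on ranks:     ", levene(ranks[:40], ranks[40:]).pvalue)
```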
The importance of covariate selection in controlling for selection bias in observational studies.
Steiner, Peter M; Cook, Thomas D; Shadish, William R; Clark, M H
2010-09-01
The assumption of strongly ignorable treatment assignment is required for eliminating selection bias in observational studies. To meet this assumption, researchers often rely on a strategy of selecting covariates that they think will control for selection bias. Theory indicates that the most important covariates are those highly correlated with both the real selection process and the potential outcomes. However, when planning a study, it is rarely possible to identify such covariates with certainty. In this article, we report on an extensive reanalysis of a within-study comparison that contrasts a randomized experiment and a quasi-experiment. Various covariate sets were used to adjust for initial group differences in the quasi-experiment that was characterized by self-selection into treatment. The adjusted effect sizes were then compared with the experimental ones to identify which individual covariates, and which conceptually grouped sets of covariates, were responsible for the high degree of bias reduction achieved in the adjusted quasi-experiment. Such results provide strong clues about preferred strategies for identifying the covariates most likely to reduce bias when planning a study and when the true selection process is not known.
A simulation-based marginal method for longitudinal data with dropout and mismeasured covariates.
Yi, Grace Y
2008-07-01
Longitudinal data often contain missing observations and error-prone covariates. Extensive attention has been directed to analysis methods to adjust for the bias induced by missing observations. There is relatively little work on investigating the effects of covariate measurement error on estimation of the response parameters, especially on simultaneously accounting for the biases induced by both missing values and mismeasured covariates. It is not clear what the impact of ignoring measurement error is when analyzing longitudinal data with both missing observations and error-prone covariates. In this article, we study the effects of covariate measurement error on estimation of the response parameters for longitudinal studies. We develop an inference method that adjusts for the biases induced by measurement error as well as by missingness. The proposed method does not require the full specification of the distribution of the response vector but only requires modeling its mean and variance structures. Furthermore, the proposed method employs the so-called functional modeling strategy to handle the covariate process, with the distribution of covariates left unspecified. These features, plus the simplicity of implementation, make the proposed method very attractive. In this paper, we establish the asymptotic properties for the resulting estimators. With the proposed method, we conduct sensitivity analyses on a cohort data set arising from the Framingham Heart Study. Simulation studies are carried out to evaluate the impact of ignoring covariate measurement error and to assess the performance of the proposed method.
Ciolino, Jody; Zhao, Wenle; Martin, Renee’; Palesch, Yuko
2014-01-01
Motivated by potentially serious imbalances of continuous baseline covariates in clinical trials, we investigated the cost in statistical power of ignoring the balance of these covariates in treatment allocation design for a logistic regression model. Based on data from a clinical trial of acute ischemic stroke treatment, computer simulations were used to create scenarios varying from best possible baseline covariate balance to worst possible imbalance, with multiple balance levels between the two extremes. The likelihood of each scenario occurring under simple randomization was evaluated. Power of the main effect test for treatment was examined. Our simulation results show that the worst possible imbalance is highly unlikely, but it can still occur under simple random allocation. Also, power loss could be nontrivial if balancing distributions of important continuous covariates were ignored even if adjustment is made in analysis for important covariates. This situation, although unlikely, is more serious for trials with a small sample size and for covariates with large influence on primary outcome. These results suggest that attempts should be made to balance known prognostic continuous covariates at the design phase of a clinical trial even when adjustment is planned for these covariates at the analysis. PMID:21078415
Mayr, Andreas; Hothorn, Torsten; Fenske, Nora
2012-01-25
The construction of prediction intervals (PIs) for future body mass index (BMI) values of individual children based on a recent German birth cohort study with n = 2007 children is problematic for standard parametric approaches, as the BMI distribution in childhood is typically skewed depending on age. We avoid distributional assumptions by directly modelling the borders of PIs by additive quantile regression, estimated by boosting. We point out the concept of conditional coverage to prove the accuracy of PIs. As conditional coverage can hardly be evaluated in practical applications, we conduct a simulation study before fitting child- and covariate-specific PIs for future BMI values and BMI patterns for the present data. The results of our simulation study suggest that PIs fitted by quantile boosting cover future observations with the predefined coverage probability and outperform the benchmark approach. For the prediction of future BMI values, quantile boosting automatically selects informative covariates and adapts to the age-specific skewness of the BMI distribution. The lengths of the estimated PIs are child-specific and increase, as expected, with the age of the child. Quantile boosting is a promising approach to construct PIs with correct conditional coverage in a non-parametric way. It is in particular suitable for the prediction of BMI patterns depending on covariates, since it provides an interpretable predictor structure, inherent variable selection properties and can even account for longitudinal data structures.
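A hedged sketch of covariate-specific prediction intervals by boosted quantile regression: two models fit the 5th and 95th conditional percentiles, giving a 90% PI whose width can grow with age. scikit-learn's tree-based booster stands in for the additive quantile boosting used in the study, and the growth data are synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(9)
age = rng.uniform(0, 10, 2000)
bmi = 15 + 0.4 * age + rng.gamma(2, 0.5 + 0.1 * age)  # skewed, age-varying

X = age.reshape(-1, 1)
lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, bmi)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, bmi)

for a in (1.0, 5.0, 9.0):
    l, h = lo.predict([[a]])[0], hi.predict([[a]])[0]
    print(f"age {a}: 90% PI = [{l:.1f}, {h:.1f}]")
```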
ERIC Educational Resources Information Center
Gierl, Mark J.; Bolt, Daniel M.
2001-01-01
Presents an overview of nonparametric regression as it applies to differential item functioning analysis and then provides three examples to illustrate how nonparametric regression can be applied to multilingual, multicultural data to study group differences. (SLD)
Szekeres models: a covariant approach
NASA Astrophysics Data System (ADS)
Apostolopoulos, Pantelis S.
2017-05-01
We exploit the 1+1+2 formalism to covariantly describe the inhomogeneous and anisotropic Szekeres models. It is shown that an average scale length can be defined covariantly which satisfies a 2d equation of motion driven by the effective gravitational mass (EGM) contained in the dust cloud. The contributions to the EGM are encoded in the energy density of the dust fluid and the free gravitational field E_ab. We show that the quasi-symmetric property of the Szekeres models is justified through the existence of 3 independent intrinsic Killing vector fields (IKVFs). In addition, the notions of the apparent and absolute apparent horizons are briefly discussed and we give an alternative gauge-invariant form to define them in terms of the kinematical variables of the spacelike congruences. We argue that the proposed program can be used to express Sachs' optical equations in covariant form and to analyze the confrontation of a spatially inhomogeneous irrotational overdense fluid model with the observational data.
Understanding covariate shift in model performance
McGaughey, Georgia; Walters, W. Patrick; Goldman, Brian
2016-01-01
Three different methods (logistic regression, covariate shift, and k-NN) were applied to five internal datasets and one external, publicly available dataset where covariate shift existed. In all cases, k-NN's performance was inferior to either logistic regression or covariate shift. Surprisingly, there was no obvious advantage to using covariate shift to reweight the training data in the examined datasets. PMID:27803797
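A hedged sketch of the reweighting idea examined above: a probabilistic classifier trained to distinguish training from test inputs yields density-ratio weights p_test(x)/p_train(x) for the training examples; the shifted Gaussian data are an assumption, not the study's datasets.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(10)
X_train = rng.normal(0.0, 1.0, (500, 3))
X_test = rng.normal(0.7, 1.0, (500, 3))    # shifted covariates

Z = np.vstack([X_train, X_test])
s = np.r_[np.zeros(500), np.ones(500)]     # 1 = drawn from test
p = LogisticRegression().fit(Z, s).predict_proba(X_train)[:, 1]
weights = p / (1 - p)                      # ~ p_test(x) / p_train(x)
print("weight range:", weights.min().round(2), "-", weights.max().round(2))
# Pass 'weights' as sample_weight when fitting the downstream model.
```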
Liu, Xiaofeng Steven
2011-05-01
The use of covariates is commonly believed to reduce the unexplained error variance and the standard error for the comparison of treatment means, but the reduction in the standard error is neither guaranteed nor uniform over different sample sizes. The covariate mean differences between the treatment conditions can inflate the standard error of the covariate-adjusted mean difference and can actually produce a larger standard error for the adjusted mean difference than that for the unadjusted mean difference. When the covariate observations are conceived of as randomly varying from one study to another, the covariate mean differences can be related to a Hotelling's T^2. Using this Hotelling's T^2 statistic, one can always find a minimum sample size to achieve a high probability of reducing the standard error and confidence interval width for the adjusted mean difference.
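A small sketch of the quantity the article builds on: Hotelling's T^2 for the baseline covariate mean differences between two arms, T^2 = (n1 n2 / (n1 + n2)) d' S^{-1} d with S the pooled covariance; the simulated covariates are illustrative.

```python
import numpy as np

rng = np.random.default_rng(11)
n1 = n2 = 40
X1 = rng.normal(0, 1, (n1, 3))            # baseline covariates, arm 1
X2 = rng.normal(0, 1, (n2, 3))            # baseline covariates, arm 2

dbar = X1.mean(axis=0) - X2.mean(axis=0)  # covariate mean differences
S = ((n1 - 1) * np.cov(X1, rowvar=False)
     + (n2 - 1) * np.cov(X2, rowvar=False)) / (n1 + n2 - 2)
T2 = n1 * n2 / (n1 + n2) * dbar @ np.linalg.solve(S, dbar)
print(f"Hotelling's T^2 = {T2:.2f}")      # large values signal imbalance
```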
Lorentz-covariant dissipative Lagrangian systems
NASA Technical Reports Server (NTRS)
Kaufman, A. N.
1985-01-01
The concept of dissipative Hamiltonian system is converted to Lorentz-covariant form, with evolution generated jointly by two scalar functionals, the Lagrangian action and the global entropy. A bracket formulation yields the local covariant laws of energy-momentum conservation and of entropy production. The formalism is illustrated by a derivation of the covariant Landau kinetic equation.
Are Maxwell's equations Lorentz-covariant?
NASA Astrophysics Data System (ADS)
Redžić, D. V.
2017-01-01
It is stated in many textbooks that Maxwell's equations are manifestly covariant when written down in tensorial form. We recall that the tensorial form of Maxwell's equations does not secure their tensorial content; they become covariant by postulating certain transformation properties of the field functions. This fact should be stressed when teaching about the covariance of Maxwell's equations.
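For reference, the point can be written out explicitly; the display below states the tensorial field equations together with the transformation law that must be postulated for the field functions (SI units assumed):

```latex
% Tensorial form of Maxwell's equations (SI units):
\partial_\mu F^{\mu\nu} = \mu_0 J^\nu, \qquad
\partial_\alpha F_{\beta\gamma} + \partial_\beta F_{\gamma\alpha}
  + \partial_\gamma F_{\alpha\beta} = 0.
% Covariance follows only after postulating the transformation law
% of the field functions under x' = \Lambda x:
F'^{\mu\nu}(x') = \Lambda^{\mu}{}_{\alpha} \, \Lambda^{\nu}{}_{\beta} \,
  F^{\alpha\beta}(x).
```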
Neutron Cross Section Covariances for Structural Materials and Fission Products
NASA Astrophysics Data System (ADS)
Hoblit, S.; Cho, Y.-S.; Herman, M.; Mattoon, C. M.; Mughabghab, S. F.; Obložinský, P.; Pigni, M. T.; Sonzogni, A. A.
2011-12-01
We describe neutron cross section covariances for 78 structural materials and fission products produced for the new US evaluated nuclear reaction library ENDF/B-VII.1. Neutron incident energies cover the full range from 10 eV to 20 MeV, and covariances are primarily provided for capture, elastic and inelastic scattering, as well as (n,2n). The list of materials follows priorities defined by the Advanced Fuel Cycle Initiative, the major application being data adjustment for advanced fast reactor systems. Thus, in addition to 28 structural materials and 49 fission products, the list also includes 23Na, which is an important fast reactor coolant. Due to the extensive number of materials, we adopted a variety of methodologies depending on the priority of a specific material. In the resolved resonance region we primarily used resonance parameter uncertainties given in the Atlas of Neutron Resonances and either applied the kernel approximation to propagate these uncertainties into cross section uncertainties or resorted to simplified estimates based on integral quantities. For several priority materials we adopted MF32 covariances produced by SAMMY at ORNL, modified by us by adding MF33 covariances to account for systematic uncertainties. In the fast neutron region we resorted to three methods. The most sophisticated was the EMPIRE-KALMAN method, which combines experimental data from the EXFOR library with nuclear reaction modeling and least-squares fitting. The two other methods used simplified estimates, based either on the propagation of nuclear reaction model parameter uncertainties or on a dispersion analysis of central cross section values in recent evaluated data files. All covariances were subject to quality assurance procedures recently adopted by CSEWG. In addition, tools were developed to allow inspection of processed covariances and computed integral quantities, and for comparing these values to data from the Atlas and the astrophysics database KADoNiS.
Harry, Herbert H.
1989-01-01
Apparatus and method for the adjustment and alignment of shafts in high power devices. A plurality of adjacent rotatable angled cylinders are positioned between a base and the shaft to be aligned which when rotated introduce an axial offset. The apparatus is electrically conductive and constructed of a structurally rigid material. The angled cylinders allow the shaft such as the center conductor in a pulse line machine to be offset in any desired alignment position within the range of the apparatus.
Fu, Wei; Simonoff, Jeffrey S
2016-12-26
Tree methods (recursive partitioning) are a popular class of nonparametric methods for analyzing data. One extension of the basic tree methodology is the survival tree, which applies recursive partitioning to censored survival data. There are several existing survival tree methods in the literature, which are mainly designed for right-censored data. We propose two new survival trees for left-truncated and right-censored (LTRC) data, which can be seen as a generalization of the traditional survival tree for right-censored data. Further, we show that such trees can be used to analyze survival data with time-varying covariates, essentially building a time-varying covariates survival tree. Implementation of the methods is easy, and simulations and real data analysis results show that the proposed methods work well for LTRC data and survival data with time-varying covariates, respectively.
Covariance Evaluation Methodology for Neutron Cross Sections
Herman, M.; Arcilla, R.; Mattoon, C.M.; Mughabghab, S.F.; Oblozinsky, P.; Pigni, M.; Pritychenko, B.; Sonzogni, A.A.
2008-09-01
We present the NNDC-BNL methodology for estimating neutron cross section covariances in the thermal, resolved resonance, unresolved resonance and fast neutron regions. The three key elements of the methodology are the Atlas of Neutron Resonances, the nuclear reaction code EMPIRE, and the Bayesian code implementing the Kalman filter concept. The covariance data processing, visualization, and distribution capabilities are integral components of the NNDC methodology. We illustrate its application with examples including a relatively detailed evaluation of covariances for two individual nuclei and massive production of simple covariance estimates for 307 materials. Certain peculiarities regarding the evaluation of covariances for resolved resonances and the consistency between resonance parameter uncertainties and thermal cross section uncertainties are also discussed.
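A minimal sketch of the Bayesian (Kalman-filter) update at the core of this kind of methodology: prior parameters x with covariance C are adjusted by a measurement y with covariance V through a sensitivity matrix G; the toy numbers are assumptions standing in for EMPIRE model parameters and experimental data.

```python
import numpy as np

def kalman_update(x, C, y, V, G):
    """GLS/Kalman update: gain K = C G' (G C G' + V)^{-1}."""
    K = C @ G.T @ np.linalg.inv(G @ C @ G.T + V)
    return x + K @ (y - G @ x), C - K @ G @ C

x = np.array([1.0, 2.0])           # prior model parameters (toy values)
C = np.diag([0.04, 0.09])          # prior parameter covariance
G = np.array([[1.0, 0.5]])         # sensitivity of measurement to parameters
y = np.array([2.3])                # experimental value
V = np.array([[0.01]])             # experimental covariance

x_new, C_new = kalman_update(x, C, y, V, G)
print("posterior parameters:", x_new)
print("posterior variances: ", np.diag(C_new))  # shrink after the update
```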
Phase-covariant quantum benchmarks
Calsamiglia, J.; Aspachs, M.; Munoz-Tapia, R.; Bagan, E.
2009-05-15
We give a quantum benchmark for teleportation and quantum storage experiments suited for pure and mixed test states. The benchmark is based on the average fidelity over a family of phase-covariant states and certifies that an experiment cannot be emulated by a classical setup, i.e., by a measure-and-prepare scheme. We give an analytical solution for qubits, which shows important differences with the standard state estimation approach, and compute the value of the benchmark for coherent and squeezed states, both pure and mixed.
Nonparametric estimation of plant density by the distance method
Patil, S.A.; Burnham, K.P.; Kovner, J.L.
1979-01-01
A relation between the plant density and the probability density function of the nearest neighbor distance (squared) from a random point is established under fairly broad conditions. Based upon this relationship, a nonparametric estimator for the plant density is developed and presented in terms of order statistics. Consistency and asymptotic normality of the estimator are discussed. An interval estimator for the density is obtained. The modifications of this estimator and its variance are given when the distribution is truncated. Simulation results are presented for regular, random and aggregated populations to illustrate the nonparametric estimator and its variance. A numerical example from field data is given. Merits and deficiencies of the estimator are discussed with regard to its robustness and variance.
Nonparametric comparison for panel count data with unequal observation processes.
Zhao, Xingqiu; Sun, Jianguo
2011-09-01
This article considers nonparametric comparison of several treatment groups based on panel count data, which often occur in, among others, medical follow-up studies and reliability experiments concerning recurrent events. For the problem, most of the existing procedures require that observation processes are identical across different treatment groups among other requirements. We propose a new class of nonparametric test procedures that allow different observation processes. The new test statistics are constructed based on the integrated weighted differences between the estimated mean functions of the underlying recurrent event processes. The asymptotic distributions of the proposed test statistics are established and their finite-sample properties are examined through Monte Carlo simulations, which indicate that the proposed approach works well for practical situations. An illustrative example is provided.
Monte Carlo methods for nonparametric regression with heteroscedastic measurement error.
McIntyre, Julie; Johnson, Brent A; Rappaport, Stephen M
2017-09-15
Nonparametric regression is a fundamental problem in statistics but challenging when the independent variable is measured with error. Among the first approaches was an extension of deconvoluting kernel density estimators for homoscedastic measurement error. The main contribution of this article is to propose a new simulation-based nonparametric regression estimator for the heteroscedastic measurement error case. Similar to some earlier proposals, our estimator is built on principles underlying deconvoluting kernel density estimators. However, the proposed estimation procedure uses Monte Carlo methods for estimating nonlinear functions of a normal mean, which is different from any previous estimator. We show that the estimator has desirable operating characteristics in both large and small samples and apply the method to a study of benzene exposure in Chinese factory workers.
Nonparametric instrumental regression with non-convex constraints
NASA Astrophysics Data System (ADS)
Grasmair, M.; Scherzer, O.; Vanhems, A.
2013-03-01
This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.
Nonparametric inference procedures for multistate life table analysis.
Dow, M M
1985-01-01
Recent generalizations of the classical single-state life table procedures to the multistate case provide the means to analyze simultaneously the mobility and mortality experience of one or more cohorts. This paper examines fairly general nonparametric combinatorial matrix procedures, known as quadratic assignment, as a technique for analyzing the various transition patterns commonly generated by cohorts over the life cycle. To some degree, the output from a multistate life table analysis suggests inference procedures. In his discussion of multistate life table construction features, the author focuses on the matrix formulation of the problem. He then presents several examples of the proposed nonparametric procedures. Data for the mobility and life expectancies at birth matrices come from the 458-member Cayo Santiago rhesus monkey colony. The author's matrix combinatorial approach to hypothesis testing may prove to be a useful inferential strategy in several multidimensional demographic areas.
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator. The problem of choosing the kernel scaling factor based solely on a random sample is addressed. An interactive mode is discussed and an algorithm is proposed to choose the scaling factor automatically. The second nonparametric probability density estimator uses penalty function techniques with the maximum likelihood criterion. A discrete maximum penalized likelihood estimator is proposed and is shown to be consistent in the mean square error. A numerical implementation technique for the discrete solution is discussed and examples displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators. The robustness of the discrete estimator is demonstrated graphically.
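A hedged sketch of one automatic, data-driven choice of the kernel scaling factor: maximize the cross-validated log-likelihood over a bandwidth grid. This is a common modern criterion, not necessarily the interactive algorithm proposed in the paper.

```python
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(12)
x = np.r_[rng.normal(-2, 0.5, 150), rng.normal(1, 1.0, 150)].reshape(-1, 1)

search = GridSearchCV(KernelDensity(kernel="gaussian"),
                      {"bandwidth": np.geomspace(0.05, 2.0, 30)}, cv=5)
search.fit(x)  # scores each bandwidth by held-out log-likelihood
print("selected bandwidth:", round(search.best_params_["bandwidth"], 3))
```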
A Nonparametric Approach for Mapping Quantitative Trait Loci
Kruglyak, L.; Lander, E. S.
1995-01-01
Genetic mapping of quantitative trait loci (QTLs) is performed typically by using a parametric approach, based on the assumption that the phenotype follows a normal distribution. Many traits of interest, however, are not normally distributed. In this paper, we present a nonparametric approach to QTL mapping applicable to any phenotypic distribution. The method is based on a statistic Z(w), which generalizes the nonparametric Wilcoxon rank-sum test to the situation of whole-genome search by interval mapping. We determine the appropriate significance level for the statistic Z(w) by showing that its asymptotic null distribution follows an Ornstein-Uhlenbeck process. These results provide a robust, distribution-free method for mapping QTLs. PMID:7768449
Recent advances in nonparametric function estimation: Hydrologic applications
NASA Astrophysics Data System (ADS)
Lall, U.
1995-07-01
Nonparametric function estimation refers to methods that strive to approximate a target function locally, i.e., using data from a "small" neighborhood of the point of estimate. "Weak" assumptions, such as continuity of the target function and its differentiability to some order in the neighborhood, rather than an a priori assumption of the global form (e.g., linear or quadratic) of the entire target function are used. Traditionally, parametric assumptions (e.g., hydraulic conductivity is log normally distributed, floods follow a log Pearson III (LP3) distribution, annual stream flow is either log normal or gamma distributed, daily rainfall amounts are exponentially distributed, and the variograms of spatial hydrologic data follow a power law) have dominated statistical hydrologic estimation. Applications of nonparametric methods to some classical problems (frequency analysis, classification, spatial surface fitting, trend analysis, time series forecasting and simulation) of stochastic hydrology are reviewed.
Covariance matrices for use in criticality safety predictability studies
Derrien, H.; Larson, N.M.; Leal, L.C.
1997-09-01
Criticality predictability applications require as input the best available information on fissile and other nuclides. In recent years important work has been performed in the analysis of neutron transmission and cross-section data for fissile nuclei in the resonance region using the computer code SAMMY. The code uses Bayes' method (a form of generalized least squares) for sequential analyses of several sets of experimental data. Values for Reich-Moore resonance parameters, their covariances, and the derivatives with respect to the adjusted parameters (data sensitivities) are obtained. In general, the parameter file contains several thousand values and the dimension of the covariance matrices is correspondingly large. These matrices are not reported in the current evaluated data files due to their large dimensions and the inadequacy of the file formats. The present work has two goals: the first is to calculate the covariances of group-averaged cross sections from the covariance files generated by SAMMY, because these can be more readily utilized in criticality predictability calculations. The second goal is to propose a more practical interface between SAMMY and the evaluated files. Examples are given for ^235U in the popular 199- and 238-group structures, using the latest ORNL evaluation of the ^235U resonance parameters.
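The first goal amounts to first-order ("sandwich") uncertainty propagation: with C the resonance-parameter covariance from SAMMY and S the sensitivities of group-averaged cross sections to those parameters, the group covariance is V = S C S'. A toy-dimension sketch, with random stand-ins for the thousands of actual parameters:

```python
import numpy as np

rng = np.random.default_rng(13)
n_par, n_grp = 50, 8
A = rng.normal(size=(n_par, n_par))
C = A @ A.T / n_par                        # SPD parameter covariance (toy)
S = rng.normal(size=(n_grp, n_par)) * 1e-2 # group cross-section sensitivities

V = S @ C @ S.T                            # sandwich rule: V = S C S'
sd = np.sqrt(np.diag(V))
print("group uncertainties:", np.round(sd, 4))
print("correlation (1,2):  ", round(V[0, 1] / (sd[0] * sd[1]), 3))
```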
Nonparametric functional mapping of quantitative trait loci underlying programmed cell death.
Cui, Yuehua; Wu, Rongling; Casella, George; Zhu, Jun
2008-01-01
The development of an organism represents a complex dynamic process, which is controlled by a network of genes and multiple environmental factors. Programmed cell death (PCD), a physiological cell suicide process, occurs during the development of most organisms and is, typically, a complex dynamic trait. Understanding how genes control this complex developmental process has been a long-standing topic in PCD studies. In this article, we propose a nonparametric model, based on orthogonal Legendre polynomials, to map genes or quantitative trait loci (QTLs) that govern the dynamic features of the PCD process. The model is built under the maximum likelihood-based functional mapping framework and is implemented with the EM algorithm. A general information criterion is proposed for selecting the optimal Legendre order that best fits the dynamic pattern of the PCD process. The consistency of the order selection criterion is established. A nonstationary structured antedependence model (SAD) is applied to model the covariance structure among the phenotypes measured at different time points. The developed model generates a number of hypothesis tests regarding the genetic control mechanism of the PCD process. Extensive simulation studies are conducted to investigate the statistical behavior of the model. Finally, we apply the model to a rice tiller number data set in which several QTLs are identified. The developed model provides a quantitative and testable framework for assessing the interplay between genes and the developmental PCD process, and will have great implications for elucidating the genetic architecture of the PCD process.
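A hedged sketch of the model's building block: a developmental trajectory approximated by orthogonal Legendre polynomials, with the order screened by an information criterion, here a generic AIC rather than the paper's specific criterion; the trajectory is synthetic.

```python
import numpy as np
from numpy.polynomial import legendre as L

rng = np.random.default_rng(14)
t = np.linspace(-1, 1, 30)                 # time rescaled to [-1, 1]
y = np.exp(-2 * (t - 0.2) ** 2) + rng.normal(0, 0.05, t.size)  # PCD-like

for order in range(1, 7):
    c = L.legfit(t, y, order)              # orthogonal Legendre coefficients
    rss = np.sum((y - L.legval(t, c)) ** 2)
    aic = t.size * np.log(rss / t.size) + 2 * (order + 1)
    print(f"order {order}: AIC = {aic:6.1f}")
```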
Non-parametric estimation of gap time survival functions for ordered multivariate failure time data.
Schaubel, Douglas E; Cai, Jianwen
2004-06-30
Times between sequentially ordered events (gap times) are often of interest in biomedical studies. For example, in a cancer study, the gap times from incidence-to-remission and remission-to-recurrence may be examined. Such data are usually subject to right censoring, and within-subject failure times are generally not independent. Statistical challenges in the analysis of the second and subsequent gap times include induced dependent censoring and non-identifiability of the marginal distributions. We propose a non-parametric method for constructing one-sample estimators of conditional gap-time specific survival functions. The estimators are uniformly consistent and, upon standardization, converge weakly to a zero-mean Gaussian process, with a covariance function which can be consistently estimated. Simulation studies reveal that the asymptotic approximations are appropriate for finite samples. Methods for confidence bands are provided. The proposed methods are illustrated on a renal failure data set, where the probabilities of transplant wait-listing and kidney transplantation are of interest.
Weiß, Verena; Schmidt, Matthias; Hellmich, Martin
2015-01-01
Introduction: For survival data, the coefficient of determination cannot be used to describe how well a model fits the data. Therefore, several measures of explained variation for survival data have been proposed in recent years. Methods: We analyse an existing measure of explained variation with regard to minimisation aspects and demonstrate that these are not fulfilled for the measure. Results: In analogy to the least squares method from linear regression analysis, we develop a novel measure for categorical covariates which is based only on the Kaplan-Meier estimator. Hence, the novel measure is a completely nonparametric measure with an easy graphical interpretation. For the novel measure, different weighting possibilities are available and a statistical test of significance can be performed. Finally, we apply the novel measure and further measures of explained variation to a dataset comprising persons with a histopathological papillary thyroid carcinoma. Conclusion: We propose a novel measure of explained variation with a comprehensible derivation as well as a graphical interpretation, which may be used in further analyses with survival data. PMID:26550007
A Bayesian nonparametric method for prediction in EST analysis.
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-09-14
Expressed sequence tag (EST) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed with sequencing the library and, in the case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample.
Optimum nonparametric estimation of population density based on ordered distances
Patil, S.A.; Kovner, J.L.; Burnham, Kenneth P.
1982-01-01
The asymptotic mean and mean square error are determined for the nonparametric estimator of plant density by distance sampling proposed by Patil, Burnham and Kovner (1979, Biometrics 35, 597-604). On the basis of these formulae, a bias-reduced version of this estimator is given, and the specific form that minimises mean square error is determined under varying assumptions about the true probability density function of the sampled data. An extension to line-transect sampling is given.
Nonparametric estimation of Fisher information from real data
NASA Astrophysics Data System (ADS)
Har-Shemesh, Omri; Quax, Rick; Miñano, Borja; Hoekstra, Alfons G.; Sloot, Peter M. A.
2016-02-01
The Fisher information matrix (FIM) is a widely used measure for applications including statistical inference, information geometry, experiment design, and the study of criticality in biological systems. The FIM is defined for a parametric family of probability distributions and its estimation from data follows one of two paths: either the distribution is assumed to be known and the parameters are estimated from the data, or the parameters are known and the distribution is estimated from the data. We consider the latter case, which is applicable, for example, to experiments where the parameters are controlled by the experimenter and a complicated relation exists between the input parameters and the resulting distribution of the data. Since we assume that the distribution is unknown, we use a nonparametric density estimate of the data and then compute the FIM directly from that estimate using a finite-difference approximation to the derivatives in its definition. The accuracy of the estimate depends on both the method of nonparametric estimation and the difference Δθ between the densities used in the finite-difference formula. We develop an approach for choosing the optimal parameter difference Δθ based on large deviations theory and compare two nonparametric density estimation methods, the Gaussian kernel density estimator and a novel "density estimation using field theory" method. We also compare these two methods to a recently published approach that circumvents the need for density estimation by estimating a nonparametric f-divergence and using it to approximate the FIM. We use the Fisher information of the normal distribution to validate our method and, as a more involved example, we compute the temperature component of the FIM in the two-dimensional Ising model and show that it obeys the expected relation to the heat capacity and therefore peaks at the phase transition at the correct critical temperature.
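A one-dimensional sketch of the KDE-plus-finite-difference idea, exploiting the small-Δθ relation H² ≈ F(θ)Δθ²/8 between the squared Hellinger distance of nearby densities and the Fisher information; the normal-distribution validation target follows the paper, but the code is an illustration, not the authors' implementation:

    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(0)
    theta, dtheta = 0.0, 0.2                     # mean of a unit normal, and the offset

    # Draw samples at theta +/- dtheta/2 and estimate both densities nonparametrically.
    x1 = rng.normal(theta - dtheta / 2, 1.0, 20000)
    x2 = rng.normal(theta + dtheta / 2, 1.0, 20000)
    grid = np.linspace(-6, 6, 2001)
    p1, p2 = gaussian_kde(x1)(grid), gaussian_kde(x2)(grid)

    dx = grid[1] - grid[0]
    h2 = 0.5 * np.sum((np.sqrt(p1) - np.sqrt(p2)) ** 2) * dx   # squared Hellinger distance
    print(8.0 * h2 / dtheta ** 2)                # estimate; the true Fisher information is 1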
Parametric and nonparametric linkage analysis: A unified multipoint approach
Kruglyak, L.; Daly, M.J.; Reeve-Daly, M.P.; Lander, E.S.
1996-06-01
In complex disease studies, it is crucial to perform multipoint linkage analysis with many markers and to use robust nonparametric methods that take account of all pedigree information. Currently available methods fall short in both regards. In this paper, we describe how to extract complete multipoint inheritance information from general pedigrees of moderate size. This information is captured in the multipoint inheritance distribution, which provides a framework for a unified approach to both parametric and nonparametric methods of linkage analysis. Specifically, the approach includes the following: (1) Rapid exact computation of multipoint LOD scores involving dozens of highly polymorphic markers, even in the presence of loops and missing data. (2) Nonparametric linkage (NPL) analysis, a powerful new approach to pedigree analysis. We show that NPL is robust to uncertainty about mode of inheritance, is much more powerful than commonly used nonparametric methods, and loses little power relative to parametric linkage analysis. NPL thus appears to be the method of choice for pedigree studies of complex traits. (3) Information-content mapping, which measures the fraction of the total inheritance information extracted by the available marker data and points out the regions in which typing additional markers is most useful. (4) Maximum-likelihood reconstruction of many-marker haplotypes, even in pedigrees with missing data. We have implemented NPL analysis, LOD-score computation, information-content mapping, and haplotype reconstruction in a new computer package, GENEHUNTER. The package allows efficient multipoint analysis of pedigree data to be performed rapidly in a single user-friendly environment.
A Nonparametric Approach to Segmentation of Ladar Images
2012-12-01
…into distinct “phases” of imagery. The segmentation method is initialized using nonparametric probability density estimation. The resulting probability density is sliced piecewise into probability range bins, and the dominant object regions in each slice are traced and labeled. Plane fitting of each…
Covariate analysis of survival data: a small-sample study of Cox's model
Johnson, M.E.; Tolley, H.D.; Bryson, M.C.; Goldman, A.S.
1982-09-01
Cox's proportional-hazards model is frequently used to adjust for covariate effects in survival-data analysis. The small-sample performance of the maximum partial likelihood estimators of the regression parameters in a two-covariate hazard function model is evaluated with respect to bias, variance, and power in hypothesis tests. Previous Monte Carlo work on the two-sample problem is reviewed.
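For orientation, fitting a two-covariate proportional-hazards model by maximum partial likelihood is routine in modern software; the lifelines package and the toy data below are assumptions for illustration, not part of the study:

    import pandas as pd
    from lifelines import CoxPHFitter

    df = pd.DataFrame({
        "time":  [5, 8, 12, 3, 9, 14, 7, 2],      # follow-up times
        "event": [1, 0, 1, 1, 0, 1, 1, 1],        # 1 = failure observed, 0 = censored
        "x1":    [0, 1, 0, 1, 1, 0, 1, 0],
        "x2":    [2.1, 0.5, 1.7, 3.2, 0.9, 1.1, 2.8, 0.4],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="time", event_col="event")
    cph.print_summary()   # partial-likelihood estimates, standard errors, Wald tests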
Nonparametric Analysis of Bivariate Gap Time with Competing Risks
Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng
2016-01-01
Summary This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrent disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall’s tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article. PMID:26990686
Flexible variable selection for recovering sparsity in nonadditive nonparametric models.
Fang, Zaili; Kim, Inyoung; Schaumont, Patrick
2016-12-01
Variable selection for recovering sparsity in nonadditive and nonparametric models with high-dimensional variables has been challenging. This problem becomes even more difficult due to complications in modeling unknown interaction terms among high-dimensional variables. There is currently no variable selection method to overcome these limitations. Hence, in this article we propose a variable selection approach that is developed by connecting a kernel machine with the nonparametric regression model. The advantages of our approach are that it can: (i) recover the sparsity; (ii) automatically model unknown and complicated interactions; (iii) connect with several existing approaches including linear nonnegative garrote and multiple kernel learning; and (iv) provide flexibility for both additive and nonadditive nonparametric models. Our approach can be viewed as a nonlinear version of a nonnegative garrote method. We model the smoothing function by a Least Squares Kernel Machine (LSKM) and construct the nonnegative garrote objective function as the function of the sparse scale parameters of kernel machine to recover sparsity of input variables whose relevances to the response are measured by the scale parameters. We also provide the asymptotic properties of our approach. We show that sparsistency is satisfied with consistent initial kernel function coefficients under certain conditions. An efficient coordinate descent/backfitting algorithm is developed. A resampling procedure for our variable selection methodology is also proposed to improve the power. © 2016, The International Biometric Society.
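A stripped-down sketch of the linear nonnegative garrote that the proposal generalizes (least squares initial estimates and the nonnegativity constraint only; the kernel machinery, the penalty tuning, and the coordinate descent/backfitting algorithm of the paper are omitted):

    import numpy as np
    from scipy.optimize import nnls

    rng = np.random.default_rng(1)
    n, p = 200, 5
    X = rng.normal(size=(n, p))
    beta_true = np.array([2.0, 0.0, -1.5, 0.0, 0.0])
    y = X @ beta_true + rng.normal(size=n)

    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # initial estimates
    Z = X * beta_ols                                  # garrote design: column j is x_j * beta_j
    c, _ = nnls(Z, y)                                 # nonnegative scale factors
    print(np.round(c, 2))   # c_j near 0 flags x_j as irrelevant; relevant variables get c_j > 0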
Robust estimation for partially linear models with large-dimensional covariates.
Zhu, LiPing; Li, RunZe; Cui, HengJian
2013-10-01
We are concerned with robust estimation procedures for the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of [Formula: see text], where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates to be known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures.
Relativistic covariance of Ohm's law
NASA Astrophysics Data System (ADS)
Starke, R.; Schober, G. A. H.
2016-04-01
The derivation of Lorentz-covariant generalizations of Ohm's law has been a long-term issue in theoretical physics with deep implications for the study of relativistic effects in optical and atomic physics. In this article, we propose an alternative route to this problem, which is motivated by the tremendous progress in first-principles materials physics in general and ab initio electronic structure theory in particular. We start from the most general, Lorentz-covariant first-order response law, which is written in terms of the fundamental response tensor χμν relating induced four-currents to external four-potentials. By showing the equivalence of this description to Ohm's law, we prove the validity of Ohm's law in every inertial frame. We further use the universal relation between χμν and the microscopic conductivity tensor σkℓ to derive a fully relativistic transformation law for the latter, which includes all effects of anisotropy and relativistic retardation. In the special case of a constant, scalar conductivity, this transformation law can be used to rederive a standard textbook generalization of Ohm's law.
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.
Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W
2016-05-13
Attrition is a common occurrence in cluster randomised trials, and it leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and the baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanism and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group.
Schillebeeckx, P.; Becker, B.; Danon, Y.; Guber, K.; Harada, H.; Heyse, J.; Junghans, A.R.; Kopecky, S.; Massimi, C.; Moxon, M.C.; Otuka, N.; Sirakov, I.; Volev, K.
2012-12-15
Cross section data in the resolved and unresolved resonance region are represented by nuclear reaction formalisms using parameters which are determined by fitting them to experimental data. Therefore, the quality of evaluated cross sections in the resonance region strongly depends on the experimental data used in the adjustment process and an assessment of the experimental covariance data is of primary importance in determining the accuracy of evaluated cross section data. In this contribution, uncertainty components of experimental observables resulting from total and reaction cross section experiments are quantified by identifying the metrological parameters involved in the measurement, data reduction and analysis process. In addition, different methods that can be applied to propagate the covariance of the experimental observables (i.e. transmission and reaction yields) to the covariance of the resonance parameters are discussed and compared. The methods being discussed are: conventional uncertainty propagation, Monte Carlo sampling and marginalization. It is demonstrated that the final covariance matrix of the resonance parameters not only strongly depends on the type of experimental observables used in the adjustment process, the experimental conditions and the characteristics of the resonance structure, but also on the method that is used to propagate the covariances. Finally, a special data reduction concept and format is presented, which offers the possibility to store the full covariance information of experimental data in the EXFOR library and provides the information required to perform a full covariance evaluation.
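A generic sketch of the Monte Carlo sampling option (perturb the observables with their covariance, refit, and take the empirical covariance of the refitted parameters); the straight-line "model" stands in for a resonance-shape analysis and is purely illustrative:

    import numpy as np

    def mc_propagate(fit, y, cov_y, n_draws=500, seed=0):
        # Sample perturbed observables, refit each draw, return the
        # empirical covariance of the fitted parameters.
        rng = np.random.default_rng(seed)
        draws = rng.multivariate_normal(y, cov_y, size=n_draws)
        thetas = np.array([fit(d) for d in draws])
        return np.cov(thetas, rowvar=False)

    x = np.linspace(0.0, 1.0, 50)
    A = np.column_stack([np.ones_like(x), x])
    fit = lambda d: np.linalg.lstsq(A, d, rcond=None)[0]   # least squares "analysis"
    y_obs = A @ np.array([1.0, 2.0])                       # noise-free toy observable
    print(mc_propagate(fit, y_obs, 0.01 * np.eye(50)))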
To Covary or Not to Covary, That is the Question
NASA Astrophysics Data System (ADS)
Oehlert, A. M.; Swart, P. K.
2016-12-01
The meaning of covariation between the δ13C values of carbonate carbon and those of organic material is classically interpreted as reflecting original variations in the δ13C values of the dissolved inorganic carbon in the depositional environment. However, it has recently been shown, by the examination of a core from Great Bahama Bank (Clino), that during exposure not only do the rocks become altered, acquiring a negative δ13C value, but at the same time terrestrial vegetation adds organic carbon to the system, masking the original marine values. These processes yield a strong positive covariation between δ13Corg and δ13Ccar values even though the signals are clearly not original and are unrelated to the marine δ13C values. Examining the correlation between the organic and inorganic systems in a stratigraphic sense at Clino and in a second, more proximally located core (Unda), using a windowed correlation coefficient technique, reveals that the correlation is even more complex. Changes in the slope and magnitude of the correlation are associated with exposure surfaces, facies changes, dolomitized bodies, and non-depositional surfaces. Finally, other isotopic systems, such as the δ13C values of specific organic compounds as well as the δ15N values of bulk material and individual compounds, can provide additional information. In the case of δ15N values, decreases reflect changes in the influence of terrestrial organic material and an increased contribution of organic material from the platform surface, where the main source of nitrogen is derived from the activities of cyanobacteria.
COVARIANCE ASSISTED SCREENING AND ESTIMATION
Ke, By Tracy; Jin, Jiashun; Fan, Jianqing
2014-01-01
Consider a linear model Y = X β + z, where X = Xn,p and z ~ N(0, In). The vector β is unknown and it is of interest to separate its nonzero coordinates from the zero ones (i.e., variable selection). Motivated by examples in long-memory time series (Fan and Yao, 2003) and the change-point problem (Bhattacharya, 1994), we are primarily interested in the case where the Gram matrix G = X′X is non-sparse but sparsifiable by a finite order linear filter. We focus on the regime where signals are both rare and weak so that successful variable selection is very challenging but is still possible. We approach this problem by a new procedure called the Covariance Assisted Screening and Estimation (CASE). CASE first uses a linear filtering to reduce the original setting to a new regression model where the corresponding Gram (covariance) matrix is sparse. The new covariance matrix induces a sparse graph, which guides us to conduct multivariate screening without visiting all the submodels. By interacting with the signal sparsity, the graph enables us to decompose the original problem into many separated small-size subproblems (if only we know where they are!). Linear filtering also induces a so-called problem of information leakage, which can be overcome by the newly introduced patching technique. Together, these give rise to CASE, which is a two-stage Screen and Clean (Fan and Song, 2010; Wasserman and Roeder, 2009) procedure, where we first identify candidates of these submodels by patching and screening, and then re-examine each candidate to remove false positives. For any procedure β̂ for variable selection, we measure the performance by the minimax Hamming distance between the sign vectors of β̂ and β. We show that in a broad class of situations where the Gram matrix is non-sparse but sparsifiable, CASE achieves the optimal rate of convergence. The results are successfully applied to long-memory time series and the change-point model. PMID:25541567
Frailty models with missing covariates.
Herring, Amy H; Ibrahim, Joseph G; Lipsitz, Stuart R
2002-03-01
We present a method for estimating the parameters in random effects models for survival data when covariates are subject to missingness. Our method is more general than the usual frailty model as it accommodates a wide range of distributions for the random effects, which are included as an offset in the linear predictor in a manner analogous to that used in generalized linear mixed models. We propose using a Monte Carlo EM algorithm along with the Gibbs sampler to obtain parameter estimates. This method is useful in reducing the bias that may be incurred using complete-case methods in this setting. The methodology is applied to data from Eastern Cooperative Oncology Group melanoma clinical trials in which observations were believed to be clustered and several tumor characteristics were not always observed.
Covariant diagrams for one-loop matching
NASA Astrophysics Data System (ADS)
Zhang, Zhengkang
2017-05-01
We present a diagrammatic formulation of recently-revived covariant functional approaches to one-loop matching from an ultraviolet (UV) theory to a low-energy effective field theory. Various terms following from a covariant derivative expansion (CDE) are represented by diagrams which, unlike conventional Feynman diagrams, involve gauge-covariant quantities and are thus dubbed "covariant diagrams." The use of covariant diagrams helps organize and simplify one-loop matching calculations, which we illustrate with examples. Of particular interest is the derivation of UV model-independent universal results, which reduce matching calculations of specific UV models to applications of master formulas. We show how such derivation can be done in a more concise manner than the previous literature, and discuss how additional structures that are not directly captured by existing universal results, including mixed heavy-light loops, open covariant derivatives, and mixed statistics, can be easily accounted for.
Shrinkage approach for EEG covariance matrix estimation.
Beltrachini, Leandro; von Ellenrieder, Nicolas; Muravchik, Carlos H
2010-01-01
We present a shrinkage estimator for the EEG spatial covariance matrix of the background activity. We show that such an estimator has some advantages over the maximum likelihood and sample covariance estimators when the number of available data to carry out the estimation is low. We find sufficient conditions for the consistency of the shrinkage estimators and results concerning their numerical stability. We compare several shrinkage schemes and show how to improve the estimator by incorporating known structure of the covariance matrix.
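One widely used shrinkage scheme for this low-sample regime is the Ledoit-Wolf estimator; the sketch below, with scikit-learn as an assumed dependency and simulated "EEG" data, illustrates the generic idea rather than the specific estimator of the paper:

    import numpy as np
    from sklearn.covariance import LedoitWolf

    rng = np.random.default_rng(0)
    n_samples, n_channels = 40, 64               # few samples relative to the dimension
    X = rng.normal(size=(n_samples, n_channels))

    lw = LedoitWolf().fit(X)
    sample_cov = np.cov(X, rowvar=False)
    print(lw.shrinkage_)                         # data-driven shrinkage intensity
    # The shrunk estimate is well conditioned; the sample covariance is singular here.
    print(np.linalg.cond(sample_cov), np.linalg.cond(lw.covariance_))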
ANL Critical Assembly Covariance Matrix Generation - Addendum
McKnight, Richard D.; Grimm, Karl N.
2014-01-13
In March 2012, a report was issued on covariance matrices for Argonne National Laboratory (ANL) critical experiments. That report detailed the theory behind the calculation of covariance matrices and the methodology used to determine the matrices for a set of 33 ANL experimental set-ups. Since that time, three new experiments have been evaluated and approved. This report essentially updates the previous report by adding in these new experiments to the preceding covariance matrix structure.
Balancing continuous covariates based on Kernel densities.
Ma, Zhenjun; Hu, Feifang
2013-03-01
The balance of important baseline covariates is essential for convincing treatment comparisons. Stratified permuted block design and minimization are the two most commonly used balancing strategies, both of which require the covariates to be discrete. Continuous covariates are typically discretized in order to be included in the randomization scheme. But breaking continuous covariates into subcategories often changes the nature of the covariates and makes distributional balance unattainable. In this article, we propose to balance continuous covariates based on kernel density estimation, which preserves the continuity of the covariates. Simulation studies show that the proposed Kernel-Minimization can achieve distributional balance of both continuous and categorical covariates, while also keeping the group sizes well balanced. It is also shown that Kernel-Minimization is less predictable than stratified permuted block design and minimization. Finally, we apply the proposed method to redesign the NINDS trial, which has been a source of controversy due to imbalance of continuous baseline covariates. Simulation shows that imbalances such as those observed in the NINDS trial can generally be avoided through the implementation of the new method. Copyright © 2012 Elsevier Inc. All rights reserved.
Lorentz covariant κ-Minkowski spacetime
Dąbrowski, Ludwik; Godliński, Michał; Piacitelli, Gherardo
2010-06-15
In recent years, different views on the interpretation of Lorentz covariance of noncommuting coordinates have been discussed. By a general procedure, we construct the minimal canonical central covariantization of the κ-Minkowski spacetime. Here, undeformed Lorentz covariance is implemented by unitary operators, in the presence of two dimensionful parameters. We then show that, though the usual κ-Minkowski spacetime is covariant under deformed (or twisted) Lorentz action, the resulting framework is equivalent to taking a noncovariant restriction of the covariantized model. We conclude with some general comments on the approach of deformed covariance.
Understanding Past Population Dynamics: Bayesian Coalescent-Based Modeling with Covariates.
Gill, Mandev S; Lemey, Philippe; Bennett, Shannon N; Biek, Roman; Suchard, Marc A
2016-11-01
Effective population size characterizes the genetic variability in a population and is a parameter of paramount importance in population genetics and evolutionary biology. Kingman's coalescent process enables inference of past population dynamics directly from molecular sequence data, and researchers have developed a number of flexible coalescent-based models for Bayesian nonparametric estimation of the effective population size as a function of time. Major goals of demographic reconstruction include identifying driving factors of effective population size, and understanding the association between the effective population size and such factors. Building upon Bayesian nonparametric coalescent-based approaches, we introduce a flexible framework that incorporates time-varying covariates that exploit Gaussian Markov random fields to achieve temporal smoothing of effective population size trajectories. To approximate the posterior distribution, we adapt efficient Markov chain Monte Carlo algorithms designed for highly structured Gaussian models. Incorporating covariates into the demographic inference framework enables the modeling of associations between the effective population size and covariates while accounting for uncertainty in population histories. Furthermore, it can lead to more precise estimates of population dynamics. We apply our model to four examples. We reconstruct the demographic history of raccoon rabies in North America and find a significant association with the spatiotemporal spread of the outbreak. Next, we examine the effective population size trajectory of the DENV-4 virus in Puerto Rico along with viral isolate count data and find similar cyclic patterns. We compare the population history of the HIV-1 CRF02_AG clade in Cameroon with HIV incidence and prevalence data and find that the effective population size is more reflective of incidence rate. Finally, we explore the hypothesis that the population dynamics of musk ox during the Late
Non-parametric morphologies of mergers in the Illustris simulation
NASA Astrophysics Data System (ADS)
Bignone, L. A.; Tissera, P. B.; Sillero, E.; Pedrosa, S. E.; Pellizza, L. J.; Lambas, D. G.
2017-02-01
We study non-parametric morphologies of merger events in a cosmological context, using the Illustris project. We produce mock g-band images, comparable to observational surveys, from the publicly available idealized mock images of the Illustris simulation at z = 0. We then measure non-parametric indicators: asymmetry, Gini, M20, clumpiness, and concentration for a set of galaxies with M* > 10^10 M⊙. We correlate these automatic statistics with the recent merger history of galaxies and with the presence of close companions. Our main contribution is to assess, in a cosmological framework, the empirically derived non-parametric demarcation line and average time-scales used to determine the merger rate observationally. We find that 98 per cent of galaxies above the demarcation line have a close companion or have experienced a recent merger event. On average, merger signatures obtained from the G-M20 criterion anti-correlate clearly with the time elapsed since the last merger event. We also find that the asymmetry correlates with galaxy pair separation and relative velocity, exhibiting the largest enhancements for systems with pair separations d < 50 h^-1 kpc and relative velocities V < 350 km s^-1. We find that the G-M20 is most sensitive to recent mergers (∼0.14 Gyr) and to ongoing mergers with stellar mass ratios greater than 0.1. For this indicator, we compute a merger average observability time-scale of ∼0.2 Gyr, in agreement with previous results, and demonstrate that the morphologically derived merger rate recovers the intrinsic total merger rate of the simulation and the merger rate as a function of stellar mass.
a Multivariate Downscaling Model for Nonparametric Simulation of Daily Flows
NASA Astrophysics Data System (ADS)
Molina, J. M.; Ramirez, J. A.; Raff, D. A.
2011-12-01
A multivariate, stochastic nonparametric framework for stepwise disaggregation of seasonal runoff volumes to daily streamflow is presented. The downscaling process is conditional on volumes of spring runoff and large-scale ocean-atmosphere teleconnections and includes a two-level cascade scheme: seasonal-to-monthly disaggregation first, followed by monthly-to-daily disaggregation. The non-parametric and assumption-free character of the framework allows consideration of the random nature and nonlinearities of daily flows, which parametric models are unable to account for adequately. This paper examines statistical links between decadal/interannual climatic variations in the Pacific Ocean and hydrologic variability in the US northwest region, and includes a periodicity analysis of climate patterns to detect coherences of their cyclic behavior in the frequency domain. We explore the use of such relationships and selected signals (e.g., the north Pacific gyre oscillation, southern oscillation, and Pacific decadal oscillation indices: NPGO, SOI and PDO, respectively) in the proposed data-driven framework by means of a combinatorial approach, with the aim of simulating improved streamflow sequences compared with disaggregated series generated from flows alone. A nearest neighbor time series bootstrapping approach is integrated with principal component analysis to resample from the empirical multivariate distribution. A volume-dependent scaling transformation is implemented to guarantee the summability condition. In addition, we present a new and simple algorithm, based on nonparametric resampling, that overcomes the common limitation of lack of preservation of historical correlation between daily flows across months. The downscaling framework presented here is parsimonious in parameters and model assumptions, does not generate negative values, and produces synthetic series that are statistically indistinguishable from the observations. We present evidence showing that both
Non-parametric estimators of a monotonic dose-response curve and bootstrap confidence intervals.
Dilleen, Maria; Heimann, Günter; Hirsch, Ian
2003-03-30
In this paper we consider study designs which include a placebo and an active control group as well as several dose groups of a new drug. A monotonically increasing dose-response function is assumed, and the objective is to estimate a dose with equivalent response to the active control group, including a confidence interval for this dose. We present different non-parametric methods to estimate the monotonic dose-response curve. These are derived from the isotonic regression estimator, a non-negative least squares estimator, and a bias adjusted non-negative least squares estimator using linear interpolation. The different confidence intervals are based upon an approach described by Korn, and upon two different bootstrap approaches. One of these bootstrap approaches is standard, and the second ensures that resampling is done from empiric distributions which comply with the order restrictions imposed. In our simulations we did not find any differences between the two bootstrap methods, and both clearly outperform Korn's confidence intervals. The non-negative least squares estimator yields biased results for moderate sample sizes. The bias adjustment for this estimator works well, even for small and moderate sample sizes, and surprisingly outperforms the isotonic regression method in certain situations.
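A minimal sketch of the isotonic-regression estimate with a standard (order-unaware) bootstrap for a pointwise confidence interval; the dose grid and response model are invented for illustration:

    import numpy as np
    from sklearn.isotonic import IsotonicRegression

    rng = np.random.default_rng(2)
    dose = np.repeat([0.0, 1.0, 2.0, 4.0, 8.0], 20)
    resp = 1.0 + 0.3 * np.log1p(dose) + rng.normal(0, 0.2, dose.size)

    iso = IsotonicRegression(increasing=True)
    fitted = iso.fit_transform(dose, resp)            # monotone dose-response estimate

    # Standard bootstrap for a pointwise 95% CI of the fitted mean at dose = 4.
    boot = []
    for _ in range(1000):
        idx = rng.integers(0, dose.size, dose.size)
        f = IsotonicRegression(increasing=True).fit(dose[idx], resp[idx])
        boot.append(f.predict([4.0])[0])
    print(np.percentile(boot, [2.5, 97.5]))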
Mixed LICORS: A Nonparametric Algorithm for Predictive State Reconstruction
Goerg, Georg M.; Shalizi, Cosma Rohilla
2015-01-01
We introduce mixed LICORS, an algorithm for learning nonlinear, high-dimensional dynamics from spatio-temporal data, suitable for both prediction and simulation. Mixed LICORS extends the recent LICORS algorithm (Goerg and Shalizi, 2012) from hard clustering of predictive distributions to a non-parametric, EM-like soft clustering. This retains the asymptotic predictive optimality of LICORS, but, as we show in simulations, greatly improves out-of-sample forecasts with limited data. The new method is implemented in the publicly-available R package LICORS. PMID:26279743
Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure
Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas
2015-01-01
Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
Computation of nonparametric convex hazard estimators via profile methods
Jankowski, Hanna K.; Wellner, Jon A.
2010-01-01
This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females. PMID:20300560
Reading a research article part II: parametric and nonparametric statistics.
Oliver, Dana; Mahon, Suzanne M
2005-04-01
Researchers often try to use a randomization technique in an attempt to reduce bias and ensure that treatment and control groups are as similar as possible. This article has provided an overview of how researchers might use parametric and nonparametric statistics when analyzing data and looking for differences between groups. Researchers must consider the types of data and choose the tests that are appropriate for the variable types to draw appropriate conclusions. The next article in this series will address comparison of more than two groups and repeated measures and other design issues.
Nonparametric maximum likelihood estimation for the multisample Wicksell corpuscle problem
Chan, Kwun Chuen Gary; Qin, Jing
2016-01-01
We study nonparametric maximum likelihood estimation for the distribution of spherical radii using samples containing a mixture of one-dimensional, two-dimensional biased and three-dimensional unbiased observations. Since direct maximization of the likelihood function is intractable, we propose an expectation-maximization algorithm for implementing the estimator, which handles an indirect measurement problem and a sampling bias problem separately in the E- and M-steps, and circumvents the need to solve an Abel-type integral equation, which creates numerical instability in the one-sample problem. Extensions to ellipsoids are studied and connections to multiplicative censoring are discussed. PMID:27279657
NASA Astrophysics Data System (ADS)
Shirasaki, Masato; Takada, Masahiro; Miyatake, Hironao; Takahashi, Ryuichi; Hamana, Takashi; Nishimichi, Takahiro; Murata, Ryoma
2017-09-01
We develop a method to simulate galaxy-galaxy weak lensing by utilizing all-sky, light-cone simulations and their inherent halo catalogues. Using the mock catalogue to study the error covariance matrix of galaxy-galaxy weak lensing, we compare the full covariance with the 'jackknife' (JK) covariance, the method often used in the literature that estimates the covariance from resamples of the data itself. We show that the JK covariance varies over realizations of mock lensing measurements, while the average JK covariance over mocks can give a reasonably accurate estimate of the true covariance up to separations comparable with the size of the JK subregion. The scatter in JK covariances is found to be ∼10 per cent after we subtract the lensing measurement around random points. However, the JK method tends to underestimate the covariance at larger separations, increasingly so for a survey with a higher number density of source galaxies. We apply our method to the Sloan Digital Sky Survey (SDSS) data, and show that the 48 mock SDSS catalogues nicely reproduce the signals and the JK covariance measured from the real data. We then argue that the use of the accurate covariance, compared to the JK covariance, allows us to use the lensing signals at large scales beyond the size of the JK subregion, which contain cleaner cosmological information in the linear regime.
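The delete-one jackknife covariance referred to here can be sketched in a few lines; the (n-1)/n normalization is the standard jackknife factor, and the random "signals" are placeholders for per-subregion lensing measurements:

    import numpy as np

    def jackknife_covariance(signals):
        # signals: (n_sub, n_bins) array; the leave-one-out measurement is
        # approximated by the mean over the remaining subregions.
        n_sub = signals.shape[0]
        loo = np.array([np.delete(signals, i, axis=0).mean(axis=0) for i in range(n_sub)])
        diff = loo - loo.mean(axis=0)
        return (n_sub - 1) * diff.T @ diff / n_sub

    rng = np.random.default_rng(3)
    sig = rng.normal(size=(48, 10))        # e.g. 48 subregions, 10 radial bins
    print(jackknife_covariance(sig).shape) # (10, 10)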
Group Theory of Covariant Harmonic Oscillators
ERIC Educational Resources Information Center
Kim, Y. S.; Noz, Marilyn E.
1978-01-01
A simple and concrete example for illustrating the properties of noncompact groups is presented. The example is based on the covariant harmonic-oscillator formalism in which the relativistic wave functions carry a covariant-probability interpretation. This can be used in a group theory course for graduate students who have some background in…
Quality Quantification of Evaluated Cross Section Covariances
Varet, S.; Dossantos-Uzarralde, P.
2015-01-15
Presently, several methods are used to estimate the covariance matrix of evaluated nuclear cross sections. Because the resulting covariance matrices can differ according to the method used and to the assumptions of the method, we propose a general and objective approach to quantify the quality of the covariance estimation for evaluated cross sections. The first step consists in defining an objective criterion. The second step is computation of the criterion. In this paper the Kullback-Leibler distance is proposed for the quality quantification of a covariance matrix estimation and its inverse. It is based on the distance to the true covariance matrix. A method based on the bootstrap is presented for the estimation of this criterion, which can be applied with most methods for covariance matrix estimation and without knowledge of the true covariance matrix. The full approach is illustrated on the ⁸⁵Rb nucleus evaluations, and the results are then used for a discussion on scoring and Monte Carlo approaches for covariance matrix estimation of the cross section evaluations.
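Assuming Gaussian models, the Kullback-Leibler distance between two covariance matrices has the familiar closed form used below; this is a generic sketch of the criterion only, not the paper's bootstrap machinery:

    import numpy as np

    def kl_gaussian(cov_p, cov_q):
        # KL( N(0, cov_p) || N(0, cov_q) )
        #   = 0.5 * ( tr(cov_q^{-1} cov_p) - k + ln det cov_q - ln det cov_p )
        k = cov_p.shape[0]
        _, ld_p = np.linalg.slogdet(cov_p)
        _, ld_q = np.linalg.slogdet(cov_q)
        return 0.5 * (np.trace(np.linalg.solve(cov_q, cov_p)) - k + ld_q - ld_p)

    A = np.array([[1.0, 0.2], [0.2, 1.0]])
    B = np.array([[1.1, 0.1], [0.1, 0.9]])
    print(kl_gaussian(A, B))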
Covariance Structure Analysis of Ordinal Ipsative Data.
ERIC Educational Resources Information Center
Chan, Wai; Bentler, Peter M.
1998-01-01
Proposes a two-stage estimation method for the analysis of covariance structure models with ordinal ipsative data (OID). A goodness-of-fit statistic is given for testing the hypothesized covariance structure matrix, and simulation results show that the method works well with a large sample. (SLD)
Position Error Covariance Matrix Validation and Correction
NASA Technical Reports Server (NTRS)
Frisbee, Joe, Jr.
2016-01-01
In order to calculate operationally accurate collision probabilities, the position error covariance matrices predicted at times of closest approach must be sufficiently accurate representations of the position uncertainties. This presentation will discuss why the Gaussian distribution is a reasonable expectation for the position uncertainty and how this assumed distribution type is used in the validation and correction of position error covariance matrices.
Bayesian nonparametric centered random effects models with variable selection.
Yang, Mingan
2013-03-01
In a linear mixed effects model, it is common practice to assume that the random effects follow a parametric distribution such as a normal distribution with mean zero. However, in the case of variable selection, substantial violation of the normality assumption can potentially impact the subset selection and result in poor interpretation and even incorrect results. In nonparametric random effects models, the random effects generally have a nonzero mean, which causes an identifiability problem for the fixed effects that are paired with the random effects. In this article, we focus on a Bayesian method for variable selection. We characterize the subject-specific random effects nonparametrically with a Dirichlet process and resolve the bias simultaneously. In particular, we propose flexible modeling of the conditional distribution of the random effects with changes across the predictor space. The approach is implemented using a stochastic search Gibbs sampler to identify subsets of fixed effects and random effects to be included in the model. Simulations are provided to evaluate and compare the performance of our approach to the existing ones. We then apply the new approach to a real data example, cross-country and interlaboratory rodent uterotrophic bioassay.
Stochastic Earthquake Rupture Modeling Using Nonparametric Co-Regionalization
NASA Astrophysics Data System (ADS)
Lee, Kyungbook; Song, Seok Goo
2016-10-01
Accurate predictions of the intensity and variability of ground motions are essential in simulation-based seismic hazard assessment. Advanced simulation-based ground motion prediction methods have been proposed to complement the empirical approach, which suffers from the lack of observed ground motion data, especially in the near-source region for large events. It is important to quantify the variability of the earthquake rupture process for future events and to produce a number of rupture scenario models to capture the variability in simulation-based ground motion predictions. In this study, we improved a previously developed stochastic earthquake rupture modeling method by applying nonparametric co-regionalization, proposed in geostatistics, to the correlation models estimated from dynamically derived earthquake rupture models. The nonparametric approach adopted in this study is computationally efficient and therefore enables us to simulate numerous rupture scenarios, including large events (M > 7.0). It also gives us an opportunity to check the shape of the true input correlation models in stochastic modeling after they are deformed for permissibility. We expect that this type of modeling will improve our ability to simulate a wide range of rupture scenario models and thereby predict ground motions and perform seismic hazard assessment more accurately.
Bayesian nonparametric dictionary learning for compressed sensing MRI.
Huang, Yue; Paisley, John; Lin, Qin; Ding, Xinghao; Fu, Xueyang; Zhang, Xiao-Ping
2014-12-01
We develop a Bayesian nonparametric model for reconstructing magnetic resonance images (MRIs) from highly undersampled k-space data. We perform dictionary learning as part of the image reconstruction process. To this end, we use the beta process as a nonparametric dictionary learning prior for representing an image patch as a sparse combination of dictionary elements. The size of the dictionary and the patch-specific sparsity pattern are inferred from the data, in addition to other dictionary learning variables. Dictionary learning is performed directly on the compressed image, and so is tailored to the MRI being considered. In addition, we investigate a total variation penalty term in combination with the dictionary learning model, and show how the denoising property of dictionary learning removes dependence on regularization parameters in the noisy setting. We derive a stochastic optimization algorithm based on Markov chain Monte Carlo for the Bayesian model, and use the alternating direction method of multipliers for efficiently performing total variation minimization. We present empirical results on several MRIs, which show that the proposed regularization framework can improve reconstruction accuracy over other methods.
A comparative study of nonparametric methods for pattern recognition
NASA Technical Reports Server (NTRS)
Hahn, S. F.; Nelson, G. D.
1972-01-01
The applied research discussed in this report determines and compares the correct classification percentages of the nonparametric sign test, Wilcoxon's signed rank test, and the K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is shown graphically for differences in modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even though they may not be. The K-class classifier has the advantage over the Bayes classifier in that it works well with non-Gaussian data without having to determine the probability density function of the data. It should be noted that the data in this experiment were always unimodal.
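The two nonparametric tests named here are available off the shelf; a minimal example on invented data (the report's classifiers built on these statistics are not reproduced):

    import numpy as np
    from scipy.stats import wilcoxon, binomtest

    rng = np.random.default_rng(4)
    x = rng.laplace(0.5, 1.0, 16)                 # sixteen samples of shifted Laplacian data

    # Sign test: count positive signs against Binomial(n, 0.5).
    n_pos = int(np.sum(x > 0))
    print(binomtest(n_pos, n=x.size, p=0.5).pvalue)

    # Wilcoxon signed-rank test of a zero median.
    print(wilcoxon(x).pvalue)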
A High-Dimensional Nonparametric Multivariate Test for Mean Vector
Wang, Lan; Peng, Bo; Li, Runze
2015-01-01
This work is concerned with testing the population mean vector of nonnormal high-dimensional multivariate data. Several tests for high-dimensional mean vector, based on modifying the classical Hotelling T2 test, have been proposed in the literature. Despite their usefulness, they tend to have unsatisfactory power performance for heavy-tailed multivariate data, which frequently arise in genomics and quantitative finance. This paper proposes a novel high-dimensional nonparametric test for the population mean vector for a general class of multivariate distributions. With the aid of new tools in modern probability theory, we proved that the limiting null distribution of the proposed test is normal under mild conditions when p is substantially larger than n. We further study the local power of the proposed test and compare its relative efficiency with a modified Hotelling T2 test for high-dimensional data. An interesting finding is that the newly proposed test can have even more substantial power gain with large p than the traditional nonparametric multivariate test does with finite fixed p. We study the finite sample performance of the proposed test via Monte Carlo simulations. We further illustrate its application by an empirical analysis of a genomics data set. PMID:26848205
Adjoints and Low-rank Covariance Representation
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; Cohn, Stephen E.
2000-01-01
Quantitative measures of the uncertainty of Earth System estimates can be as important as the estimates themselves. Second moments of estimation errors are described by the covariance matrix, whose direct calculation is impractical when the number of degrees of freedom of the system state is large. Ensemble and reduced-state approaches to prediction and data assimilation replace full estimation error covariance matrices by low-rank approximations. The appropriateness of such approximations depends on the spectrum of the full error covariance matrix, whose calculation is also often impractical. Here we examine the situation where the error covariance is a linear transformation of a forcing error covariance. We use operator norms and adjoints to relate the appropriateness of low-rank representations to the conditioning of this transformation. The analysis is used to investigate low-rank representations of the steady-state response to random forcing of an idealized discrete-time dynamical system.
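A small numerical illustration of the dependence on the spectrum: when the error covariance is dominated by a few modes, a truncated eigendecomposition represents it well. The near-rank-3 construction is invented:

    import numpy as np

    def low_rank_approx(cov, k):
        # Best rank-k approximation in the 2-norm: keep the k leading eigenpairs.
        vals, vecs = np.linalg.eigh(cov)                  # ascending eigenvalues
        idx = np.argsort(vals)[::-1][:k]
        return vecs[:, idx] @ np.diag(vals[idx]) @ vecs[:, idx].T

    rng = np.random.default_rng(5)
    L = rng.normal(size=(50, 3))
    cov = L @ L.T + 0.01 * np.eye(50)                     # nearly rank-3 covariance
    print(np.linalg.norm(cov - low_rank_approx(cov, 3), 2))  # small residual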
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Scripts for computing nonparametric regression analyses: an overview of using scripts to infer environmental conditions from biological observations and to statistically estimate species-environment relationships.
On the ambiguity of interaction and nonlinear main effects in a regime of dependent covariates.
Matuschek, Hannes; Kliegl, Reinhold
2017-09-15
The analysis of large experimental datasets frequently reveals significant interactions that are difficult to interpret within the theoretical framework guiding the research. Some of these interactions actually arise from the presence of unspecified nonlinear main effects and statistically dependent covariates in the statistical model. Importantly, such nonlinear main effects may be compatible (or, at least, not incompatible) with the current theoretical framework. In the present literature, this issue has only been studied in terms of correlated (linearly dependent) covariates. Here we generalize to nonlinear main effects (i.e., main effects of arbitrary shape) and dependent covariates. We propose a novel nonparametric method to test for ambiguous interactions where present parametric methods fail. We illustrate the method with a set of simulations and with reanalyses (a) of effects of parental education on their children's educational expectations and (b) of effects of word properties on fixation locations during reading of natural sentences, specifically of effects of length and morphological complexity of the word to be fixated next. The resolution of such ambiguities facilitates theoretical progress.
Johnson, H.O.; Gupta, S.C.; Vecchia, A.V.; Zvomuya, F.
2009-01-01
Excessive loading of sediment and nutrients to rivers is a major problem in many parts of the United States. In this study, we tested the non-parametric Seasonal Kendall (SEAKEN) trend model and the parametric USGS Quality of Water trend program (QWTREND) to quantify trends in water quality of the Minnesota River at Fort Snelling from 1976 to 2003. Both methods indicated decreasing trends in flow-adjusted concentrations of total suspended solids (TSS), total phosphorus (TP), and orthophosphorus (OP) and a generally increasing trend in flow-adjusted nitrate plus nitrite-nitrogen (NO3-N) concentration. The SEAKEN results were strongly influenced by the length of the record as well as extreme years (dry or wet) earlier in the record. The QWTREND results, though influenced somewhat by the same factors, were more stable. The magnitudes of trends between the two methods were somewhat different and appeared to be associated with conceptual differences between the flow-adjustment processes used and with data processing methods. The decreasing trends in TSS, TP, and OP concentrations are likely related to conservation measures implemented in the basin. However, dilution effects from wet climate or additional tile drainage cannot be ruled out. The increasing trend in NO3-N concentrations was likely due to increased drainage in the basin. Since the Minnesota River is the main source of sediments to the Mississippi River, this study also addressed the rapid filling of Lake Pepin on the Mississippi River and found the likely cause to be increased flow due to recent wet climate in the region. Copyright © 2009 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved.
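A bare-bones sketch of the Seasonal Kendall idea (sum the Mann-Kendall S statistic over seasons and normalize); SEAKEN's tie, serial-correlation, and flow-adjustment refinements are omitted, and the data are simulated:

    import numpy as np
    from scipy.stats import norm

    def seasonal_kendall(values, seasons):
        # Sum within-season Mann-Kendall S statistics and their variances.
        S, var = 0.0, 0.0
        for s in np.unique(seasons):
            x = values[seasons == s]                  # one season, in time order
            n = len(x)
            S += sum(np.sign(x[j] - x[i]) for i in range(n) for j in range(i + 1, n))
            var += n * (n - 1) * (2 * n + 5) / 18.0
        z = (S - np.sign(S)) / np.sqrt(var)           # continuity correction
        return 2 * norm.sf(abs(z))                    # two-sided p-value

    rng = np.random.default_rng(6)
    years = np.arange(20)
    vals = np.concatenate([0.1 * years + rng.normal(0, 0.3, 20) for _ in range(4)])
    seas = np.repeat(np.arange(4), 20)                # four seasons, 20 years each
    print(seasonal_kendall(vals, seas))               # small p: an increasing trend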
Concordance between criteria for covariate model building.
Hennig, Stefanie; Karlsson, Mats O
2014-04-01
When performing a population pharmacokinetic modelling analysis, covariates are often added to the model. Such additions are often justified by improved goodness of fit and/or a decrease in unexplained (random) parameter variability. Increased goodness of fit is most commonly measured by the decrease in the objective function value. Parameter variability can be defined as the sum of unexplained (random) and explained (predictable) variability. An increase in the magnitude of explained parameter variability could be another possible criterion for judging improvement in the model. We explored the agreement between these three criteria in diagnosing covariate-parameter relationships of different strengths and nature, using stochastic simulations and estimations as well as four previously published real data examples. Total estimated parameter variability was found to vary with the number of covariates introduced on the parameter. In the simulated examples and two real examples, the parameter variability increased with increasing number of included covariates. For the other real examples parameter variability decreased or did not change systematically with the addition of covariates. The three criteria were highly correlated, with the decrease in unexplained variability being more closely associated with changes in objective function values than increases in explained parameter variability were. The often used assumption that inclusion of covariates in models only shifts unexplained parameter variability to explained parameter variability appears not to be true, which may have implications for modelling decisions.
Li, L; Kleinman, K; Gillman, M W
2014-12-01
We implemented six confounding adjustment methods: (1) covariate-adjusted regression, (2) propensity score (PS) regression, (3) PS stratification, (4) PS matching with two calipers, (5) inverse probability weighting and (6) doubly robust estimation to examine the associations between the body mass index (BMI) z-score at 3 years and two separate dichotomous exposure measures: exclusive breastfeeding v. formula only (n=437) and cesarean section v. vaginal delivery (n=1236). Data were drawn from a prospective pre-birth cohort study, Project Viva. The goal is to demonstrate the necessity and usefulness of, and approaches for using, multiple confounding adjustment methods to analyze observational data. Unadjusted (univariate) and covariate-adjusted linear regression associations of breastfeeding with BMI z-score were -0.33 (95% CI -0.53, -0.13) and -0.24 (-0.46, -0.02), respectively. The other approaches resulted in smaller n (204-276) because of poor overlap of covariates, but CIs were of similar width except for inverse probability weighting (75% wider) and PS matching with a wider caliper (76% wider). Point estimates ranged widely, however, from -0.01 to -0.38. For cesarean section, because of better covariate overlap, the covariate-adjusted regression estimate (0.20) was remarkably robust to all adjustment methods, and the widths of the 95% CIs differed less than in the breastfeeding example. Choice of covariate adjustment method can matter. Lack of overlap in covariate structure between exposed and unexposed participants in observational studies can lead to erroneous covariate-adjusted estimates and confidence intervals. We recommend inspecting covariate overlap and using multiple confounding adjustment methods. Similar results bring reassurance. Contradictory results suggest issues with either the data or the analytic method.
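A minimal illustration of two of these approaches (the naive contrast versus propensity-score-based inverse probability weighting) on simulated data with a known treatment effect; the data-generating values below are hypothetical and are not Project Viva estimates:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=(n, 3))                                 # confounders
p_treat = 1 / (1 + np.exp(-(x @ np.array([0.8, -0.5, 0.3]))))  # true PS
t = rng.binomial(1, p_treat)
y = -0.25 * t + x @ np.array([0.5, 0.5, 0.5]) + rng.normal(size=n)  # true effect -0.25

# Estimate the propensity score, then reweight by its inverse.
ps = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
print(ps.min(), ps.max())                                   # inspect covariate overlap
naive = y[t == 1].mean() - y[t == 0].mean()                 # confounded contrast
ipw = np.mean(t * y / ps) - np.mean((1 - t) * y / (1 - ps))
print(f"naive {naive:.2f}, IPW {ipw:.2f}")                  # IPW recovers about -0.25
```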
Combined Use of Integral Experiments and Covariance Data
NASA Astrophysics Data System (ADS)
Palmiotti, G.; Salvatores, M.; Aliberti, G.; Herman, M.; Hoblit, S. D.; McKnight, R. D.; Obložinský, P.; Talou, P.; Hale, G. M.; Hiruta, H.; Kawano, T.; Mattoon, C. M.; Nobre, G. P. A.; Palumbo, A.; Pigni, M.; Rising, M. E.; Yang, W.-S.; Kahler, A. C.
2014-04-01
In the frame of a US-DOE sponsored project, ANL, BNL, INL and LANL have performed a joint multidisciplinary research activity to explore the combined use of integral experiments and covariance data, with the objective both of giving quantitative indications of possible improvements to the ENDF evaluated data files and of reducing crucial reactor design parameter uncertainties. Methods that have been developed over the last four decades for these purposes have been improved by new developments that also benefited from continuous exchanges with international groups working in similar areas. The major new developments that allowed significant progress are found in several specific domains: a) new science-based covariance data; b) integral experiment covariance data assessment and improved experiment analysis, e.g., of sample irradiation experiments; c) sensitivity analysis, where several improvements were necessary despite the generally good understanding of these techniques, e.g., to account for fission spectrum sensitivity; d) a critical approach to the analysis of statistical adjustment performance, both a priori and a posteriori; e) generalization of the assimilation method, now applied for the first time not only to multigroup cross-section data but also to nuclear model parameters (the "consistent" method). This article describes the major results obtained in each of these areas; a large-scale nuclear data adjustment, based on approximately one hundred high-accuracy integral experiments, is reported along with a significant example of the application of the new "consistent" method of data assimilation.
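At its core, the adjustment machinery referred to here is a generalized-least-squares update of prior parameter estimates against integral measurements. A minimal sketch, with hypothetical sensitivities and covariances standing in for real evaluated data:

```python
import numpy as np

# Prior parameter estimates x0 with covariance P (nuclear-data analog:
# cross sections or model parameters), integral experiment values yE with
# covariance V, and a sensitivity matrix S such that y is approximately S x.
x0 = np.array([1.00, 2.00, 0.50])
P = np.diag([0.05, 0.10, 0.02]) ** 2
S = np.array([[1.0, 0.5, 0.0],
              [0.2, 1.0, 0.3]])
yE = np.array([2.10, 2.55])                  # measured integral responses
V = np.diag([0.02, 0.03]) ** 2

# GLLS update: shift parameters toward the experiments and shrink P.
G = P @ S.T @ np.linalg.inv(S @ P @ S.T + V)
x_adj = x0 + G @ (yE - S @ x0)
P_adj = P - G @ S @ P
print(x_adj, np.sqrt(np.diag(P_adj)))        # adjusted values, reduced uncertainties
```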
Reliability-based covariance control design
Field, R.V. Jr.; Bergman, L.A.
1997-03-01
An extension to classical covariance control methods, introduced by Skelton and co-workers, is proposed specifically for application to the control of civil engineering structures subjected to random dynamic excitations. The covariance structure of the system is developed directly from specification of its reliability via the assumption of independent (Poisson) outcrossings of its stationary response process from a polyhedral safe region. This leads to a set of state covariance controllers, each of which guarantees that the closed-loop system will possess the specified level of reliability. An example civil engineering structure is considered.
Conformal Covariance and the Split Property
NASA Astrophysics Data System (ADS)
Morinelli, Vincenzo; Tanimoto, Yoh; Weiner, Mihály
2017-08-01
We show that for a conformal local net of observables on the circle, the split property is automatic. Both full conformal covariance (i.e., diffeomorphism covariance) and the circle-setting play essential roles in this fact, while by previously constructed examples it was already known that even on the circle, Möbius covariance does not imply the split property. On the other hand, here we also provide an example of a local conformal net living on the 2-dimensional Minkowski space, which—although being diffeomorphism covariant—does not have the split property.
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected
Petersen estimator, Chapman adjustment, list effects, and heterogeneity.
Mao, Chang Xuan; Huang, Ruochen; Zhang, Sijia
2017-03-01
We use a nonparametric mixture model for the purpose of estimating the size of a population from multiple lists in which both the individual effects and list effects are allowed to vary. We propose a lower bound of the population size that admits an analytic expression. The lower bound can be estimated without the necessity of model-fitting. The asymptotic normality of the estimator is established. Both the estimator itself and the estimator of the estimable bound of its variance are adjusted; these adjusted versions are shown to be unbiased in the limit. Simulation experiments are performed to assess the proposed approach, and real applications are studied.
Nonparametric estimation of stochastic differential equations with sparse Gaussian processes
NASA Astrophysics Data System (ADS)
García, Constantino A.; Otero, Abraham; Félix, Paulo; Presedo, Jesús; Márquez, David G.
2017-08-01
The application of stochastic differential equations (SDEs) to the analysis of temporal data has attracted increasing attention, due to their ability to describe complex dynamics with physically interpretable equations. In this paper, we introduce a nonparametric method for estimating the drift and diffusion terms of SDEs from a densely observed discrete time series. The use of Gaussian processes as priors permits working directly in function space, where the inference takes place. To cope with the computational complexity that the use of Gaussian processes entails, a sparse Gaussian process approximation is provided. This approximation permits the efficient computation of predictions for the drift and diffusion terms by using a distribution over a small subset of pseudosamples. The proposed method has been validated using both simulated data and real data from economics and paleoclimatology. The application of the method to real data demonstrates its ability to capture the behavior of complex systems.
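For intuition about what estimating drift and diffusion from a densely observed series involves, a much cruder binned (Kramers-Moyal) estimator, not the paper's sparse-GP method, can be sketched on a simulated Ornstein-Uhlenbeck path:

```python
import numpy as np

rng = np.random.default_rng(0)
dt, n = 1e-3, 400_000
noise = 0.8 * np.sqrt(dt) * rng.normal(size=n - 1)
x = np.empty(n); x[0] = 0.0
for t in range(n - 1):                    # Ornstein-Uhlenbeck: dx = -2x dt + 0.8 dW
    x[t + 1] = x[t] - 2.0 * x[t] * dt + noise[t]

# Conditional moments of the increments recover drift and squared diffusion.
dx = np.diff(x)
bins = np.linspace(-1, 1, 11)
centers = 0.5 * (bins[:-1] + bins[1:])
idx = np.digitize(x[:-1], bins)
for b in range(1, len(bins)):
    m = idx == b
    if m.sum() > 500:
        drift = dx[m].mean() / dt         # should approximate -2 * x
        diff2 = dx[m].var() / dt          # should approximate 0.8**2 = 0.64
        print(f"x~{centers[b-1]:+.1f}: drift {drift:+.2f}, diffusion^2 {diff2:.2f}")
```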
Analyzing Information Flow in Brain Networks with Nonparametric Granger Causality
Dhamala, Mukeshwar; Rangarajan, Govindan; Ding, Mingzhou
2009-01-01
Multielectrode neurophysiological recording and high-resolution neuroimaging generate multivariate data that are the basis for understanding the patterns of neural interactions. How to extract directions of information flow in brain networks from these data remains a key challenge. Research over the last few years has identified Granger causality as a statistically principled technique to furnish this capability. The estimation of Granger causality currently requires autoregressive modeling of neural data. Here, we propose a nonparametric approach based on widely used Fourier and wavelet transforms to estimate Granger causality, eliminating the need of explicit autoregressive data modeling. We demonstrate the effectiveness of this approach by applying it to synthetic data generated by network models with known connectivity and to local field potentials recorded from monkeys performing a sensorimotor task. PMID:18394927
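For contrast, the standard autoregressive route that the nonparametric approach eliminates looks roughly as follows; here Granger causality is the log variance ratio of restricted versus full AR fits, and the simulated signals and lag order are hypothetical:

```python
import numpy as np

def lagmat(z, p):
    # rows t = p..T-1, columns z[t-1], ..., z[t-p]
    return np.column_stack([z[p - k:len(z) - k] for k in range(1, p + 1)])

def granger_xy(x, y, p=5):
    """Does the past of x improve prediction of y beyond y's own past?"""
    T = len(y)
    Y = y[p:]
    X_r = np.column_stack([np.ones(T - p), lagmat(y, p)])    # restricted model
    X_f = np.column_stack([X_r, lagmat(x, p)])               # full model
    e_r = Y - X_r @ np.linalg.lstsq(X_r, Y, rcond=None)[0]
    e_f = Y - X_f @ np.linalg.lstsq(X_f, Y, rcond=None)[0]
    return np.log(e_r.var() / e_f.var())                     # Granger causality

rng = np.random.default_rng(0)
x = rng.normal(size=2000)
y = np.zeros(2000)
for t in range(1, 2000):                  # y is driven by the past of x
    y[t] = 0.5 * y[t - 1] + 0.6 * x[t - 1] + 0.1 * rng.normal()
print(granger_xy(x, y), granger_xy(y, x))  # first clearly positive, second near zero
```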
Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference
Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.
2015-01-01
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922
Nonparametric autocovariance estimation from censored time series by Gaussian imputation
Park, Jung Wook; Genton, Marc G.; Ghosh, Sujit K.
2009-01-01
One of the most frequently used methods to model the autocovariance function of a second-order stationary time series is to use the parametric framework of autoregressive and moving average models developed by Box and Jenkins. However, such parametric models, though very flexible, may not always be adequate to model autocovariance functions with sharp changes. Furthermore, if the data do not follow the parametric model and are censored at a certain value, the estimation results may not be reliable. We develop a Gaussian imputation method to estimate an autocovariance structure via nonparametric estimation of the autocovariance function in order to address both censoring and incorrect model specification. We demonstrate the effectiveness of the technique in terms of bias and efficiency with simulations under various rates of censoring and underlying models. We describe its application to a time series of silicon concentrations in the Arctic. PMID:20072705
Analyzing information flow in brain networks with nonparametric Granger causality.
Dhamala, Mukeshwar; Rangarajan, Govindan; Ding, Mingzhou
2008-06-01
Multielectrode neurophysiological recording and high-resolution neuroimaging generate multivariate data that are the basis for understanding the patterns of neural interactions. How to extract directions of information flow in brain networks from these data remains a key challenge. Research over the last few years has identified Granger causality as a statistically principled technique to furnish this capability. The estimation of Granger causality currently requires autoregressive modeling of neural data. Here, we propose a nonparametric approach based on widely used Fourier and wavelet transforms to estimate both pairwise and conditional measures of Granger causality, eliminating the need of explicit autoregressive data modeling. We demonstrate the effectiveness of this approach by applying it to synthetic data generated by network models with known connectivity and to local field potentials recorded from monkeys performing a sensorimotor task.
Pointwise nonparametric maximum likelihood estimator of stochastically ordered survivor functions.
Park, Yongseok; Taylor, Jeremy M G; Kalbfleisch, John D
2012-06-01
In this paper, we consider estimation of survivor functions from groups of observations with right-censored data when the groups are subject to a stochastic ordering constraint. Many methods and algorithms have been proposed to estimate distribution functions under such restrictions, but none have completely satisfactory properties when the observations are censored. We propose a pointwise constrained nonparametric maximum likelihood estimator, which is defined at each time t by the estimates of the survivor functions subject to constraints applied at time t only. We also propose an efficient method to obtain the estimator. The estimator of each constrained survivor function is shown to be nonincreasing in t, and its consistency and asymptotic distribution are established. A simulation study suggests better small and large sample properties than for alternative estimators. An example using prostate cancer data illustrates the method.
A New Powerful Nonparametric Rank Test for Ordered Alternative Problem
Shan, Guogen; Young, Daniel; Kang, Le
2014-01-01
We propose a new nonparametric test for ordered alternative problem based on the rank difference between two observations from different groups. These groups are assumed to be independent from each other. The exact mean and variance of the test statistic under the null distribution are derived, and its asymptotic distribution is proven to be normal. Furthermore, an extensive power comparison between the new test and other commonly used tests shows that the new test is generally more powerful than others under various conditions, including the same type of distribution, and mixed distributions. A real example from an anti-hypertensive drug trial is provided to illustrate the application of the tests. The new test is therefore recommended for use in practice due to easy calculation and substantial power gain. PMID:25405757
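One classic benchmark for the ordered-alternative problem is the Jonckheere-Terpstra test; a minimal sketch, assuming continuous (tie-free) data:

```python
import numpy as np
from scipy.stats import norm

def jonckheere(groups):
    """Jonckheere-Terpstra statistic and normal-approximation p-value
    for an increasing-ordered alternative (assumes no ties)."""
    J = 0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            # count pairs (a from group i, b from group j) with a < b
            J += sum(a < b for a in groups[i] for b in groups[j])
    n = np.array([len(g) for g in groups]); N = n.sum()
    mean = (N**2 - (n**2).sum()) / 4.0
    var = (N**2 * (2 * N + 3) - (n**2 * (2 * n + 3)).sum()) / 72.0
    z = (J - mean) / np.sqrt(var)
    return J, norm.sf(z)                  # one-sided, increasing trend

rng = np.random.default_rng(0)
groups = [rng.normal(loc, 1.0, 15) for loc in (0.0, 0.3, 0.6)]
print(jonckheere(groups))
```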
Nonparametric supervised learning by linear interpolation with maximum entropy.
Gupta, Maya R; Gray, Robert M; Olshen, Richard A
2006-05-01
Nonparametric neighborhood methods for learning entail estimation of class conditional probabilities based on relative frequencies of samples that are "near-neighbors" of a test point. We propose and explore the behavior of a learning algorithm that uses linear interpolation and the principle of maximum entropy (LIME). We consider some theoretical properties of the LIME algorithm: LIME weights have exponential form; the estimates are consistent; and the estimates are robust to additive noise. In relation to bias reduction, we show that near-neighbors contain a test point in their convex hull asymptotically. The common linear interpolation solution used for regression on grids or look-up-tables is shown to solve a related maximum entropy problem. LIME simulation results support use of the method, and performance on a pipeline integrity classification problem demonstrates that the proposed algorithm has practical value.
Landmark Constrained Non-parametric Image Registration with Isotropic Tolerances
NASA Astrophysics Data System (ADS)
Papenberg, Nils; Olesch, Janine; Lange, Thomas; Schlag, Peter M.; Fischer, Bernd
The incorporation of additional user knowledge into a nonrigid registration process is a promising topic in modern registration schemes. The combination of intensity based registration and some interactively chosen landmark pairs is a major approach in this direction. There exist different possibilities to incorporate landmark pairs into a variational non-parametric registration framework. As the interactive localization of point landmarks is always prone to errors, a demand for precise landmark matching is bound to fail. Here, the treatment of the distances of corresponding landmarks as penalties within a constrained optimization problem offers the possibility to control the quality of the matching of each landmark pair individually. More precisely, we introduce inequality constraints, which allow for a sphere-like tolerance around each landmark. We illustrate the performance of this new approach for artificial 2D images as well as for the challenging registration of preoperative CT data to intra-operative 3D ultrasound data of the liver.
Nonparametric Model of Smooth Muscle Force Production During Electrical Stimulation.
Cole, Marc; Eikenberry, Steffen; Kato, Takahide; Sandler, Roman A; Yamashiro, Stanley M; Marmarelis, Vasilis Z
2017-03-01
A nonparametric model of smooth muscle tension response to electrical stimulation was estimated using the Laguerre expansion technique of nonlinear system kernel estimation. The experimental data consisted of force responses of smooth muscle to energy-matched alternating single pulse and burst current stimuli. The burst stimuli led to at least a 10-fold increase in peak force in smooth muscle from Mytilus edulis, despite the constant energy constraint. A linear model did not fit the data, but a second-order model did so accurately, making higher-order models unnecessary. Results showed that smooth muscle force response is not linearly related to the stimulation power.
A nonparametric stochastic optimizer for TDMA-based neuronal signaling.
Suzuki, Junichi; Phan, Dũng H; Budiman, Harry
2014-09-01
This paper considers neurons as a physical communication medium for intrabody networks of nano/micro-scale machines and formulates a noisy multiobjective optimization problem for a Time Division Multiple Access (TDMA) communication protocol atop the physical layer. The problem is to find the Pareto-optimal TDMA configurations that maximize communication performance (e.g., low latency) by multiplexing a given neuronal network to parallelize signal transmissions, while maximizing communication robustness (i.e., unlikeliness of signal interference) against noise in neuronal signaling. Using a nonparametric significance test, the proposed stochastic optimizer is designed to statistically determine the superior-inferior relationship between two given solution candidates and seek the optimal trade-offs among communication performance and robustness objectives. Simulation results show that the proposed optimizer efficiently obtains quality TDMA configurations in noisy environments and outperforms existing noise-aware stochastic optimizers.
Bayesian Nonparametric Shrinkage Applied to Cepheid Star Oscillations.
Berger, James; Jefferys, William; Müller, Peter
2012-01-01
Bayesian nonparametric regression with dependent wavelets has dual shrinkage properties: there is shrinkage through a dependent prior put on functional differences, and shrinkage through the setting of most of the wavelet coefficients to zero through Bayesian variable selection methods. The methodology can deal with unequally spaced data and is efficient because of the existence of fast moves in model space for the MCMC computation. The methodology is illustrated on the problem of modeling the oscillations of Cepheid variable stars; these are a class of pulsating variable stars with the useful property that their periods of variability are strongly correlated with their absolute luminosity. Once this relationship has been calibrated, knowledge of the period gives knowledge of the luminosity. This makes these stars useful as "standard candles" for estimating distances in the universe.
Non-parametric diffeomorphic image registration with the demons algorithm.
Vercauteren, Tom; Pennec, Xavier; Perchant, Aymeric; Ayache, Nicholas
2007-01-01
We propose a non-parametric diffeomorphic image registration algorithm based on Thirion's demons algorithm. The demons algorithm can be seen as an optimization procedure on the entire space of displacement fields. The main idea of our algorithm is to adapt this procedure to a space of diffeomorphic transformations. In contrast to many diffeomorphic registration algorithms, our solution is computationally efficient since in practice it only replaces an addition of free form deformations by a few compositions. Our experiments show that in addition to being diffeomorphic, our algorithm provides results that are similar to the ones from the demons algorithm but with transformations that are much smoother and closer to the true ones in terms of Jacobians.
Non-parametric estimation of spatial variation in relative risk.
Kelsall, J E; Diggle, P J
We consider the problem of estimating the spatial variation in relative risks of two diseases, say, over a geographical region. Using an underlying Poisson point process model, we approach the problem as one of density ratio estimation implemented with a non-parametric kernel smoothing method. In order to assess the significance of any local peaks or troughs in the estimated risk surface, we introduce pointwise tolerance contours which can enhance a greyscale image plot of the estimate. We also propose a Monte Carlo test of the null hypothesis of constant risk over the whole region, to avoid possible over-interpretation of the estimated risk surface. We illustrate the capabilities of the methodology with two epidemiological examples.
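The core of the approach, a kernel density ratio evaluated on a grid, can be sketched directly; the case/control coordinates below are hypothetical, and the tolerance contours and Monte Carlo test are omitted:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Hypothetical case and control locations over a study region (2 x n arrays).
controls = rng.normal(0.0, 1.0, size=(2, 500))
cases = np.column_stack([rng.normal(0.0, 1.0, size=(2, 450)),
                         rng.normal(1.0, 0.3, size=(2, 50))])  # local excess

f1 = gaussian_kde(cases)        # smoothed density of cases
f0 = gaussian_kde(controls)     # smoothed density of controls

# Log relative-risk surface on a grid; peaks suggest locally elevated risk.
xx, yy = np.meshgrid(np.linspace(-3, 3, 60), np.linspace(-3, 3, 60))
grid = np.vstack([xx.ravel(), yy.ravel()])
log_rr = np.log(f1(grid) / f0(grid)).reshape(xx.shape)
print(log_rr.max())
```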
Analyzing multiple spike trains with nonparametric Granger causality.
Nedungadi, Aatira G; Rangarajan, Govindan; Jain, Neeraj; Ding, Mingzhou
2009-08-01
Simultaneous recordings of spike trains from multiple single neurons are becoming commonplace. Understanding the interaction patterns among these spike trains remains a key research area. A question of interest is the evaluation of information flow between neurons through the analysis of whether one spike train exerts causal influence on another. For continuous-valued time series data, Granger causality has proven an effective method for this purpose. However, the basis for Granger causality estimation is autoregressive data modeling, which is not directly applicable to spike trains. Various filtering options distort the properties of spike trains as point processes. Here we propose a new nonparametric approach to estimate Granger causality directly from the Fourier transforms of spike train data. We validate the method on synthetic spike trains generated by model networks of neurons with known connectivity patterns and then apply it to neurons simultaneously recorded from the thalamus and the primary somatosensory cortex of a squirrel monkey undergoing tactile stimulation.
Binary Classifier Calibration Using a Bayesian Non-Parametric Approach.
Naeini, Mahdi Pakdaman; Cooper, Gregory F; Hauskrecht, Milos
Learning probabilistic predictive models that are well calibrated is critical for many prediction and decision-making tasks in data mining. This paper presents two new non-parametric methods for calibrating outputs of binary classification models: a method based on the Bayes optimal selection and a method based on the Bayesian model averaging. The advantage of these methods is that they are independent of the algorithm used to learn a predictive model, and they can be applied in a post-processing step, after the model is learned. This makes them applicable to a wide variety of machine learning models and methods. These calibration methods, as well as other methods, are tested on a variety of datasets in terms of both discrimination and calibration performance. The results show the methods either outperform or are comparable in performance to the state-of-the-art calibration methods.
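For reference, a standard non-parametric calibration baseline (isotonic regression, not the Bayesian methods proposed here) takes only a few lines:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 4000
score = rng.uniform(size=n)                 # raw classifier output
p_true = score**3                           # miscalibrated: true P(y=1|score)
y = rng.binomial(1, p_true)

# Fit a monotone map from scores to calibrated probabilities.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso.fit(score, y)
print(iso.predict([0.5, 0.9]))              # roughly 0.125 and 0.729
```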
Hyperspectral image segmentation using a cooperative nonparametric approach
NASA Astrophysics Data System (ADS)
Taher, Akar; Chehdi, Kacem; Cariou, Claude
2013-10-01
In this paper a new unsupervised nonparametric cooperative and adaptive hyperspectral image segmentation approach is presented. The hyperspectral images are partitioned band by band in parallel, and intermediate classification results are evaluated and fused to obtain the final segmentation result. Two unsupervised nonparametric segmentation methods are used in parallel cooperation, namely the Fuzzy C-means (FCM) method and the Linde-Buzo-Gray (LBG) algorithm, to segment each band of the image. The originality of the approach relies firstly on its local adaptation to the type of regions in an image (textured, non-textured), and secondly on the introduction of several levels of evaluation and validation of intermediate segmentation results before obtaining the final partitioning of the image. For the management of similar or conflicting results issued from the two classification methods, we gradually introduce various assessment steps that exploit the information of each spectral band and its adjacent bands, and finally the information of all the spectral bands. In our approach, the detected textured and non-textured regions are treated separately from the feature extraction step up to the final classification results. This approach was first evaluated on a large number of monocomponent images constructed from the Brodatz album. It was then evaluated on two real applications, using a multispectral image for cedar tree detection in the region of Baabdat (Lebanon) and a hyperspectral image for identification of invasive and non-invasive vegetation in the region of Cieza (Spain). The correct classification rate (CCR) for the first application is over 97%, and for the second application the average correct classification rate (ACCR) is over 99%.
Non-parametric transient classification using adaptive wavelets
NASA Astrophysics Data System (ADS)
Varughese, Melvin M.; von Sachs, Rainer; Stephanou, Michael; Bassett, Bruce A.
2015-11-01
Classifying transients based on multiband light curves is a challenging but crucial problem in the era of Gaia and the Large Synoptic Survey Telescope, since the sheer volume of transients will make spectroscopic classification infeasible. We present a non-parametric classifier that predicts the transient's class given training data. It implements two novel components: the use of the BAGIDIS wavelet methodology - a characterization of functional data using hierarchical wavelet coefficients - as well as the introduction of a ranked probability classifier on the wavelet coefficients that handles both the heteroscedasticity of the data and the potential non-representativity of the training set. The classifier is simple to implement, while a major advantage of the BAGIDIS wavelets is that they are translation invariant; hence, BAGIDIS does not need the light curves to be aligned to extract features. Further, BAGIDIS is non-parametric, so it can be used effectively in blind searches for new objects. We demonstrate the effectiveness of our classifier on the Supernova Photometric Classification Challenge, classifying supernova light curves as Type Ia or non-Ia. We train our classifier on the spectroscopically confirmed subsample (which is not representative) and show that it works well for supernovae with observed light-curve time spans greater than 100 d (roughly 55 per cent of the data set). For such data, we obtain a Ia efficiency of 80.5 per cent and a purity of 82.4 per cent, yielding a highly competitive challenge score of 0.49. This indicates that our 'model-blind' approach may be particularly suitable for the general classification of astronomical transients in the era of large synoptic sky surveys.
Song, Dong; Wang, Zhuo; Marmarelis, Vasilis Z; Berger, Theodore W
2009-02-01
This paper presents a synergistic parametric and non-parametric modeling study of short-term plasticity (STP) in the Schaffer collateral to hippocampal CA1 pyramidal neuron (SC) synapse. Parametric models in the form of sets of differential and algebraic equations have been proposed on the basis of the current understanding of biological mechanisms active within the system. Non-parametric Poisson-Volterra models are obtained herein from broadband experimental input-output data. The non-parametric model is shown to provide better prediction of the experimental output than a parametric model with a single set of facilitation/depression (FD) processes. The parametric model is then validated in terms of its input-output transformational properties using the non-parametric model, since the latter constitutes a canonical and more complete representation of the synaptic nonlinear dynamics. Furthermore, discrepancies between the experimentally derived non-parametric model and the equivalent non-parametric model of the parametric model suggest the presence of multiple FD processes in the SC synapses. Inclusion of an additional set of FD processes in the parametric model makes it replicate better the characteristics of the experimentally derived non-parametric model. This improved parametric model in turn provides the requisite biological interpretability that the non-parametric model lacks.
ERIC Educational Resources Information Center
Lee, Young-Sun; Wollack, James A.; Douglas, Jeffrey
2009-01-01
The purpose of this study was to assess the model fit of a 2PL through comparison with the nonparametric item characteristic curve (ICC) estimation procedures. Results indicate that three nonparametric procedures implemented produced ICCs that are similar to that of the 2PL for items simulated to fit the 2PL. However for misfitting items,…
Out-of-Sample Extensions for Non-Parametric Kernel Methods.
Pan, Binbin; Chen, Wen-Sheng; Chen, Bo; Xu, Chen; Lai, Jianhuang
2017-02-01
Choosing suitable kernels plays an important role in the performance of kernel methods. Recently, a number of studies were devoted to developing nonparametric kernels. Without assuming any parametric form of the target kernel, nonparametric kernel learning offers a flexible scheme to utilize the information of the data, which may potentially characterize the data similarity better. The kernel methods using nonparametric kernels are referred to as nonparametric kernel methods. However, many nonparametric kernel methods are restricted to transductive learning, where the prediction function is defined only over the data points given beforehand. They have no straightforward extension for the out-of-sample data points, and thus cannot be applied to inductive learning. In this paper, we show how to make the nonparametric kernel methods applicable to inductive learning. The key problem of out-of-sample extension is how to extend the nonparametric kernel matrix to the corresponding kernel function. A regression approach in the hyper reproducing kernel Hilbert space is proposed to solve this problem. Empirical results indicate that the out-of-sample performance is comparable to the in-sample performance in most cases. Experiments on face recognition demonstrate the superiority of our nonparametric kernel method over the state-of-the-art parametric kernel methods.
Gosho, Masahiko; Hirakawa, Akihiro; Noma, Hisashi; Maruo, Kazushi; Sato, Yasunori
2015-08-11
In longitudinal clinical trials, some subjects will drop out before completing the trial, so their measurements towards the end of the trial are not obtained. Mixed-effects models for repeated measures (MMRM) analysis with "unstructured" (UN) covariance structure are increasingly common as a primary analysis for group comparisons in these trials. Furthermore, model-based covariance estimators have been routinely used for testing the group difference and estimating confidence intervals of the difference in the MMRM analysis using the UN covariance. However, using the MMRM analysis with the UN covariance could lead to convergence problems for numerical optimization, especially in trials with a small-sample size. Although the so-called sandwich covariance estimator is robust to misspecification of the covariance structure, its performance deteriorates in settings with small-sample size. We investigated the performance of the sandwich covariance estimator and covariance estimators adjusted for small-sample bias proposed by Kauermann and Carroll (J Am Stat Assoc 2001; 96: 1387-1396) and Mancl and DeRouen (Biometrics 2001; 57: 126-134) fitting simpler covariance structures through a simulation study. In terms of the type 1 error rate and coverage probability of confidence intervals, Mancl and DeRouen's covariance estimator with compound symmetry, first-order autoregressive (AR(1)), heterogeneous AR(1), and antedependence structures performed better than the original sandwich estimator and Kauermann and Carroll's estimator with these structures in the scenarios where the variance increased across visits. The performance based on Mancl and DeRouen's estimator with these structures was nearly equivalent to that based on the Kenward-Roger method for adjusting the standard errors and degrees of freedom with the UN structure. The model-based covariance estimator with the UN structure under unadjustment of the degrees of freedom, which is frequently used in applications
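The flavor of these estimators is easiest to see in the simpler OLS setting, where HC0 is the basic sandwich and HC3 inflates residuals by the leverages, analogous in spirit to the Mancl and DeRouen small-sample correction for GEE; the data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 25, 3                                    # deliberately small sample
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
e = rng.normal(size=n) * (1 + np.abs(X[:, 1]))  # heteroscedastic errors
y = X @ np.array([1.0, 0.5, -0.5]) + e

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)
h = np.diag(X @ XtX_inv @ X.T)                  # leverages (hat-matrix diagonal)

# HC0: basic sandwich; HC3: residuals inflated by (1 - h)^-1 before squaring,
# which counteracts the small-sample downward bias of the sandwich.
meat_hc0 = X.T @ (r[:, None] ** 2 * X)
meat_hc3 = X.T @ ((r / (1 - h))[:, None] ** 2 * X)
cov_hc0 = XtX_inv @ meat_hc0 @ XtX_inv
cov_hc3 = XtX_inv @ meat_hc3 @ XtX_inv
print(np.sqrt(np.diag(cov_hc0)), np.sqrt(np.diag(cov_hc3)))
```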
Using Incidence Sampling to Estimate Covariances.
ERIC Educational Resources Information Center
Knapp, Thomas R.
1979-01-01
This paper presents the generalized symmetric means approach to the estimation of population covariances, complete with derivations and examples. Particular attention is paid to the problem of missing data, which is handled very naturally in the incidence sampling framework. (CTM)
Conformally covariant parametrizations for relativistic initial data
NASA Astrophysics Data System (ADS)
Delay, Erwann
2017-01-01
We revisit the Lichnerowicz-York method, and an alternative method of York, in order to obtain some conformally covariant systems. This type of parametrization is certainly more natural for non-constant mean curvature initial data.
Earth Observing System Covariance Realism Updates
NASA Technical Reports Server (NTRS)
Ojeda Romero, Juan A.; Miguel, Fred
2017-01-01
This presentation will be given at the International Earth Science Constellation Mission Operations Working Group meetings June 13-15, 2017 to discuss the Earth Observing System Covariance Realism updates.
Covariance Spectroscopy for Fissile Material Detection
Rusty Trainham, Jim Tinsley, Paul Hurley, Ray Keegan
2009-06-02
Nuclear fission produces multiple prompt neutrons and gammas at each fission event. The resulting daughter nuclei continue to emit delayed radiation as neutrons boil off, beta decay occurs, etc. All of the radiations are causally connected, and therefore correlated. The correlations are generally positive, but when different decay channels compete, so that some radiations tend to exclude others, negative correlations could also be observed. A similar problem of reduced complexity is that of cascade radiation, whereby a simple radioactive decay produces two or more correlated gamma rays at each decay. Covariance is the usual means for measuring correlation, and techniques of covariance mapping may be useful to produce distinct signatures of special nuclear materials (SNM). A covariance measurement can also be used to filter data streams because uncorrelated signals are largely rejected. The technique is generally more effective than a coincidence measurement. In this poster, we concentrate on cascades and the covariance filtering problem.
Covariation bias in panic-prone individuals.
Pauli, P; Montoya, P; Martz, G E
1996-11-01
Covariation estimates between fear-relevant (FR; emergency situations) or fear-irrelevant (FI; mushrooms and nudes) stimuli and an aversive outcome (electrical shock) were examined in 10 high-fear (panic-prone) and 10 low-fear respondents. When the relation between slide category and outcome was random (illusory correlation), only high-fear participants markedly overestimated the contingency between FR slides and shocks. However, when there was a high contingency of shocks following FR stimuli (83%) and a low contingency of shocks following FI stimuli (17%), the group difference vanished. Reversal of contingencies back to random induced a covariation bias for FR slides in high- and low-fear respondents. Results indicate that panic-prone respondents show a covariation bias for FR stimuli and that the experience of a high contingency between FR slides and aversive outcomes may foster such a covariation bias even in low-fear respondents.
On corrected score approach for proportional hazards model with covariate measurement error.
Song, Xiao; Huang, Yijian
2005-09-01
In the presence of covariate measurement error with the proportional hazards model, several functional modeling methods have been proposed. These include the conditional score estimator (Tsiatis and Davidian, 2001, Biometrika 88, 447-458), the parametric correction estimator (Nakamura, 1992, Biometrics 48, 829-838), and the nonparametric correction estimator (Huang and Wang, 2000, Journal of the American Statistical Association 95, 1209-1219) in the order of weaker assumptions on the error. Although they are all consistent, each suffers from potential difficulties with small samples and substantial measurement error. In this article, upon noting that the conditional score and parametric correction estimators are asymptotically equivalent in the case of normal error, we investigate their relative finite sample performance and discover that the former is superior. This finding motivates a general refinement approach to parametric and nonparametric correction methods. The refined correction estimators are asymptotically equivalent to their standard counterparts, but have improved numerical properties and perform better when the standard estimates do not exist or are outliers. Simulation results and application to an HIV clinical trial are presented.
Noncommutative Gauge Theory with Covariant Star Product
Zet, G.
2010-08-04
We present a noncommutative gauge theory with covariant star product on a space-time with torsion. In order to obtain the covariant star product one imposes some restrictions on the connection of the space-time. Then, a noncommutative gauge theory is developed applying this product to the case of differential forms. Some comments on the advantages of using a space-time with torsion to describe the gravitational field are also given.
Breeding curvature from extended gauge covariance
NASA Astrophysics Data System (ADS)
Aldrovandi, R.
1991-05-01
Independence between spacetime and "internal" space in gauge theories is related to the adjoint-covariant behaviour of the gauge potential. The usual gauge scheme is modified to allow a coupling between both spaces. Gauging spacetime translations produces field equations similar to Einstein equations. A curvature-like quantity of mixed differential-algebraic character emerges. Enlarged conservation laws are present, pointing to the presence of an extended covariance.
Covariate analysis of bivariate survival data
Bennett, L.E.
1992-01-01
The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey was analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models will be compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.
Covariant action for type IIB supergravity
NASA Astrophysics Data System (ADS)
Sen, Ashoke
2016-07-01
Taking clues from the recent construction of the covariant action for type II and heterotic string field theories, we construct a manifestly Lorentz covariant action for type IIB supergravity, and discuss its gauge fixing maintaining manifest Lorentz invariance. The action contains a (non-gravitating) free 4-form field besides the usual fields of type IIB supergravity. This free field, being completely decoupled from the interacting sector, has no physical consequence.
Vector Meson Property in Covariant Classification Scheme
NASA Astrophysics Data System (ADS)
Oda, Masuho
2004-08-01
Recently our collaboration group has proposed a covariant classification scheme of hadrons, leading to the possible existence of two ground-state vector meson nonets. One corresponds to the ordinary ρ nonet and the other is an extra ρ nonet. We investigate the decay properties of ω(1250) and ρ(1250) in the covariant classification scheme, and it is shown that ω(1250) is a promising candidate for our extra ω meson.
Phase-covariant quantum cloning of qudits
Fan Heng; Imai, Hiroshi; Matsumoto, Keiji; Wang, Xiang-Bin
2003-02-01
We study the phase-covariant quantum cloning machine for qudits, i.e., the input states in a d-level quantum system have complex coefficients with arbitrary phase but constant module. A cloning unitary transformation is proposed. After optimizing the fidelity between input state and single qudit reduced density operator of output state, we obtain the optimal fidelity for 1 to 2 phase-covariant quantum cloning of qudits and the corresponding cloning transformation.
Non-parametric three-way mixed ANOVA with aligned rank tests.
Oliver-Rodríguez, Juan C; Wang, X T
2015-02-01
Research problems that require a non-parametric analysis of multifactor designs with repeated measures arise in the behavioural sciences. There is, however, a lack of available procedures in commonly used statistical packages. In the present study, a generalization of the aligned rank test for the two-way interaction is proposed for the analysis of the typical sources of variation in a three-way analysis of variance (ANOVA) with repeated measures. It can be implemented in the usual statistical packages. Its statistical properties are tested by using simulation methods with two sample sizes (n = 30 and n = 10) and three distributions (normal, exponential and double exponential). Results indicate substantial increases in power for non-normal distributions in comparison with the usual parametric tests. Similar levels of Type I error for both parametric and aligned rank ANOVA were obtained with non-normal distributions and large sample sizes. Degrees-of-freedom adjustments for Type I error control in small samples are proposed. The procedure is applied to a case study with 30 participants per group where it detects gender differences in linguistic abilities in blind children not shown previously by other methods.
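The align-then-rank idea can be sketched for the simpler two-way between-subjects case (not the paper's three-way repeated-measures procedure): observations are aligned by subtracting estimated main effects, ranked jointly, and the interaction F statistic is computed on the ranks.

```python
import numpy as np
from scipy.stats import rankdata, f as f_dist

rng = np.random.default_rng(1)
a, b, n = 3, 2, 10                       # factor levels and cell size (balanced)
y = rng.exponential(1.0, size=(a, b, n)) # non-normal data
y[1] += 0.8                              # main effect of factor A only

# Align for the interaction test: remove the estimated main effects.
grand = y.mean()
eff_a = y.mean(axis=(1, 2), keepdims=True) - grand
eff_b = y.mean(axis=(0, 2), keepdims=True) - grand
aligned = y - eff_a - eff_b

# Rank all aligned observations jointly, then compute the usual balanced
# two-way ANOVA interaction F statistic on the ranks.
r = rankdata(aligned).reshape(a, b, n)
rbar, ra, rb, rab = r.mean(), r.mean((1, 2)), r.mean((0, 2)), r.mean(2)
ss_ab = n * ((rab - ra[:, None] - rb[None, :] + rbar) ** 2).sum()
ss_e = ((r - rab[:, :, None]) ** 2).sum()
df_ab, df_e = (a - 1) * (b - 1), a * b * (n - 1)
F = (ss_ab / df_ab) / (ss_e / df_e)
print(f"aligned-rank interaction F={F:.2f}, p={f_dist.sf(F, df_ab, df_e):.3f}")
```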
A nonparametric vs. latent class model of general practitioner utilization: evidence from Canada.
McLeod, Logan
2011-12-01
Predicting health care utilization is the foundation of many health economics analyses, such as calculating risk-adjustment capitation payments or measuring equity in health care utilization. The most common econometric models of physician utilization are parametric count data models, since the most common metric of physician utilization is the number of physician visits. This paper makes two distinct contributions to the literature analyzing GP utilization: (i) it is the first to use a nonparametric kernel conditional density estimator to model GP utilization and compare the predicted utilization with that from a latent class negative binomial model; and (ii) it uses panel data to control for the potential endogeneity between self-reported health status and the number of GP visits. The goodness-of-fit results show the kernel conditional density estimator provides a better fit to the observed distribution of GP visits than the latent class negative binomial model. There are some meaningful differences in how the predicted conditional mean number of GP visits changes with a change in an individual's characteristics, called the incremental effect (IE), between the kernel conditional density estimator and the latent class negative binomial model. The most notable differences are observed in the right tail of the distribution where the IEs from the latent class negative binomial model are up to 190 times the magnitude of the IEs from the kernel conditional density estimator.
Low-dimensional Representation of Error Covariance
NASA Technical Reports Server (NTRS)
Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan
2000-01-01
Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.
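The forecast/analysis cycle can be reproduced numerically by iterating the Kalman filter to steady state and inspecting the eigenvalue spectrum of the analysis error covariance; the toy linear system below is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40
A = 0.95 * np.linalg.qr(rng.normal(size=(n, n)))[0]    # stable dynamics
Q = 0.01 * np.eye(n)                                   # model error covariance
H = np.zeros((3, n)); H[0, 0] = H[1, 1] = H[2, 2] = 1  # observe 3 components
R = 0.1 * np.eye(3)                                    # observation error covariance

# Iterate the forecast/analysis cycle to the steady-state analysis covariance.
P = np.eye(n)
for _ in range(2000):
    Pf = A @ P @ A.T + Q                               # forecast step
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)     # Kalman gain
    P = (np.eye(n) - K @ H) @ Pf                       # analysis step

w = np.linalg.eigvalsh(P)[::-1]
print(w[:8] / w.sum())   # a steep spectrum supports a low-dimensional representation
```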
Generating Covariance Data with Nuclear Models
NASA Astrophysics Data System (ADS)
Koning, A. J.
2006-04-01
A reliable assessment of the uncertainties in calculated integral reactor parameters depends directly on the uncertainties of the underlying nuclear data. Unfortunately, covariance nuclear data are scarce, not only because a significant experimental database for the isotope under consideration must be available, but also because the covariance evaluation process can be rather complex and time-consuming. We attack this problem with a systematic approach and developed, following the initial ideas of D. Smith (ANL), a method to produce a complete covariance matrix for evaluated data files on the basis of uncertainties of nuclear model parameters. This is accomplished by subjecting the nuclear model code TALYS to a Monte Carlo method for perturbing input parameters, an approach that is now possible with the available computer power. After establishing uncertainties for parameters of the optical model, level densities, gamma-ray strength functions, fission barriers, etc., we produce random input files for the TALYS code. Provided enough calculations (samples) are performed, these deliver uncertainties and all off-diagonal elements for all open reaction channels. The uncertainties of the nuclear model parameters are tuned such that the calculated cross-section uncertainties coincide, to a reasonable extent, with uncertainties obtained from covariance evaluations based on experimental data. If this method proves to be successful, and we will show here that we are not too far off, it will enable mass production of credible covariance data for isotopes for which no covariance data exist, and these constitute a very significant part of the periodic table of elements.
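A toy version of this Monte Carlo procedure, with a two-parameter function standing in for a TALYS calculation, shows how parameter sampling yields a full covariance matrix including off-diagonal elements:

```python
import numpy as np

rng = np.random.default_rng(0)
E = np.linspace(0.1, 5.0, 50)                 # energy grid (MeV), hypothetical

def cross_section(E, a, b):
    """Toy cross-section model standing in for a nuclear model calculation."""
    return a * np.exp(-E / b)

# Sample model parameters from their assumed uncertainties...
a = rng.normal(2.0, 0.1, size=1000)           # 5% uncertainty on magnitude
b = rng.normal(1.5, 0.15, size=1000)          # 10% uncertainty on slope

# ...and propagate each sample through the model.
samples = np.array([cross_section(E, ai, bi) for ai, bi in zip(a, b)])

# The sample covariance over the energy grid gives the full matrix,
# off-diagonal (energy-energy correlation) elements included.
cov = np.cov(samples, rowvar=False)           # shape (50, 50)
corr = cov / np.sqrt(np.outer(np.diag(cov), np.diag(cov)))
print(np.sqrt(np.diag(cov))[:3], corr[0, -1])
```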
Lorentz covariance of loop quantum gravity
NASA Astrophysics Data System (ADS)
Rovelli, Carlo; Speziale, Simone
2011-05-01
The kinematics of loop gravity can be given a manifestly Lorentz-covariant formulation: the conventional SU(2)-spin-network Hilbert space can be mapped to a space K of SL(2,C) functions, where Lorentz covariance is manifest. K can be described in terms of a certain subset of the projected spin networks studied by Livine, Alexandrov and Dupuis. It is formed by SL(2,C) functions completely determined by their restriction on SU(2). These are square-integrable in the SU(2) scalar product, but not in the SL(2,C) one. Thus, SU(2)-spin-network states can be represented by Lorentz-covariant SL(2,C) functions, as two-component photons can be described in the Lorentz-covariant Gupta-Bleuler formalism. As shown by Wolfgang Wieland in a related paper, this manifestly Lorentz-covariant formulation can also be directly obtained from canonical quantization. We show that the spinfoam dynamics of loop quantum gravity is locally SL(2,C)-invariant in the bulk, and yields states that are precisely in K on the boundary. This clarifies how the SL(2,C) spinfoam formalism yields an SU(2) theory on the boundary. These structures define a tidy Lorentz-covariant formalism for loop gravity.
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations in small samples. We used a pooled-resampling nonparametric bootstrap test that may overcome the problems that small samples pose for hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling to the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means than the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining the type I error probability under all conditions except the Cauchy and extremely variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than the other alternatives, and the nonparametric bootstrap test provided a benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling for comparing paired or unpaired means and for validating one-way analysis of variance results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
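A hedged sketch of the pooled-resampling idea, not the authors' code: under the null hypothesis the two groups share one distribution, so both bootstrap samples are drawn from the pooled data and the observed t statistic is referred to the resulting null distribution. Sample sizes and distributions below are illustrative.

```python
import numpy as np

def pooled_bootstrap_t_test(x, y, n_boot=10000, seed=0):
    """Two-sample mean comparison via bootstrap resampling from the
    pooled sample (a sketch of the pooled-resampling idea)."""
    rng = np.random.default_rng(seed)
    def t_stat(a, b):
        return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1)/len(a) + b.var(ddof=1)/len(b))
    t_obs = t_stat(x, y)
    pooled = np.concatenate([x, y])          # pooling enforces the null hypothesis
    t_null = np.empty(n_boot)
    for i in range(n_boot):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        t_null[i] = t_stat(xb, yb)
    return np.mean(np.abs(t_null) >= abs(t_obs))   # two-sided p-value

rng = np.random.default_rng(42)
x = rng.lognormal(0.0, 1.0, size=8)          # small, skewed samples
y = rng.lognormal(0.5, 1.0, size=8)
print(pooled_bootstrap_t_test(x, y))
```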
A Machine Learning Framework for Plan Payment Risk Adjustment.
Rose, Sherri
2016-12-01
To introduce cross-validation and a nonparametric machine learning framework for plan payment risk adjustment and then assess whether they have the potential to improve risk adjustment. 2011-2012 Truven MarketScan database. We compare the performance of multiple statistical approaches within a broad machine learning framework for estimation of risk adjustment formulas. Total annual expenditure was predicted using age, sex, geography, inpatient diagnoses, and hierarchical condition category variables. The methods included regression, penalized regression, decision trees, neural networks, and an ensemble super learner, all in concert with screening algorithms that reduce the set of variables considered. The performance of these methods was compared based on cross-validated R². Our results indicate that a simplified risk adjustment formula selected via this nonparametric framework maintains much of the efficiency of a traditional larger formula. The ensemble approach also outperformed classical regression and all other algorithms studied. The implementation of cross-validated machine learning techniques provides novel insight into risk adjustment estimation, possibly allowing for a simplified formula, thereby reducing incentives for increased coding intensity as well as the ability of insurers to "game" the system with aggressive diagnostic upcoding. © Health Research and Educational Trust.
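The model-comparison step can be illustrated with scikit-learn: several candidate learners scored by cross-validated R² on synthetic claims-like data. The MarketScan data are proprietary and the full super learner ensemble is not reproduced here; this is only a sketch of the cross-validated scoring loop.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, LassoCV
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for plan-payment data (expenditure vs. risk adjusters).
X, y = make_regression(n_samples=2000, n_features=50, n_informative=10,
                       noise=25.0, random_state=0)

candidates = {
    "ols": LinearRegression(),
    "lasso": LassoCV(cv=5),
    "forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
for name, model in candidates.items():
    r2 = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: cross-validated R^2 = {r2.mean():.3f}")
```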
Jiang, Honghua; Kulkarni, Pandurang M; Mallinckrodt, Craig H; Shurzinske, Linda; Molenberghs, Geert; Lipkovich, Ilya
2015-01-01
The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included chi-square, Fisher's exact test, covariate adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), and covariate adjusted/unadjusted generalized linear mixed models (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi-square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative test regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. Copyright © 2015 John Wiley & Sons, Ltd.
Structuring feature space: a non-parametric method for volumetric transfer function generation.
Maciejewski, Ross; Woo, Insoo; Chen, Wei; Ebert, David S
2009-01-01
The use of multi-dimensional transfer functions for direct volume rendering has been shown to be an effective means of extracting materials and their boundaries for both scalar and multivariate data. The most common multi-dimensional transfer function consists of a two-dimensional (2D) histogram with axes representing a subset of the feature space (e.g., value vs. value gradient magnitude), with each entry in the 2D histogram being the number of voxels at a given feature space pair. Users then assign color and opacity to the voxel distributions within the given feature space through the use of interactive widgets (e.g., box, circular, triangular selection). Unfortunately, such tools lead users through a trial-and-error approach as they assess which data values within the feature space map to a given area of interest within the volumetric space. In this work, we propose the addition of non-parametric clustering within the transfer function feature space in order to extract patterns and guide transfer function generation. We apply a non-parametric kernel density estimation to group voxels of similar features within the 2D histogram. These groups are then binned and colored based on their estimated density, and the user may interactively grow and shrink the binned regions to explore feature boundaries and extract regions of interest. We also extend this scheme to temporal volumetric data in which time steps of 2D histograms are composited into a histogram volume. A three-dimensional (3D) density estimation is then applied, and users can explore regions within the feature space across time without adjusting the transfer function at each time step. Our work enables users to effectively explore the structures found within a feature space of the volume and provide a context in which the user can understand how these structures relate to their volumetric data. We provide tools for enhanced exploration and manipulation of the transfer function, and we show that the initial
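A small sketch of the density-estimation step, assuming SciPy: a Gaussian kernel density estimate over synthetic (value, gradient-magnitude) pairs, with voxels binned by estimated density as candidate material groups. The thresholds, grid, and data are illustrative, not the paper's pipeline.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Stand-in feature-space samples: (scalar value, gradient magnitude) per voxel.
values = np.concatenate([rng.normal(0.3, 0.05, 2000), rng.normal(0.7, 0.05, 2000)])
grads = np.concatenate([rng.normal(0.1, 0.03, 2000), rng.normal(0.4, 0.05, 2000)])
features = np.vstack([values, grads])

kde = gaussian_kde(features)            # non-parametric density over the 2D histogram axes
xx, yy = np.mgrid[0:1:64j, 0:1:64j]
density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)

# Bin feature-space locations by density: high-density bins become candidate
# material groups to which color/opacity could be assigned interactively.
labels = np.digitize(density, np.quantile(density, [0.5, 0.8, 0.95]))
print(np.bincount(labels.ravel()))
```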
Confidence interval of difference of proportions in logistic regression in presence of covariates.
Reeve, Russell
2016-03-16
Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, the proportion is often affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined and the sampling distribution is more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.
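One resampling-based sketch of the adjusted comparison, assuming statsmodels and not necessarily matching any of the paper's four procedures: marginal standardization of the logistic fit gives an adjusted risk difference, with a bootstrap percentile confidence interval. Data and effect sizes are synthetic.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 400
covariate = rng.normal(size=n)
treat = rng.integers(0, 2, size=n)
logit_p = -0.5 + 1.0 * treat + 0.8 * covariate
y = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

def adjusted_risk_difference(y, treat, covariate):
    """Covariate-adjusted difference of proportions via marginal standardization."""
    X = sm.add_constant(np.column_stack([treat, covariate]))
    fit = sm.Logit(y, X).fit(disp=0)
    X1 = X.copy(); X1[:, 1] = 1        # everyone treated
    X0 = X.copy(); X0[:, 1] = 0        # everyone control
    return fit.predict(X1).mean() - fit.predict(X0).mean()

est = adjusted_risk_difference(y, treat, covariate)
boot = [adjusted_risk_difference(y[idx], treat[idx], covariate[idx])
        for idx in (rng.integers(0, n, size=n) for _ in range(1000))]
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"adjusted risk difference {est:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```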
Covariation in the human masticatory apparatus.
Noback, Marlijn L; Harvati, Katerina
2015-01-01
Many studies have described shape variation of the modern human cranium in relation to subsistence; however, patterns of covariation within the masticatory apparatus (MA) remain largely unexplored. The patterns and intensity of shape covariation, and how these are related to diet, are essential for understanding the evolution of functional masticatory adaptations of the human cranium. Within a worldwide sample (n = 255) of 15 populations with different modes of subsistence, we use partial least squares analysis to study the relationships between three components of the MA: upper dental arch, masseter muscle, and temporalis muscle attachments. We show that the shape of the masseter muscle and the shape of the temporalis muscle clearly covary with one another, but that the shape of the dental arch seems to be rather independent of the masticatory muscles. In contrast, when the relative positioning, orientation, and size of the masticatory components are included in the analysis, the dental arch shows the highest covariation with the other cranial parts, indicating that these additional factors are more important than shape alone with regard to covariation within the MA. Covariation patterns among these cranial regions differ mainly between hunting-fishing and gathering-agriculture groups, possibly relating to greater masticatory strains resulting from a large meat component in the diet. High-strain groups show stronger covariation between upper dental arch and masticatory muscle shape when compared with low-strain groups. These results help to provide a clearer understanding of constraints and interlinkage of shape variation within the human MA and allow for more realistic modeling and predictions in future biomechanical studies. © 2014 Wiley Periodicals, Inc.
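The two-block partial least squares computation at the core of such covariation analyses reduces to an SVD of the cross-covariance matrix between centered blocks. A sketch on synthetic "shape" blocks (the landmark data themselves are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 255
# Stand-in landmark-shape blocks, e.g., dental-arch vs. masseter coordinates.
latent = rng.normal(size=(n, 1))
block_a = latent @ rng.normal(size=(1, 12)) + 0.5 * rng.normal(size=(n, 12))
block_b = latent @ rng.normal(size=(1, 8)) + 0.5 * rng.normal(size=(n, 8))

# Two-block PLS: SVD of the cross-covariance matrix between centered blocks.
A = block_a - block_a.mean(axis=0)
B = block_b - block_b.mean(axis=0)
U, s, Vt = np.linalg.svd(A.T @ B / (n - 1), full_matrices=False)

# Share of total squared cross-covariance carried by the first singular pair:
print("first pair explains", (s[0]**2 / np.sum(s**2)).round(3))
# Correlation between the first pair of PLS scores:
print(np.corrcoef(A @ U[:, 0], B @ Vt[0])[0, 1].round(3))
```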
Kovanda, Laura L; Desai, Amit V; Lu, Qiaoyang; Townsend, Robert W; Akhtar, Shahzad; Bonate, Peter; Hope, William W
2016-08-01
Isavuconazonium sulfate (Cresemba; Astellas Pharma Inc.), a water-soluble prodrug of the triazole antifungal agent isavuconazole, is available for the treatment of invasive aspergillosis (IA) and invasive mucormycosis. A population pharmacokinetic (PPK) model was constructed using nonparametric estimation to compare the pharmacokinetic (PK) behaviors of isavuconazole in patients treated in the phase 3 VITAL open-label clinical trial, which evaluated the efficacy and safety of the drug for treatment of renally impaired IA patients and patients with invasive fungal disease (IFD) caused by emerging molds, yeasts, and dimorphic fungi. Covariates examined were body mass index (BMI), weight, race, impact of estimated glomerular filtration rate (eGFR) on clearance (CL), and impact of weight on volume. PK parameters were compared based on IFD type and other patient characteristics. Simulations were performed to describe the MICs covered by the clinical dosing regimen. Concentrations (n = 458) from 136 patients were used to construct a 2-compartment model (first-order absorption compartment and central compartment). Weight-related covariates affected clearance, but eGFR did not. PK parameters and intersubject variability of CL were similar across different IFD groups and populations. Target attainment analyses demonstrated that the clinical dosing regimen would be sufficient for total drug area under the concentration-time curve (AUC)/MIC targets ranging from 50.5 for Aspergillus spp. (up to the CLSI MIC of 0.5 mg/liter) to 270 and 5,053 for Candida albicans (up to MICs of 0.125 and 0.004 mg/liter, respectively) and 312 for non-albicans Candida spp. (up to a MIC of 0.125 mg/liter). The estimations for Candida spp. were exploratory considering that no patients with Candida infections were included in the current analyses. (The VITAL trial is registered at ClinicalTrials.gov under number NCT00634049.).
Convex Banding of the Covariance Matrix
Bien, Jacob; Bunea, Florentina; Xiao, Luo
2016-01-01
We introduce a new sparse estimator of the covariance matrix for high-dimensional models in which the variables have a known ordering. Our estimator, which is the solution to a convex optimization problem, is equivalently expressed as an estimator which tapers the sample covariance matrix by a Toeplitz, sparsely-banded, data-adaptive matrix. As a result of this adaptivity, the convex banding estimator enjoys theoretical optimality properties not attained by previous banding or tapered estimators. In particular, our convex banding estimator is minimax rate adaptive in Frobenius and operator norms, up to log factors, over commonly-studied classes of covariance matrices, and over more general classes. Furthermore, it correctly recovers the bandwidth when the true covariance is exactly banded. Our convex formulation admits a simple and efficient algorithm. Empirical studies demonstrate its practical effectiveness and illustrate that our exactly-banded estimator works well even when the true covariance matrix is only close to a banded matrix, confirming our theoretical results. Our method compares favorably with all existing methods, in terms of accuracy and speed. We illustrate the practical merits of the convex banding estimator by showing that it can be used to improve the performance of discriminant analysis for classifying sound recordings. PMID:28042189
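For orientation, a fixed-bandwidth banding baseline is easy to state in NumPy; the convex banding estimator of this abstract replaces such a hard cutoff with a data-adaptive Toeplitz taper obtained by solving a convex program, which is not reproduced here. All dimensions and the true covariance are illustrative.

```python
import numpy as np

def band_covariance(S, k):
    """Fixed-bandwidth banding: keep entries within k off-diagonals, zero the
    rest (a Bickel-Levina-style baseline, not the convex banding estimator)."""
    p = S.shape[0]
    mask = np.abs(np.subtract.outer(np.arange(p), np.arange(p))) <= k
    return S * mask

rng = np.random.default_rng(0)
p, n, true_k = 40, 200, 3
# Truth: an exactly banded covariance with bandwidth 3.
Sigma = np.eye(p) + sum(0.4**d * (np.eye(p, k=d) + np.eye(p, k=-d))
                        for d in range(1, true_k + 1))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)

for k in (1, 3, 10):
    err = np.linalg.norm(band_covariance(S, k) - Sigma, "fro")
    print(f"bandwidth {k}: Frobenius error {err:.2f}")
```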
A sparse Ising model with covariates.
Cheng, Jie; Levina, Elizaveta; Wang, Pei; Zhu, Ji
2014-12-01
There has been a lot of work fitting Ising models to multivariate binary data in order to understand the conditional dependency relationships between the variables. However, additional covariates are frequently recorded together with the binary data, and may influence the dependence relationships. Motivated by such a dataset on genomic instability collected from tumor samples of several types, we propose a sparse covariate dependent Ising model to study both the conditional dependency within the binary data and its relationship with the additional covariates. This results in subject-specific Ising models, where the subject's covariates influence the strength of association between the genes. As in all exploratory data analysis, interpretability of results is important, and we use ℓ1 penalties to induce sparsity in the fitted graphs and in the number of selected covariates. Two algorithms to fit the model are proposed and compared on a set of simulated data, and asymptotic results are established. The results on the tumor dataset and their biological significance are discussed in detail.
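A sketch of the pseudo-likelihood backbone, assuming scikit-learn: each binary variable is regressed on the others with an ℓ1 penalty, and an edge is kept where both directions are nonzero. The covariate-dependent extension of the abstract would add covariate-by-neighbor interaction columns to each regression; data and penalty level below are illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, p = 500, 6
# Toy binary data with one true pairwise dependence (variables 0 and 1).
X = rng.integers(0, 2, size=(n, p)).astype(float)
X[:, 1] = np.where(rng.random(n) < 0.8, X[:, 0], 1 - X[:, 0])

# l1-penalized pseudo-likelihood: regress each node on all others.
edges = np.zeros((p, p))
for j in range(p):
    others = np.delete(np.arange(p), j)
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.3)
    clf.fit(X[:, others], X[:, j])
    edges[j, others] = clf.coef_[0]

# AND rule for a symmetric estimated graph.
support = (np.abs(edges) > 1e-6) & (np.abs(edges.T) > 1e-6)
print(np.argwhere(np.triu(support, k=1)))   # recovered edge list
```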
Progress on Nuclear Data Covariances: AFCI-1.2 Covariance Library
Oblozinsky, P.; Mattoon, C.M.; Herman, M.; Mughabghab, S.F.; Pigni, M.T.; Talou, P.; Hale, G.M.; Kahler, A.C.; Kawano, T.; Little, R.C.; Young, P.G.
2009-09-28
Improved neutron cross section covariances were produced for 110 materials including 12 light nuclei (coolants and moderators), 78 structural materials and fission products, and 20 actinides. The improved covariances were organized into the AFCI-1.2 covariance library in 33 energy groups, from 10⁻⁵ eV to 19.6 MeV. BNL contributed improved covariance data for the following materials: ²³Na and ⁵⁵Mn, where more detailed evaluations were done; improvements in the major structural materials ⁵²Cr, ⁵⁶Fe and ⁵⁸Ni; improved estimates for the remaining structural materials and fission products; improved covariances for 14 minor actinides; and estimates of mubar covariances for ²³Na and ⁵⁶Fe. LANL contributed improved covariance data for ²³⁵U and ²³⁹Pu, including prompt neutron fission spectra, and a completely new evaluation for ²⁴⁰Pu. A new R-matrix evaluation for ¹⁶O including mubar covariances is nearing completion. BNL assembled the library and performed basic testing using improved procedures, including inspection of uncertainty and correlation plots for each material. The AFCI-1.2 library was released to ANL and INL in August 2009.
ERIC Educational Resources Information Center
Zeytun, Aysel Sen; Cetinkaya, Bulent; Erbas, Ayhan Kursat
2010-01-01
Various studies suggest that covariational reasoning plays an important role on understanding the fundamental ideas of calculus and modeling dynamic functional events. The purpose of this study was to investigate a group of mathematics teachers' covariational reasoning abilities and predictions about their students. Data were collected through…
“Nonparametric Local Smoothing” is not image registration
2012-01-01
Background Image registration is one of the most important and universally useful computational tasks in biomedical image analysis. A recent article by Xing & Qiu (IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10):2081–2092, 2011) is based on an inappropriately narrow conceptualization of the image registration problem as the task of making two images look alike, which disregards whether the established spatial correspondence is plausible. The authors propose a new algorithm, Nonparametric Local Smoothing (NLS) for image registration, but use image similarities alone as a measure of registration performance, although these measures do not relate reliably to the realism of the correspondence map. Results Using data obtained from its authors, we show experimentally that the method proposed by Xing & Qiu is not an effective registration algorithm. While it optimizes image similarity, it does not compute accurate, interpretable transformations. Even judged by image similarity alone, the proposed method is consistently outperformed by a simple pixel permutation algorithm, which is known by design not to compute valid registrations. Conclusions This study has demonstrated that the NLS algorithm proposed recently for image registration, and published in one of the most respected journals in computer science, is not, in fact, an effective registration method at all. Our results also emphasize the general need to apply registration evaluation criteria that are sensitive to whether correspondences are accurate and mappings between images are physically interpretable. These goals cannot be achieved by simply reporting image similarities. PMID:23116330
A nonparametric Bayesian framework for constructing flexible feature representations.
Austerweil, Joseph L; Griffiths, Thomas L
2013-10-01
Representations are a key explanatory device used by cognitive psychologists to account for human behavior. Understanding the effects of context and experience on the representations people use is essential, because if two people encode the same stimulus using different representations, their response to that stimulus may be different. We present a computational framework that can be used to define models that flexibly construct feature representations (where by a feature we mean a part of the image of an object) for a set of observed objects, based on nonparametric Bayesian statistics. Austerweil and Griffiths (2011) presented an initial model constructed in this framework that captures how the distribution of parts affects the features people use to represent a set of objects. We build on this work in three ways. First, although people use features that can be transformed on each observation (e.g., translate on the retinal image), many existing feature learning models can only recognize features that are not transformed (occur identically each time). Consequently, we extend the initial model to infer features that are invariant over a set of transformations, and learn different structures of dependence between feature transformations. Second, we compare two possible methods for capturing the manner that categorization affects feature representations. Finally, we present a model that learns features incrementally, capturing an effect of the order of object presentation on the features people learn. We conclude by considering the implications and limitations of our empirical and theoretical results.
Probability Machines: Consistent Probability Estimation Using Nonparametric Learning Machines
Malley, J. D.; Kruppa, J.; Dasgupta, A.; Malley, K. G.; Ziegler, A.
2011-01-01
Summary Background Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem. Objectives The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities. Methods Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosis of diabetes in Pima Indians. Results Simulations demonstrate the validity of the method. With the real data application, we show the accuracy and practicality of this approach. We provide sample code from R packages in which the probability estimation is already available. This means that all calculations can be performed using existing software. Conclusions Random forest algorithms as well as nearest neighbor approaches are valid machine learning methods for estimating individual probabilities for binary responses. Freely available implementations are available in R and may be used for applications. PMID:21915433
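The core trick is small enough to sketch: treat the 0/1 response as a regression target and let a consistent nonparametric regression machine estimate P(Y = 1 | X) directly. Here a scikit-learn random forest regressor stands in for the R implementations the authors mention; the data are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Binary outcome treated as a regression target: the fitted regression
# function is then an estimate of the individual probability P(Y = 1 | X).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

prob_machine = RandomForestRegressor(n_estimators=500, min_samples_leaf=10,
                                     random_state=0)
prob_machine.fit(X_tr, y_tr.astype(float))
p_hat = np.clip(prob_machine.predict(X_te), 0.0, 1.0)  # probability estimates

brier = np.mean((p_hat - y_te) ** 2)   # accuracy/calibration summary
print(f"Brier score: {brier:.3f}")
```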
Nonparametric Joint Shape and Feature Priors for Image Segmentation.
Erdil, Ertunc; Ghani, Muhammad Usman; Rada, Lavdie; Argunsah, Ali Ozgur; Unay, Devrim; Tasdizen, Tolga; Cetin, Mujdat
2017-11-01
In many image segmentation problems involving limited and low-quality data, employing statistical prior information about the shapes of the objects to be segmented can significantly improve the segmentation result. However, defining probability densities in the space of shapes is an open and challenging problem, especially if the object to be segmented comes from a shape density involving multiple modes (classes). Existing techniques in the literature estimate the underlying shape distribution by extending Parzen density estimator to the space of shapes. In these methods, the evolving curve may converge to a shape from a wrong mode of the posterior density when the observed intensities provide very little information about the object boundaries. In such scenarios, employing both shape- and class-dependent discriminative feature priors can aid the segmentation process. Such features may involve, e.g., intensity-based, textural, or geometric information about the objects to be segmented. In this paper, we propose a segmentation algorithm that uses nonparametric joint shape and feature priors constructed by Parzen density estimation. We incorporate the learned joint shape and feature prior distribution into a maximum a posteriori estimation framework for segmentation. The resulting optimization problem is solved using active contours. We present experimental results on a variety of synthetic and real data sets from several fields involving multimodal shape densities. Experimental results demonstrate the potential of the proposed method.
Transition redshift: new constraints from parametric and nonparametric methods
Rani, Nisha; Mahajan, Shobhit; Mukherjee, Amitabha; Jain, Deepak; Pires, Nilza
2015-12-01
In this paper, we use the cosmokinematics approach to study the accelerated expansion of the Universe. This is a model independent approach and depends only on the assumption that the Universe is homogeneous and isotropic and is described by the FRW metric. We parametrize the deceleration parameter, q(z), to constrain the transition redshift (z_t) at which the expansion of the Universe goes from a decelerating to an accelerating phase. We use three different parametrizations of q(z), namely q_I(z) = q_1 + q_2 z, q_II(z) = q_3 + q_4 ln(1 + z) and q_III(z) = 1/2 + q_5/(1 + z)^2. A joint analysis of the age of galaxies, strong lensing and supernovae Ia data indicates that the transition redshift is less than unity, i.e. z_t < 1. We also use a nonparametric approach (LOESS+SIMEX) to constrain z_t. This too gives z_t < 1, which is consistent with the value obtained by the parametric approach.
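For the linear and logarithmic parametrizations, the transition redshift follows in closed form from q(z_t) = 0. A sketch with illustrative best-fit values (not the paper's):

```python
import numpy as np

# Linear parametrization q(z) = q1 + q2*z: z_t is where q changes sign.
q1, q2 = -0.6, 0.75          # illustrative values only
z_t = -q1 / q2
print(f"q_I: z_t = {z_t:.2f}")

# Logarithmic parametrization q(z) = q3 + q4*ln(1+z):
q3, q4 = -0.65, 1.1          # illustrative values only
z_t_log = np.expm1(-q3 / q4)  # solves q3 + q4*ln(1+z) = 0
print(f"q_II: z_t = {z_t_log:.2f}")
```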
A Nonparametric Test for Equality of Survival Medians
Rahbar, Mohammad H.; Chen, Zhongxue; Jeon, Sangchoon; Gardiner, Joseph C.; Ning, Jing
2014-01-01
In clinical trials researchers often encounter testing for equality of survival medians across study arms based on censored data. Even though Brookmeyer and Crowley (BC) introduced a method for comparing the medians of several survival distributions, some researchers still misuse procedures that are designed for testing the homogeneity of survival curves. These procedures include the log-rank, Wilcoxon, and the Cox model. This practice leads to inflation of the probability of a type I error, particularly when the underlying assumptions of these procedures are not met. We propose a new nonparametric method for testing the equality of several survival medians based on Kaplan-Meier estimation from randomly right censored data. We derive asymptotic properties of this test statistic. Through simulations we compute and compare the empirical probabilities of type I errors and the power of this new procedure with those of the BC, log-rank and Wilcoxon methods. Our simulation results indicate that the performance of these test procedures depends on the level of censoring and the appropriateness of the underlying assumptions. When the objective is to test homogeneity of survival medians rather than survival curves and the assumptions of these tests are not met, some of these procedures severely inflate the probability of a type I error. In these situations, our test statistic provides an alternative to the BC test. PMID:22302559
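The building block of such a test is the Kaplan-Meier median itself. A self-contained sketch for one arm on synthetic right-censored data (the test statistic would then compare these medians across arms; this is not the authors' test):

```python
import numpy as np

def km_median(time, event):
    """Kaplan-Meier survival median: smallest time with S(t) <= 0.5.
    `event` is 1 for an observed event, 0 for right censoring."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    at_risk = len(time)
    surv = 1.0
    for t, d in zip(time, event):
        if d:
            surv *= 1.0 - 1.0 / at_risk
            if surv <= 0.5:
                return t
        at_risk -= 1
    return np.nan                      # median not reached

rng = np.random.default_rng(0)
t_event = rng.exponential(10.0, size=100)
t_cens = rng.exponential(25.0, size=100)
time = np.minimum(t_event, t_cens)                 # observed follow-up time
event = (t_event <= t_cens).astype(int)            # censoring indicator
print(f"KM median: {km_median(time, event):.2f} (true median {10*np.log(2):.2f})")
```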
Nonparametric reconstruction of the dark energy equation of state
Holsclaw, Tracy; Sanso, Bruno; Lee, Herbert; Alam, Ujjaini; Heitmann, Katrin; Habib, Salman; Higdon, David
2010-11-15
A basic aim of ongoing and upcoming cosmological surveys is to unravel the mystery of dark energy. In the absence of a compelling theory to test, a natural approach is to better characterize the properties of dark energy in search of clues that can lead to a more fundamental understanding. One way to view this characterization is the improved determination of the redshift-dependence of the dark energy equation of state parameter, w(z). To do this requires a robust and bias-free method for reconstructing w(z) from data that does not rely on restrictive expansion schemes or assumed functional forms for w(z). We present a new nonparametric reconstruction method that solves for w(z) as a statistical inverse problem, based on a Gaussian process representation. This method reliably captures nontrivial behavior of w(z) and provides controlled error bounds. We demonstrate the power of the method on different sets of simulated supernova data; the approach can be easily extended to include diverse cosmological probes.
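A simplified sketch of the Gaussian-process ingredient, assuming scikit-learn: GP regression applied directly to noisy synthetic w(z) values. The paper solves a full statistical inverse problem from supernova distance data; this stand-in only illustrates nonparametric reconstruction with controlled error bounds.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
z = np.sort(rng.uniform(0.01, 1.5, 40))
w_true = -1.0 + 0.3 * z / (1 + z)              # mild, nontrivial w(z), illustrative
w_obs = w_true + 0.1 * rng.standard_normal(z.size)

gp = GaussianProcessRegressor(
    kernel=1.0 * RBF(length_scale=0.5) + WhiteKernel(0.01),
    normalize_y=True)
gp.fit(z[:, None], w_obs)

z_grid = np.linspace(0.0, 1.5, 100)
w_mean, w_sd = gp.predict(z_grid[:, None], return_std=True)  # mean + error band
print(f"w(0) = {w_mean[0]:.2f} +/- {w_sd[0]:.2f}")
```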
Bayesian nonparametric clustering in phylogenetics: modeling antigenic evolution in influenza.
Cybis, Gabriela B; Sinsheimer, Janet S; Bedford, Trevor; Rambaut, Andrew; Lemey, Philippe; Suchard, Marc A
2017-01-18
Influenza is responsible for up to 500,000 deaths every year, and antigenic variability represents much of its epidemiological burden. To visualize antigenic differences across many viral strains, antigenic cartography methods use multidimensional scaling on binding assay data to map influenza antigenicity onto a low-dimensional space. Analysis of such assay data ideally leads to natural clustering of influenza strains of similar antigenicity that correlate with sequence evolution. To understand the dynamics of these antigenic groups, we present a framework that jointly models genetic and antigenic evolution by combining multidimensional scaling of binding assay data, Bayesian phylogenetic machinery and nonparametric clustering methods. We propose a phylogenetic Chinese restaurant process that extends the current process to incorporate the phylogenetic dependency structure between strains in the modeling of antigenic clusters. With this method, we are able to use the genetic information to better understand the evolution of antigenicity throughout epidemics, as shown in applications of this model to H1N1 influenza. Copyright © 2017 John Wiley & Sons, Ltd.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests, and the area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while modelling their dependence structure with a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations; it uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula, in turn, is a well-known statistical concept for modelling dependence of random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely maximum likelihood estimation (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
Nonparametric Bayes Classification and Hypothesis Testing on Manifolds
Bhattacharya, Abhishek; Dunson, David
2012-01-01
Our first focus is prediction of a categorical response variable using features that lie on a general manifold. For example, the manifold may correspond to the surface of a hypersphere. We propose a general kernel mixture model for the joint distribution of the response and predictors, with the kernel expressed in product form and dependence induced through the unknown mixing measure. We provide simple sufficient conditions for large support and weak and strong posterior consistency in estimating both the joint distribution of the response and predictors and the conditional distribution of the response. Focusing on a Dirichlet process prior for the mixing measure, these conditions hold using von Mises-Fisher kernels when the manifold is the unit hypersphere. In this case, Bayesian methods are developed for efficient posterior computation using slice sampling. Next we develop Bayesian nonparametric methods for testing whether there is a difference in distributions between groups of observations on the manifold having unknown densities. We prove consistency of the Bayes factor and develop efficient computational methods for its calculation. The proposed classification and testing methods are evaluated using simulation examples and applied to spherical data applications. PMID:22754028
Nonparametric identification of structural modifications in Laplace domain
NASA Astrophysics Data System (ADS)
Suwała, G.; Jankowski, Ł.
2017-02-01
This paper proposes and experimentally verifies a Laplace-domain method for identification of structural modifications, which (1) unlike time-domain formulations, allows the identification to be focused on those parts of the frequency spectrum that have a high signal-to-noise ratio, and (2) unlike frequency-domain formulations, decreases the influence of numerical artifacts related to the particular choice of the FFT exponential window decay. In comparison to the time-domain approach proposed earlier, the advantages of the proposed method are a smaller computational cost and higher accuracy, which leads to reliable performance in more difficult identification cases. Analytical formulas for the first- and second-order sensitivity analysis are derived. The approach is based on a reduced nonparametric model, which has the form of a set of selected structural impulse responses. Such a model can be collected purely experimentally, which obviates the need for design and laborious updating of a parametric model, such as a finite element model. The approach is verified experimentally using a 26-node lab 3D truss structure and 30 identification cases of a single mass modification or two concurrent mass modifications.
Strategies for conditional two-locus nonparametric linkage analysis.
Angquist, Lars; Hössjer, Ola; Groop, Leif
2008-01-01
In this article we deal with two-locus nonparametric linkage (NPL) analysis, mainly in the context of conditional analysis. This means that one incorporates single-locus analysis information through conditioning when performing a two-locus analysis. Here we describe different strategies for using this approach. Cox et al. [Nat Genet 1999;21:213-215] implemented this as follows: (i) Calculate the one-locus NPL process over the included genome region(s). (ii) Weight the individual pedigree NPL scores using a weighting function depending on the NPL scores for the corresponding pedigrees at specific conditioning loci. We generalize this by conditioning with respect to the inheritance vector rather than the NPL score and by separating between the case of known (predefined) and unknown (estimated) conditioning loci. In the latter case we choose the conditioning locus, or loci, according to predefined criteria. The most general approach results in a random number of selected loci, depending on the results from the previous one-locus analysis. Major topics in this article include discussions on optimal score functions with respect to the noncentrality parameter (NCP), and how to calculate adequate p values and perform power calculations. We also discuss issues related to multiple tests which arise from the two-step procedure with several conditioning loci as well as from the genome-wide tests.
Peripheral nerve segmentation using Nonparametric Bayesian Hierarchical Clustering.
Giraldo, Juan J; Álvarez, Mauricio A; Orozco, Álvaro A
2015-01-01
Several cases related to chronic pain, due to accidents, illness or surgical interventions, depend on anesthesiology procedures. These procedures are assisted with ultrasound images. Although ultrasound images are a useful instrument for guiding the specialist in anesthesiology, their lack of intelligibility due to speckle noise makes the clinical intervention a difficult task. Similarly, some artifacts are introduced in the image capturing process, challenging the expertise of anesthesiologists not to confuse them with true nerve structures. Accordingly, an assistance methodology using image processing can improve accuracy in anesthesia practice. This paper proposes a peripheral nerve segmentation method for medical ultrasound images based on Nonparametric Bayesian Hierarchical Clustering. The experimental results show segmentation performances with a mean squared error of 1.026 ± 0.379 pixels for the ulnar nerve, 0.704 ± 0.233 pixels for the median nerve and 1.698 ± 0.564 pixels for the peroneal nerve. Likewise, the model allows emphasizing other soft structures, such as muscles and aqueous tissues, that might be useful for an anesthesiologist.
Iranian rainfall series analysis by means of nonparametric tests
NASA Astrophysics Data System (ADS)
Talaee, P. Hosseinzadeh
2014-05-01
The study of the trends and fluctuations in rainfall has received a great deal of attention, since changes in rainfall patterns may lead to floods or droughts. The objective of this study was to analyze the annual, seasonal, and monthly rainfall time series at seven rain gauge stations in the west of Iran for a 40-year period (from October 1969 to September 2009). The homogeneity of the rainfall data sets at the rain gauge stations was checked by using the cumulative deviations test. Three nonparametric tests, namely Kendall, Spearman, and Mann-Kendall, at the 95 % confidence level were used for the trend analysis and the Theil-Sen estimator was applied for determining the magnitudes of the trends. According to the homogeneity analysis, all of the rainfall series except the September series at Vasaj station were found to be homogeneous. The obtained results showed an insignificant trend in the annual and seasonal rainfall series at the majority of the considered stations. Moreover, only three significant trends were observed at the February rainfall of Aghajanbolaghi station, the November series of Vasaj station, and the March rainfall series of Khomigan station. The findings of this study on the temporal trends of rainfall can be implemented to improve the water resources strategies in the study region.
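The trend machinery used here is available in SciPy: the Mann-Kendall test is equivalent to testing Kendall's tau against time, and the Theil-Sen estimator gives the trend magnitude. A sketch on a synthetic rainfall series (values illustrative, not the Iranian station data):

```python
import numpy as np
from scipy.stats import kendalltau, theilslopes

rng = np.random.default_rng(0)
years = np.arange(1970, 2010)
rainfall = 300 - 0.8 * (years - 1970) + rng.normal(0, 25, size=years.size)  # mm

tau, p_value = kendalltau(years, rainfall)   # Mann-Kendall trend test via Kendall's tau
slope, intercept, lo, hi = theilslopes(rainfall, years)  # Theil-Sen trend magnitude

trend = "significant" if p_value < 0.05 else "insignificant"
print(f"tau={tau:.2f}, p={p_value:.3f} ({trend} at 95%); "
      f"Theil-Sen slope {slope:.2f} mm/yr")
```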
Nonparametric estimation of quantum states, processes and measurements
NASA Astrophysics Data System (ADS)
Lougovski, Pavel; Bennink, Ryan
Quantum state, process, and measurement estimation methods traditionally use parametric models, in which the number and role of relevant parameters is assumed to be known. When such an assumption cannot be justified, a common approach in many disciplines is to fit the experimental data to multiple models with different sets of parameters and utilize an information criterion to select the best fitting model. However, it is not always possible to assume a model with a finite (countable) number of parameters. This typically happens when there are unobserved variables that stem from hidden correlations that can only be unveiled after collecting experimental data. How does one perform quantum characterization in this situation? We present a novel nonparametric method of experimental quantum system characterization based on the Dirichlet Process (DP) that addresses this problem. Using DP as a prior in conjunction with Bayesian estimation methods allows us to increase model complexity (number of parameters) adaptively as the number of experimental observations grows. We illustrate our approach for the one-qubit case and show how a probability density function for an unknown quantum process can be estimated.
Non-parametric and least squares Langley plot methods
NASA Astrophysics Data System (ADS)
Kiedron, P. W.; Michalsky, J. J.
2016-01-01
Langley plots are used to calibrate sun radiometers primarily for the measurement of the aerosol component of the atmosphere that attenuates (scatters and absorbs) incoming direct solar radiation. In principle, the calibration of a sun radiometer is a straightforward application of the Bouguer-Lambert-Beer law V = V_0 e^(-τ·m), where a plot of voltage ln(V) vs. air mass m yields a straight line with intercept ln(V_0). This ln(V_0) subsequently can be used to solve for τ for any measurement of V and calculation of m. This calibration works well on some high mountain sites, but the application of the Langley plot calibration technique is more complicated at other, more interesting, locales. This paper is concerned with ferreting out calibrations at difficult sites and examining and comparing a number of conventional and non-conventional methods for obtaining successful Langley plots. The 11 techniques discussed indicate that both least squares and various non-parametric techniques produce satisfactory calibrations with no significant differences among them when the time series of ln(V_0)'s are smoothed and interpolated with median and mean moving window filters.
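A minimal Langley-plot sketch on synthetic data, comparing the least-squares intercept with one non-parametric alternative (Theil-Sen); both estimate ln(V_0) by extrapolating the line to zero air mass. The optical depth and noise level are illustrative.

```python
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(0)
m = np.linspace(2.0, 6.0, 60)                # air mass over a clear morning
ln_v0_true, tau = 2.3, 0.12
ln_v = ln_v0_true - tau * m + rng.normal(0, 0.01, m.size)  # ln(V) = ln(V0) - tau*m

# Least-squares Langley regression: the intercept at m = 0 estimates ln(V0).
slope_ls, intercept_ls = np.polyfit(m, ln_v, 1)
# Non-parametric alternative: Theil-Sen line through the same points.
slope_ts, intercept_ts, _, _ = theilslopes(ln_v, m)

print(f"least squares: ln(V0)={intercept_ls:.3f}, tau={-slope_ls:.3f}")
print(f"Theil-Sen:     ln(V0)={intercept_ts:.3f}, tau={-slope_ts:.3f}")
```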
Non-parametric reconstruction of cosmological matter perturbations
González, J.E.; Alcaniz, J.S.; Carvalho, J.C. E-mail: alcaniz@on.br
2016-04-01
Perturbative quantities, such as the growth rate (f) and index (γ), are powerful tools to distinguish different dark energy models or modified gravity theories even if they produce the same cosmic expansion history. In this work, without any assumption about the dynamics of the Universe, we apply a non-parametric method to current measurements of the expansion rate H(z) from cosmic chronometers and high-z quasar data and reconstruct the growth factor and rate of linearised density perturbations in the non-relativistic matter component. Assuming realistic values for the matter density parameter Ω_m0, as provided by current CMB experiments, we also reconstruct the evolution of the growth index γ with redshift. We show that the reconstruction of current H(z) data constrains the growth index to γ = 0.56 ± 0.12 (2σ) at z = 0.09, which is in full agreement with the prediction of the ΛCDM model and some of its extensions.
Diffeomorphic demons: efficient non-parametric image registration.
Vercauteren, Tom; Pennec, Xavier; Perchant, Aymeric; Ayache, Nicholas
2009-03-01
We propose an efficient non-parametric diffeomorphic image registration algorithm based on Thirion's demons algorithm. In the first part of this paper, we show that Thirion's demons algorithm can be seen as an optimization procedure on the entire space of displacement fields. We provide strong theoretical roots to the different variants of Thirion's demons algorithm. This analysis predicts a theoretical advantage for the symmetric forces variant of the demons algorithm. We show on controlled experiments that this advantage is confirmed in practice and yields a faster convergence. In the second part of this paper, we adapt the optimization procedure underlying the demons algorithm to a space of diffeomorphic transformations. In contrast to many diffeomorphic registration algorithms, our solution is computationally efficient since in practice it only replaces an addition of displacement fields by a few compositions. Our experiments show that in addition to being diffeomorphic, our algorithm provides results that are similar to the ones from the demons algorithm but with transformations that are much smoother and closer to the gold standard, available in controlled experiments, in terms of Jacobians.
Nonparametric Bayes modeling for case control studies with many predictors.
Zhou, Jing; Herring, Amy H; Bhattacharya, Anirban; Olshan, Andrew F; Dunson, David B
2016-03-01
It is common in biomedical research to run case-control studies involving high-dimensional predictors, with the main goal being detection of the sparse subset of predictors having a significant association with disease. Usual analyses rely on independent screening, considering each predictor one at a time, or in some cases on logistic regression assuming no interactions. We propose a fundamentally different approach based on a nonparametric Bayesian low rank tensor factorization model for the retrospective likelihood. Our model allows a very flexible structure in characterizing the distribution of multivariate variables as unknown and without any linear assumptions as in logistic regression. Predictors are excluded only if they have no impact on disease risk, either directly or through interactions with other predictors. Hence, we obtain an omnibus approach for screening for important predictors. Computation relies on an efficient Gibbs sampler. The methods are shown to have high power and low false discovery rates in simulation studies, and we consider an application to an epidemiology study of birth defects.
Detecting communities in networks using a Bayesian nonparametric method
NASA Astrophysics Data System (ADS)
Hu, Shengze; Wang, Zhenwen
2014-07-01
In the real world, a large number of systems can be described by networks where nodes represent entities and edges the interconnections between them. Community structure in networks is one of the interesting properties revealed in the study of networks. Many methods have been developed to extract communities from networks using generative models, which give the probability of generating networks based on some assumption about the communities. However, many generative models require setting the number of communities in the network. Methods based on such models lack practicality, because the number of communities is unknown before the communities are determined. In this paper, a Bayesian nonparametric method is used to develop a new community detection method. First, a generative model is built to give the probability of generating the network and its communities. Next, the model parameters and the number of communities are calculated by fitting the model to the actual network. Finally, the communities in the network can be determined using the model parameters. In the experiments, we apply the proposed method to synthetic and real-world networks, comparing it with some other community detection methods. The experimental results show that the proposed method is efficient at detecting communities in networks.
Biological Parametric Mapping with Robust and Non-Parametric Statistics
Yang, Xue; Beason-Held, Lori; Resnick, Susan M.; Landman, Bennett A.
2011-01-01
Mapping the quantitative relationship between structure and function in the human brain is an important and challenging problem. Numerous volumetric, surface, regions of interest and voxelwise image processing techniques have been developed to statistically assess potential correlations between imaging and non-imaging metrics. Recently, biological parametric mapping has extended the widely popular statistical parametric mapping approach to enable application of the general linear model to multiple image modalities (both for regressors and regressands) along with scalar valued observations. This approach offers great promise for direct, voxelwise assessment of structural and functional relationships with multiple imaging modalities. However, as presented, the biological parametric mapping approach is not robust to outliers and may lead to invalid inferences (e.g., artifactual low p-values) due to slight mis-registration or variation in anatomy between subjects. To enable widespread application of this approach, we introduce robust regression and non-parametric regression in the neuroimaging context of application of the general linear model. Through simulation and empirical studies, we demonstrate that our robust approach reduces sensitivity to outliers without substantial degradation in power. The robust approach and associated software package provide a reliable way to quantitatively assess voxelwise correlations between structural and functional neuroimaging modalities. PMID:21569856
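The robust-regression ingredient can be sketched with statsmodels: an M-estimator (Huber) fit downweights the outlying observations that distort an ordinary least-squares slope. The data below are synthetic stand-ins for paired structural/functional measurements, not the authors' software package.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
structure = rng.normal(size=n)                    # e.g., gray-matter density at a voxel
function = 0.5 * structure + rng.normal(0, 1, n)  # e.g., functional signal at the voxel
function[:5] += 8                                 # a few mis-registered outlier subjects

X = sm.add_constant(structure)
ols_fit = sm.OLS(function, X).fit()
rlm_fit = sm.RLM(function, X, M=sm.robust.norms.HuberT()).fit()

# The robust fit downweights the outliers that distort the OLS slope.
print(f"OLS slope:   {ols_fit.params[1]:.2f}")
print(f"Huber slope: {rlm_fit.params[1]:.2f}")
```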
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
Covariant Lyapunov vectors for rigid disk systems.
Bosetti, Hadrien; Posch, Harald A
2010-10-05
We carry out extensive computer simulations to study the Lyapunov instability of a two-dimensional hard-disk system in a rectangular box with periodic boundary conditions. The system is large enough to allow the formation of Lyapunov modes parallel to the x-axis of the box. The Oseledec splitting into covariant subspaces of the tangent space is considered by computing the full set of covariant perturbation vectors co-moving with the flow in tangent space. These vectors are shown to be transversal, but generally not orthogonal to each other. Only the angle between covariant vectors associated with immediate adjacent Lyapunov exponents in the Lyapunov spectrum may become small, but the probability of this angle to vanish approaches zero. The stable and unstable manifolds are transverse to each other and the system is hyperbolic.
FAST NEUTRON COVARIANCES FOR EVALUATED DATA FILES.
HERMAN, M.; OBLOZINSKY, P.; ROCHMAN, D.; KAWANO, T.; LEAL, L.
2006-06-05
We describe implementation of the KALMAN code in the EMPIRE system and present first covariance data generated for Gd and Ir isotopes. A complete set of covariances, in the full energy range, was produced for the chain of 8 Gadolinium isotopes for total, elastic, capture, total inelastic (MT=4), (n,2n), (n,p) and (n,alpha) reactions. Our correlation matrices, based on combination of model calculations and experimental data, are characterized by positive mid-range and negative long-range correlations. They differ from the model-generated covariances that tend to show strong positive long-range correlations and those determined solely from experimental data that result in nearly diagonal matrices. We have studied shapes of correlation matrices obtained in the calculations and interpreted them in terms of the underlying reaction models. An important result of this study is the prediction of narrow energy ranges with extremely small uncertainties for certain reactions (e.g., total and elastic).
Gram-Schmidt algorithms for covariance propagation
NASA Technical Reports Server (NTRS)
Thornton, C. L.; Bierman, G. J.
1977-01-01
This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root-free factorization, P = UD(transpose of U), where U is unit upper triangular and D is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and coloured process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.
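For context, the U-D factorization referenced above writes P = U D U^T with U unit upper triangular and D diagonal. The numpy sketch below shows only the factorization step, not Bierman's Gram-Schmidt time update; the function name and test matrix are ours.

```python
import numpy as np

def ud_factorize(P):
    """Factor a symmetric positive-definite P as P = U @ np.diag(d) @ U.T,
    with U unit upper triangular and d the diagonal of D."""
    P = P.astype(float).copy()
    n = P.shape[0]
    U = np.eye(n)
    d = np.zeros(n)
    for j in range(n - 1, -1, -1):
        d[j] = P[j, j]
        if d[j] > 0.0:
            U[:j, j] = P[:j, j] / d[j]
        # Deflate the leading (j x j) block: P <- P - d[j] * u u^T
        for i in range(j):
            for k in range(i + 1):
                P[k, i] -= U[k, j] * d[j] * U[i, j]
    return U, d

P = np.array([[4.0, 2.0], [2.0, 3.0]])
U, d = ud_factorize(P)
assert np.allclose(U @ np.diag(d) @ U.T, P)
```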
Covariance for Cone and Wedge Complete Filling
NASA Astrophysics Data System (ADS)
Rascón, C.; Parry, A. O.
2005-03-01
Interfacial phenomena associated with fluid adsorption in two-dimensional systems have recently been shown to exhibit hidden symmetries, or covariances, which precisely relate local adsorption properties in different confining geometries. We show that covariance also occurs in three-dimensional systems and is likely to be verifiable experimentally and in Ising model simulation studies. Specifically, we study complete wetting in wedge (W) and cone (C) geometries as bulk coexistence is approached and show that the equilibrium midpoint heights satisfy l_C(h, α) = l_W(h/2, α), where h measures the partial pressure and α is the tilt angle. This covariance is valid for both short-ranged and long-ranged intermolecular forces and identifies both leading and next-to-leading-order critical exponents and amplitudes in the confining geometries.
NASA Astrophysics Data System (ADS)
Li, X. Y.; Law, S. S.
2010-05-01
A new matrix of the covariance of covariance is formed from the auto/cross-correlation functions of the acceleration responses of a structure under white-noise ambient excitation. The components of the covariance matrix are proved to be functions of the modal parameters (modal frequency, mode shape, and damping parameter) of the structure. Information from all the vibration modes of the structure, limited only by the sampling frequency, contributes to these components. The formulated covariance matrix therefore contains information on the vibration modes of the structure that cannot be obtained by general modal parameter extraction methods. When components of the covariance matrix are used for damage detection, they are found to be more sensitive to local stiffness reduction than the first few modal frequencies and mode shapes obtained from ambient excitation. A simply supported 31-bar plane truss structure is studied numerically, and a multiple-damage scenario with different noise levels is identified with satisfactory results.
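One simplified reading of the construction, sketched below under our own assumptions (roughly equal-length segments longer than the maximum lag, zero-mean ambient response), is to estimate auto/cross-correlation functions on successive segments of the acceleration records and then form the covariance of those estimates across segments.

```python
import numpy as np

def covariance_of_covariance(acc, n_seg=20, max_lag=50):
    """acc: (n_samples, n_channels) acceleration records under ambient excitation.
    Returns the covariance matrix of segment-wise auto/cross-correlation estimates.
    Assumes each segment has more than max_lag samples."""
    segs = np.array_split(acc, n_seg, axis=0)
    feats = []
    for s in segs:
        s = s - s.mean(axis=0)
        # auto/cross-correlations R_ij(tau) for tau = 0 .. max_lag-1
        r = [np.correlate(s[:, i], s[:, j], mode="full")[len(s) - 1:len(s) - 1 + max_lag]
             for i in range(s.shape[1]) for j in range(i, s.shape[1])]
        feats.append(np.concatenate(r) / len(s))
    return np.cov(np.array(feats), rowvar=False)
```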
NASA Astrophysics Data System (ADS)
Hui, Yi; Law, Siu Seong; Ku, Chiu Jen
2017-02-01
The covariance of the auto/cross-covariance matrix method is studied for the damage identification of a structure, with illustrations of its advantages and limitations. The original method is extended to structures under direct white-noise excitation. The auto/cross-covariance functions of the measured acceleration and their corresponding derivatives are formulated analytically, and the method is modified with two new strategies to enable successful identification with far fewer sensors. Numerical examples illustrate the improved method, and the effects of sampling frequency and sampling duration are discussed. Results show that the covariance of covariance calculated from the responses of the higher-order modes of a structure plays an important role in the accurate identification of local damage.
GRAVSAT/GEOPAUSE covariance analysis including geopotential aliasing
NASA Technical Reports Server (NTRS)
Koch, D. W.
1975-01-01
A conventional covariance analysis for the GRAVSAT/GEOPAUSE mission is described in which the uncertainties of approximately 200 parameters, including the geopotential coefficients to degree and order 12, are estimated over three different tracking intervals. The estimated orbital uncertainties for both GRAVSAT and GEOPAUSE reach levels more accurate than presently available. The adjusted measurement bias errors approach the mission goal. Survey errors in the low centimeter range are achieved after ten days of tracking. The mission is clearly shown to determine geopotential terms to (12, 12) with accuracies one to two orders of magnitude better than present levels. A unique feature of this report is that the aliasing structure of this (12, 12) field is examined. It is shown that uncertainties for unadjusted terms to (12, 12) still exert a degrading effect upon the adjusted error of an arbitrarily selected term of lower degree and order. Finally, the distribution of the aliasing from the unestimated uncertainty of a particular high degree and order geopotential term upon the errors of all remaining adjusted terms is listed in detail.
SCALE-6 Sensitivity/Uncertainty Methods and Covariance Data
Williams, Mark L; Rearden, Bradley T
2008-01-01
Computational methods and data used for sensitivity and uncertainty analysis within the SCALE nuclear analysis code system are presented. The methodology used to calculate sensitivity coefficients and similarity coefficients and to perform nuclear data adjustment is discussed. A description is provided of the SCALE-6 covariance library based on ENDF/B-VII and other nuclear data evaluations, supplemented by 'low-fidelity' approximate covariances. SCALE (Standardized Computer Analyses for Licensing Evaluation) is a modular code system developed by Oak Ridge National Laboratory (ORNL) to perform calculations for criticality safety, reactor physics, and radiation shielding applications. SCALE calculations typically use sequences that execute a predefined series of executable modules to compute particle fluxes and responses like the critical multiplication factor. SCALE also includes modules for sensitivity and uncertainty (S/U) analysis of calculated responses. The S/U codes in SCALE are collectively referred to as TSUNAMI (Tools for Sensitivity and UNcertainty Analysis Methodology Implementation). SCALE-6, scheduled for release in 2008, contains significant new capabilities, including important enhancements in S/U methods and data. The main functions of TSUNAMI are to (a) compute nuclear data sensitivity coefficients and response uncertainties, (b) establish similarity between benchmark experiments and design applications, and (c) reduce uncertainty in calculated responses by consolidating integral benchmark experiments. TSUNAMI includes easy-to-use graphical user interfaces for defining problem input and viewing three-dimensional (3D) geometries, as well as an integrated plotting package.
A Covariance Generation Methodology for Fission Product Yields
NASA Astrophysics Data System (ADS)
Terranova, N.; Serot, O.; Archier, P.; Vallet, V.; De Saint Jean, C.; Sumini, M.
2016-03-01
Recent safety and economic concerns for modern nuclear reactor applications have generated strong interest in improving and completing basic nuclear data evaluations. It has become clear that the accuracy of our predictive simulation models is strongly affected by our knowledge of input data. Therefore strong efforts have been made to improve nuclear data and to generate complete and reliable uncertainty information able to yield proper uncertainty propagation on integral reactor parameters. Since modern nuclear data banks (such as JEFF-3.1.1 and ENDF/B-VII.1) give no correlations for fission yields, in the present work we propose a covariance generation methodology for fission product yields. The main goal is to reproduce the existing European library and to add covariance information to allow proper uncertainty propagation in depletion and decay heat calculations. To do so, we adopted the Generalized Least Square Method (GLSM) implemented in CONRAD (COde for Nuclear Reaction Analysis and Data assimilation), developed at CEA-Cadarache. Theoretical values employed in the Bayesian parameter adjustment are delivered by a convolution of different models, representing several quantities in fission yield calculations: the Brosa fission modes for pre-neutron mass distribution, a simplified Gaussian model for prompt neutron emission probability, the Wahl systematics for charge distribution, and the Madland-England model for the isomeric ratio. Some results are presented for the thermal fission of U-235, Pu-239 and Pu-241.
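The GLS adjustment step referred to here has a standard closed form, sketched below in numpy with our own variable names; it is a generic Bayesian Generalized Least Squares update, not the CONRAD implementation.

```python
import numpy as np

def gls_update(x_prior, C_prior, y, V, A):
    """One Generalized Least Squares (Bayesian) parameter adjustment.
    x_prior, C_prior: prior parameter values and covariance;
    y, V: experimental values and their covariance;
    A: sensitivity matrix of the modelled observables to the parameters."""
    S = A @ C_prior @ A.T + V                  # covariance of the residual y - A x
    K = np.linalg.solve(S, A @ C_prior).T      # gain K = C A^T S^{-1}
    x_post = x_prior + K @ (y - A @ x_prior)
    C_post = C_prior - K @ A @ C_prior         # posterior covariance, with correlations
    return x_post, C_post
```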
Covariance based outlier detection with feature selection.
Zwilling, Chris E; Wang, Michelle Y
2016-08-01
The present covariance-based outlier detection algorithm selects, from a candidate set of feature vectors, those that are best at identifying outliers. Features extracted from biomedical and health informatics data can be highly informative in disease assessment, and there are no restrictions on the nature and number of features that can be tested. An important challenge for an algorithm operating on a set of features, however, is to winnow the effective features from the ineffective ones. The algorithm described in this paper leverages covariance information from the time series data to identify the features with the highest sensitivity for outlier identification. Empirical results demonstrate the efficacy of the method.
Covariance Analysis of Gamma Ray Spectra
Trainham, R.; Tinsley, J.
2013-01-01
The covariance method exploits fluctuations in signals to recover information encoded in correlations which are usually lost when signal averaging occurs. In nuclear spectroscopy it can be regarded as a generalization of the coincidence technique. The method can be used to extract signal from uncorrelated noise, to separate overlapping spectral peaks, to identify escape peaks, to reconstruct spectra from Compton continua, and to generate secondary spectral fingerprints. We discuss a few statistical considerations of the covariance method and present experimental examples of its use in gamma spectroscopy.
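Concretely, a covariance map is accumulated over repeated acquisitions as cov(x_i, x_j) = <x_i x_j> - <x_i><x_j>, with i and j indexing spectral channels; off-diagonal structure then flags correlated features such as coincident or escape peaks. A minimal numpy sketch with invented synthetic data, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
n_shots, n_chan = 5000, 64
spectra = rng.poisson(5.0, size=(n_shots, n_chan)).astype(float)
# Inject a correlated pair: counts in channels 10 and 40 fluctuate together,
# mimicking two gamma lines from the same decay (a generalized coincidence).
common = rng.poisson(3.0, size=n_shots)
spectra[:, 10] += common
spectra[:, 40] += common

C = np.cov(spectra, rowvar=False)   # covariance map: <x_i x_j> - <x_i><x_j>
print(C[10, 40])                    # large off-diagonal element flags the coincidence
```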
Parametric number covariance in quantum chaotic spectra.
Vinayak; Kumar, Sandeep; Pandey, Akhilesh
2016-03-01
We study spectral parametric correlations in quantum chaotic systems and introduce the number covariance as a measure of such correlations. We derive analytic results for the classical random matrix ensembles using the binary correlation method and obtain compact expressions for the covariance. We illustrate the universality of this measure by presenting the spectral analysis of the quantum kicked rotors for the time-reversal invariant and time-reversal noninvariant cases. A local version of the parametric number variance introduced earlier is also investigated.
Covariant version of Verlinde's emergent gravity
NASA Astrophysics Data System (ADS)
Hossenfelder, Sabine
2017-06-01
A generally covariant version of Erik Verlinde's emergent gravity model is proposed. The Lagrangian constructed here allows an improved interpretation of the underlying mechanism. It suggests that de Sitter space is filled with a vector field that couples to baryonic matter and, by dragging on it, creates an effect similar to dark matter. We solve the covariant equation of motion in the background of a Schwarzschild space-time and obtain correction terms to the noncovariant expression. Furthermore, we demonstrate that the vector field can also mimic dark energy.
Incorporating covariates into integrated factor analysis of multi-view data.
Li, Gen; Jung, Sungkyu
2017-04-13
In modern biomedical research, it is ubiquitous to have multiple data sets measured on the same set of samples from different views (i.e., multi-view data). For example, in genetic studies, multiple genomic data sets at different molecular levels or from different cell types are measured for a common set of individuals to investigate genetic regulation. Integration and reduction of multi-view data have the potential to leverage information in different data sets, and to reduce the magnitude and complexity of data for further statistical analysis and interpretation. In this article, we develop a novel statistical model, called supervised integrated factor analysis (SIFA), for integrative dimension reduction of multi-view data while incorporating auxiliary covariates. The model decomposes data into joint and individual factors, capturing the joint variation across multiple data sets and the individual variation specific to each set, respectively. Moreover, both joint and individual factors are partially informed by auxiliary covariates via nonparametric models. We devise a computationally efficient Expectation-Maximization (EM) algorithm to fit the model under some identifiability conditions. We apply the method to the Genotype-Tissue Expression (GTEx) data, and provide new insights into the variation decomposition of gene expression in multiple tissues. Extensive simulation studies and an additional application to a pediatric growth study demonstrate the advantage of the proposed method over competing methods.
Impacts of data covariances on the calculated breeding ratio for CRBRP
Liaw, J.R.; Collins, P.J.; Henryson, H. II; Shenter, R.E.
1983-01-01
In order to establish confidence in the data adjustment methodology as applied to LMFBR design, and to estimate the importance of data correlations in that respect, an investigation was initiated on the impacts of data covariances on the calculated reactor performance parameters. This paper summarizes the results and findings of such an effort specifically related to the calculation of the breeding ratio for CRBRP as an illustration. Thirty-nine integral parameters and their covariances, including k-eff and various capture and fission reaction rate ratios, from the ZEBRA-8 series and four ZPR physics benchmark assemblies were used in the least-squares fitting processes. Multigroup differential data and the sensitivity coefficients of those 39 integral parameters were generated by standard 2-D diffusion theory neutronic calculational modules at ANL. Three differential data covariance libraries, all based on ENDF/B-V evaluations, were tested in this study.
McCandless, Lawrence C; Gustafson, Paul; Austin, Peter C; Levy, Adrian R
2009-09-10
Regression adjustment for the propensity score is a statistical method that reduces confounding from measured variables in observational data. A Bayesian propensity score analysis extends this idea by using simultaneous estimation of the propensity scores and the treatment effect. In this article, we conduct an empirical investigation of the performance of Bayesian propensity scores in the context of an observational study of the effectiveness of beta-blocker therapy in heart failure patients. We study the balancing properties of the estimated propensity scores. Traditional Frequentist propensity scores focus attention on balancing covariates that are strongly associated with treatment. In contrast, we demonstrate that Bayesian propensity scores can be used to balance the association between covariates and the outcome. This balancing property has the effect of reducing confounding bias because it reduces the degree to which covariates are outcome risk factors.
Testing interaction between treatment and high-dimensional covariates in randomized clinical trials.
Callegaro, Andrea; Spiessens, Bart; Dizier, Benjamin; Montoya, Fernando U; van Houwelingen, Hans C
2016-10-20
In this paper, we considered different methods to test the interaction between treatment and a potentially large number (p) of covariates in randomized clinical trials. The simplest approach was to fit univariate (marginal) models and to combine the univariate statistics or p-values (e.g., minimum p-value). Another possibility was to reduce the dimension of the covariates using principal components (PCs) and to test the interaction between treatment and PCs. Finally, we considered the Goeman global test applied to the high-dimensional interaction matrix, adjusted for the main (treatment and covariate) effects. These tests can be used in personalized medicine to assess whether a large set of biomarkers can identify a subset of patients who may be more responsive to treatment. We evaluated the performance of these methods on simulated data and applied them to data from two early-phase oncology clinical trials.
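The first (marginal) approach lends itself to a very short sketch: fit one interaction model per covariate and combine the p-values through the minimum with a Bonferroni correction. This is a simplified stand-in, assuming statsmodels and a continuous outcome, rather than the exact procedures the paper studies.

```python
import numpy as np
import statsmodels.api as sm

def min_p_interaction(y, treat, X):
    """Marginal treatment-by-covariate interaction tests, Bonferroni min-p.
    y: outcome, treat: 0/1 treatment indicator, X: (n, p) covariate matrix."""
    pvals = []
    for j in range(X.shape[1]):
        design = sm.add_constant(np.column_stack([treat, X[:, j], treat * X[:, j]]))
        fit = sm.OLS(y, design).fit()
        pvals.append(fit.pvalues[-1])       # p-value of the interaction term
    return min(1.0, min(pvals) * X.shape[1])  # Bonferroni-adjusted minimum p-value
```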
ERIC Educational Resources Information Center
Samejima, Fumiko
1998-01-01
Introduces and discusses the rationale and procedures of two nonparametric approaches to estimating the operating characteristic of a discrete item response, or the conditional probability, given the latent trait, that the examinee's response be that specific response. (SLD)
Brier, Matthew R.; Mitra, Anish; McCarthy, John E.; Ances, Beau M.; Snyder, Abraham Z.
2015-01-01
Functional connectivity refers to shared signals among brain regions and is typically assessed in a task free state. Functional connectivity commonly is quantified between signal pairs using Pearson correlation. However, resting-state fMRI is a multivariate process exhibiting a complicated covariance structure. Partial covariance assesses the unique variance shared between two brain regions excluding any widely shared variance, hence is appropriate for the analysis of multivariate fMRI datasets. However, calculation of partial covariance requires inversion of the covariance matrix, which, in most functional connectivity studies, is not invertible owing to rank deficiency. Here we apply Ledoit-Wolf shrinkage (L2 regularization) to invert the high dimensional BOLD covariance matrix. We investigate the network organization and brain-state dependence of partial covariance-based functional connectivity. Although resting-state networks (RSNs) are conventionally defined in terms of shared variance, removal of widely shared variance, surprisingly, improved the separation of RSNs in a spring embedded graphical model. This result suggests that pair-wise unique shared variance plays a heretofore unrecognized role in RSN covariance organization. In addition, application of partial correlation to fMRI data acquired in the eyes open vs. eyes closed states revealed focal changes in uniquely shared variance between the thalamus and visual cortices. This result suggests that partial correlation of resting state BOLD time series reflects functional processes in addition to structural connectivity. PMID:26208872
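The shrink-then-invert step has a compact expression if one assumes scikit-learn's Ledoit-Wolf estimator (a generic sketch, not the authors' pipeline): fit the shrunk covariance to region-wise BOLD time series, take the regularized precision matrix, and rescale it to partial correlations.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def partial_correlation(ts):
    """ts: (n_timepoints, n_regions) BOLD time series.
    Returns the partial correlation matrix from a Ledoit-Wolf shrunk covariance."""
    prec = LedoitWolf().fit(ts).precision_   # regularized inverse covariance
    d = np.sqrt(np.diag(prec))
    pcorr = -prec / np.outer(d, d)           # standardize and negate off-diagonals
    np.fill_diagonal(pcorr, 1.0)
    return pcorr
```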
Nonparametric analysis of Minnesota spruce and aspen tree data and LANDSAT data
NASA Technical Reports Server (NTRS)
Scott, D. W.; Jee, R.
1984-01-01
The application of nonparametric methods in data-intensive problems faced by NASA is described. The theoretical development of efficient multivariate density estimators and the novel use of color graphics workstations are reviewed. The use of nonparametric density estimates for data representation and for Bayesian classification are described and illustrated. Progress in building a data analysis system in a workstation environment is reviewed and preliminary runs presented.
Fast Nonparametric Machine Learning Algorithms for High-Dimensional Massive Data and Applications
2006-03-01
Ting Liu, CMU-CS-06-124, School of Computer Science, Carnegie Mellon University, March 2006.
Jung, S H; Su, J Q
1995-02-15
We propose a non-parametric method to calculate a confidence interval for the difference or ratio of two median failure times for paired observations with censoring. The new method is simple to calculate, does not involve non-parametric density estimates, and is valid asymptotically even when the two underlying distribution functions differ in shape. The method also allows missing observations. We report numerical studies to examine the performance of the new method for practical sample sizes.
A non-parametric model for the cosmic velocity field
NASA Astrophysics Data System (ADS)
Branchini, E.; Teodoro, L.; Frenk, C. S.; Schmoldt, I.; Efstathiou, G.; White, S. D. M.; Saunders, W.; Sutherland, W.; Rowan-Robinson, M.; Keeble, O.; Tadros, H.; Maddox, S.; Oliver, S.
1999-09-01
We present a self-consistent non-parametric model of the local cosmic velocity field derived from the distribution of IRAS galaxies in the PSCz redshift survey. The survey has been analysed using two independent methods, both based on the assumptions of gravitational instability and linear biasing. The two methods, which give very similar results, have been tested and calibrated on mock PSCz catalogues constructed from cosmological N-body simulations. The denser sampling provided by the PSCz survey compared with previous IRAS galaxy surveys allows an improved reconstruction of the density and velocity fields out to large distances. The most striking feature of the model velocity field is a coherent large-scale streaming motion along the baseline connecting Perseus-Pisces, the Local Supercluster, the Great Attractor and the Shapley Concentration. We find no evidence for back-infall on to the Great Attractor. Instead, material behind and around the Great Attractor is inferred to be streaming towards the Shapley Concentration, aided by the compressional push of two large nearby underdensities. The PSCz model velocities compare well with those predicted from the 1.2-Jy redshift survey of IRAS galaxies and, perhaps surprisingly, with those predicted from the distribution of Abell/ACO clusters, out to 140 h^-1 Mpc. Comparison of the real-space density fields (or, alternatively, the peculiar velocity fields) inferred from the PSCz and cluster catalogues gives a relative (linear) bias parameter between clusters and IRAS galaxies of b_c = 4.4 +/- 0.6. Finally, we implement a likelihood analysis that uses all the available information on peculiar velocities in our local Universe to estimate beta = Omega_0^0.6 / b = 0.6 (+0.22, -0.15) (1 sigma), where b is the bias parameter for IRAS galaxies.
A robust nonparametric method for quantifying undetected extinctions.
Chisholm, Ryan A; Giam, Xingli; Sadanandan, Keren R; Fung, Tak; Rheindt, Frank E
2016-06-01
How many species have gone extinct in modern times before being described by science? To answer this question, and thereby get a full assessment of humanity's impact on biodiversity, statistical methods that quantify undetected extinctions are required. Such methods have been developed recently, but they are limited by their reliance on parametric assumptions; specifically, they assume the pools of extant and undetected species decay exponentially, whereas real detection rates vary temporally with survey effort and real extinction rates vary with the waxing and waning of threatening processes. We devised a new, nonparametric method for estimating undetected extinctions. As inputs, the method requires only the first and last date at which each species in an ensemble was recorded. As outputs, the method provides estimates of the proportion of species that have gone extinct, detected, or undetected and, in the special case where the number of undetected extant species in the present day is assumed close to zero, of the absolute number of undetected extinct species. The main assumption of the method is that the per-species extinction rate is independent of whether a species has been detected or not. We applied the method to the resident native bird fauna of Singapore. Of 195 recorded species, 58 (29.7%) have gone extinct in the last 200 years. Our method projected that an additional 9.6 species (95% CI 3.4, 19.8) have gone extinct without first being recorded, implying a true extinction rate of 33.0% (95% CI 31.0%, 36.2%). We provide R code for implementing our method. Because our method does not depend on strong assumptions, we expect it to be broadly useful for quantifying undetected extinctions.
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2008-01-01
Sustained increases in energy prices have focused attention on gas resources in low-permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are often large. Planning and development decisions for extraction of such resources must be areawide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm, the decision to enter such plays depends on reconnaissance-level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional-scale cost functions. The context of the worked example is the Devonian Antrim-shale gas play in the Michigan basin. One finding relates to selection of the resource prediction model to be used with economic models. Models chosen because they can best predict aggregate volume over larger areas (many hundreds of sites) smooth out granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined arbitrarily by extraneous factors. The analysis shows a 15-20% gain in gas volume when these simple models are applied to order drilling prospects strategically rather than to choose drilling locations randomly. Copyright © 2008 Society of Petroleum Engineers.
Economic decision making and the application of nonparametric prediction models
Attanasi, E.D.; Coburn, T.C.; Freeman, P.A.
2007-01-01
Sustained increases in energy prices have focused attention on gas resources in low permeability shale or in coals that were previously considered economically marginal. Daily well deliverability is often relatively small, although the estimates of the total volumes of recoverable resources in these settings are large. Planning and development decisions for extraction of such resources must be area-wide because profitable extraction requires optimization of scale economies to minimize costs and reduce risk. For an individual firm the decision to enter such plays depends on reconnaissance level estimates of regional recoverable resources and on cost estimates to develop untested areas. This paper shows how simple nonparametric local regression models, used to predict technically recoverable resources at untested sites, can be combined with economic models to compute regional scale cost functions. The context of the worked example is the Devonian Antrim shale gas play, Michigan Basin. One finding relates to selection of the resource prediction model to be used with economic models. Models which can best predict aggregate volume over larger areas (many hundreds of sites) may lose granularity in the distribution of predicted volumes at individual sites. This loss of detail affects the representation of economic cost functions and may affect economic decisions. Second, because some analysts consider unconventional resources to be ubiquitous, the selection and order of specific drilling sites may, in practice, be determined by extraneous factors. The paper also shows that when these simple prediction models are used to strategically order drilling prospects, the gain in gas volume over volumes associated with simple random site selection amounts to 15 to 20 percent. It also discusses why the observed benefit of updating predictions from results of new drilling, as opposed to following static predictions, is somewhat smaller. Copyright 2007, Society of Petroleum Engineers.
Nonparametric Bayesian inference of the microcanonical stochastic block model
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
2017-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models and then infer their parameters from data. When the desired structure is composed of modules or "communities," a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints, i.e., the generated networks are not allowed to violate the patterns imposed by the model. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: (1) deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, which not only remove limitations that seriously degrade the inference on large networks but also reveal structures at multiple scales; (2) a very efficient inference algorithm that scales well not only for networks with a large number of nodes and edges but also with an unlimited number of modules. We show also how this approach can be used to sample modular hierarchies from the posterior distribution, as well as to perform model selection. We discuss and analyze the differences between sampling from the posterior and simply finding the single parameter estimate that maximizes it. Furthermore, we expose a direct equivalence between our microcanonical approach and alternative derivations based on the canonical SBM.
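This method is implemented in the author's graph-tool library; assuming a recent graph-tool release (exact signatures vary across versions), a nested-SBM fit takes a few lines:

```python
import graph_tool.all as gt

g = gt.collection.data["football"]            # example network shipped with graph-tool
state = gt.minimize_nested_blockmodel_dl(g)   # nonparametric nested SBM inference
state.print_summary()                         # group counts at each hierarchy level
```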
Akhtar, Naveed; Mian, Ajmal
2017-10-03
We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size, the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
kdetrees: non-parametric estimation of phylogenetic tree distributions
Weyenberg, Grady; Huggins, Peter M.; Schardl, Christopher L.; Howe, Daniel K.; Yoshida, Ruriko
2014-01-01
Motivation: Although the majority of gene histories found in a clade of organisms are expected to be generated by a common process (e.g. the coalescent process), it is well known that numerous other coexisting processes (e.g. horizontal gene transfers, gene duplication and subsequent neofunctionalization) will cause some genes to exhibit a history distinct from those of the majority of genes. Such ‘outlying’ gene trees are considered to be biologically interesting, and identifying these genes has become an important problem in phylogenetics. Results: We propose and implement kdetrees, a non-parametric method for estimating distributions of phylogenetic trees, with the goal of identifying trees that are significantly different from the rest of the trees in the sample. Our method compares favorably with a similar recently published method, featuring an improvement of one polynomial order of computational complexity (to quadratic in the number of trees analyzed), with simulation studies suggesting only a small penalty to classification accuracy. Application of kdetrees to a set of Apicomplexa genes identified several unreliable sequence alignments that had escaped previous detection, as well as a gene independently reported as a possible case of horizontal gene transfer. We also analyze a set of Epichloë genes, fungi symbiotic with grasses, successfully identifying a contrived instance of paralogy. Availability and implementation: Our method for estimating tree distributions and identifying outlying trees is implemented as the R package kdetrees and is available for download from CRAN. Contact: ruriko.yoshida@uky.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:24764459
Optimizing performance of nonparametric species richness estimators under constrained sampling.
Rajakaruna, Harshana; Drake, D Andrew R; T Chan, Farrah; Bailey, Sarah A
2016-10-01
Understanding the functional relationship between the sample size and the performance of species richness estimators is necessary to optimize limited sampling resources against estimation error. Nonparametric estimators such as Chao and Jackknife demonstrate strong performance, but consensus is lacking as to which estimator performs better under constrained sampling. We explore a method to improve the estimators under such a scenario. The method we propose involves randomly splitting species-abundance data from a single sample into two equally sized samples, and using an appropriate incidence-based estimator to estimate richness. To test this method, we assume a lognormal species-abundance distribution (SAD) with varying coefficients of variation (CV), generate samples using MCMC simulations, and use the expected mean-squared error as the performance criterion of the estimators. We test this method for the Chao, Jackknife, ICE, and ACE estimators. Between abundance-based estimators applied to the single sample and incidence-based estimators applied to the split-in-two samples, Chao2 performed best when CV < 0.65, and the incidence-based Jackknife performed best when CV > 0.65, given that the ratio of sample size to observed species richness is greater than a critical value given by a power function of CV with respect to the abundance of the sampled population. The proposed method increases the performance of the estimators substantially and is more effective when more rare species are in an assemblage. We also show that the splitting method works qualitatively similarly well when the SADs are log series, geometric series, and negative binomial. We demonstrate an application of the proposed method by estimating the richness of zooplankton communities in samples of ballast water. The proposed splitting method is an alternative to sampling a large number of individuals to increase the accuracy of richness estimations; therefore, it is appropriate for a wide range of resource
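For reference, the Chao estimators named above have simple closed forms; the sketch below uses the bias-corrected variants (confidence intervals omitted, function names ours).

```python
import numpy as np

def chao1(abundances):
    """Chao1 from species abundance counts in a single sample (bias-corrected)."""
    f1 = np.sum(abundances == 1)          # singletons
    f2 = np.sum(abundances == 2)          # doubletons
    s_obs = np.sum(abundances > 0)
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))

def chao2(incidence):
    """Chao2 from a (n_samples, n_species) presence/absence matrix (bias-corrected)."""
    counts = incidence.sum(axis=0)
    q1 = np.sum(counts == 1)              # species seen in exactly one sample
    q2 = np.sum(counts == 2)
    s_obs = np.sum(counts > 0)
    return s_obs + q1 * (q1 - 1) / (2 * (q2 + 1))
```

In the splitting method described above, the single abundance sample would be divided at random into two halves and passed to chao2 as two incidence rows.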
Observed Score Linear Equating with Covariates
ERIC Educational Resources Information Center
Branberg, Kenny; Wiberg, Marie
2011-01-01
This paper examined observed score linear equating in two different data collection designs, the equivalent groups design and the nonequivalent groups design, when information from covariates (i.e., background variables correlated with the test scores) was included. The main purpose of the study was to examine the effect (i.e., bias, variance, and…
Covariant formulation of pion-nucleon scattering
NASA Astrophysics Data System (ADS)
Lahiff, A. D.; Afnan, I. R.
A covariant model of elastic pion-nucleon scattering based on the Bethe-Salpeter equation is presented. The kernel consists of s- and u-channel nucleon and delta poles, along with rho and sigma exchange in the t-channel. A good fit is obtained to the s- and p-wave phase shifts up to the two-pion production threshold.
Scale covariant gravitation. V. Kinetic theory
Hsieh, S.; Canuto, V.M.
1981-09-01
In this paper we construct a scale covariant kinetic theory for particles and photons. The mathematical framework of the theory is given by the tangent bundle of a Weyl manifold. The Liouville equation is then derived. Solutions corresponding to equilibrium distributions are presented and shown to yield thermodynamic results identical to the ones obtained previously.
Analysis of Covariance: A Proposed Algorithm.
ERIC Educational Resources Information Center
Frigon, Jean-Yves; Laurencelle, Louis
1993-01-01
The statistical power of analysis of covariance (ANCOVA) and its advantages over simple analysis of variance are examined in some experimental situations, and an algorithm is proposed for its proper application. In nonrandomized experiments, an ANCOVA is generally not a good approach. (SLD)
Economical phase-covariant cloning of qudits
Buscemi, Francesco; D'Ariano, Giacomo Mauro; Macchiavello, Chiara
2005-04-01
We derive the optimal N → M phase-covariant quantum cloning for equatorial states in dimension d with M = kd + N, k integer. The cloning maps are optimal for both global and single-qudit fidelity. The map is achieved by an 'economical' cloning machine, which works without ancilla.
Covariant brackets for particles and fields
NASA Astrophysics Data System (ADS)
Asorey, M.; Ciaglia, M.; di Cosmo, F.; Ibort, A.
2017-06-01
A geometrical approach to the covariant formulation of the dynamics of relativistic systems is introduced. A realization of Peierls brackets by means of a bivector field over the space of solutions of the Euler-Lagrange equations of a variational principle is presented. The method is illustrated with some relevant examples.
A novel nonparametric confidence interval for differences of proportions for correlated binary data.
Duan, Chongyang; Cao, Yingshu; Zhou, Lizhi; Tan, Ming T; Chen, Pingyan
2016-11-16
Various confidence interval estimators have been developed for differences in proportions resulting from correlated binary data. However, the width of the widely recommended Tango's score confidence interval tends to be wide, and the computing burden of the exact methods recommended for small-sample data is intensive. The recently proposed rank-based nonparametric method, which treats proportions as special areas under receiver operating characteristic (ROC) curves, provided a new way to construct the confidence interval for the proportion difference on paired data, but its complex computation limits its application in practice. In this article, we develop a new nonparametric method utilizing the U-statistics approach for comparing two or more correlated areas under ROC curves. The new confidence interval has a simple analytic form with a new estimate of the degrees of freedom of n - 1. It demonstrates good coverage properties and has shorter confidence interval widths than that of Tango. This new confidence interval with the new estimate of degrees of freedom also leads to coverage probabilities that are an improvement on those of the rank-based nonparametric confidence interval. Compared with the approximate exact unconditional method, the nonparametric confidence interval demonstrates good coverage properties even in small samples, and yet it is very easy to implement computationally. This nonparametric procedure is evaluated using simulation studies and illustrated with three real examples. The simplified nonparametric confidence interval is an appealing choice in practice for its ease of use and good performance. © The Author(s) 2016.
Schwartz, Daniel M
2003-01-01
PURPOSE: First, to determine whether a silicone light-adjustable intraocular lens (IOL) can be fabricated and adjusted precisely with a light delivery device (LDD). Second, to determine the biocompatibility of an adjustable IOL and whether the lens can be adjusted precisely in vivo. METHODS: After fabrication of a light-adjustable silicone formulation, IOLs were made and tested in vitro for cytotoxicity, leaching, precision of adjustment, optical quality after adjustment, and mechanical properties. Light-adjustable IOLs were then tested in vivo for biocompatibility and precision of adjustment in a rabbit model. In collaboration with Zeiss-Meditec, a digital LDD was developed and tested to correct for higher-order aberrations in light-adjustable IOLs. RESULTS: The results establish that a biocompatible silicone IOL can be fabricated and adjusted using safe levels of light. There was no evidence of cytotoxicity or leaching. Testing of mechanical properties revealed no significant differences from commercial controls. Implantation of light-adjustable lenses in rabbits demonstrated excellent biocompatibility after 6 months, comparable to a commercially available IOL. In vivo spherical (hyperopic and myopic) adjustment in rabbits was achieved using an analog light delivery system. The digital light delivery system was tested and achieved correction of higher-order aberrations. CONCLUSION: A silicone light-adjustable IOL and LDD have been developed to enable postoperative, noninvasive adjustment of lens power. The ability to correct higher-order aberrations in these materials has broad potential applicability for optimization of vision in patients undergoing cataract and refractive surgery. PMID:14971588
Covariance modeling in geodetic applications of collocation
NASA Astrophysics Data System (ADS)
Barzaghi, Riccardo; Cazzaniga, Noemi; De Gaetani, Carlo; Reguzzoni, Mirko
2014-05-01
The collocation method is widely applied in geodesy for estimating/interpolating gravity-related functionals. The crucial problem of this approach is the correct modeling of the empirical covariance functions of the observations. Different methods for obtaining reliable covariance models have been proposed in the past by many authors. However, there are still problems in fitting the empirical values, particularly when different functionals of T are used and combined. Through suitable linear combinations of positive degree variances, a model function that properly fits the empirical values can be obtained. This kind of condition is commonly handled by solver algorithms in linear programming problems. In this work the problem of modeling covariance functions is addressed with an innovative method based on the simplex algorithm. This requires the definition of an objective function to be minimized (or maximized) where the unknown variables or their linear combinations are subject to some constraints. The non-standard use of the simplex method consists in defining constraints on the model covariance function in order to obtain the best fit to the corresponding empirical values. Further constraints are applied so as to maintain coherence with the model degree variances and prevent possible solutions with no physical meaning. The fitting procedure is iterative and, in each iteration, constraints are strengthened until the best possible fit between model and empirical functions is reached. The results obtained during the test phase of this new methodology show remarkable improvements with respect to the software packages available until now. Numerical tests are also presented to check the impact that improved covariance modeling has on the collocation estimate.
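The paper describes its simplex-based fitting only at a high level; one way to cast such a fit as a linear program, sketched below with an entirely invented formulation (nonnegative weights on candidate degree-variance components, minimizing the worst-case misfit via scipy's linprog), is:

```python
import numpy as np
from scipy.optimize import linprog

def fit_covariance_lp(basis, emp):
    """basis[i, k]: k-th candidate degree-variance component evaluated at the
    i-th empirical lag; emp[i]: empirical covariance values.
    Finds nonnegative weights w minimizing max_i |basis @ w - emp|."""
    m, k = basis.shape
    c = np.zeros(k + 1)
    c[-1] = 1.0                                  # minimize the Chebyshev bound t
    A_ub = np.block([[basis, -np.ones((m, 1))],  #  basis @ w - t <= emp
                     [-basis, -np.ones((m, 1))]])  # -basis @ w - t <= -emp
    b_ub = np.concatenate([emp, -emp])
    bounds = [(0, None)] * (k + 1)               # w >= 0, t >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    return res.x[:k], res.x[-1]                  # weights and achieved misfit
```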
Blamires, Sean J; Hasemore, Matthew; Martens, Penny J; Kasumovic, Michael M
2017-03-01
The adaptive benefits of extended phenotypic plasticity are imprecisely defined due to a paucity of experiments examining traits that are manipulable and measurable across environments. Spider webs are often used as models to explore the adaptive benefits of variations in extended phenotypes across environments. Nonetheless, our understanding of the adaptive nature of the plastic responses of spider webs is impeded when web architectures and silk physicochemical properties appear to co-vary. An opportunity to examine this co-variation is presented by modifying prey items while measuring web architectures and silk physiochemical properties. Here, we performed two experiments to assess the nature of the association between web architectures and gluey silk properties when the orb web spider Argiope keyserlingi was fed a diet that varied in either mass and energy or prey size and feeding frequency. We found web architectures and gluey silk physicochemical properties to co-vary across treatments in both experiments. Specifically, web capture area co-varied with gluey droplet morphometrics, thread stickiness and salt concentrations when prey mass and energy were manipulated, and spiral spacing co-varied with gluey silk salt concentrations when prey size and feeding frequency were manipulated. We explained our results as A. keyserlingi plastically shifting its foraging strategy as multiple prey parameters simultaneously varied. We confirmed and extended previous work by showing that spiders use a variety of prey cues to concurrently adjust web and silk traits across different feeding regimes.
Effects of expanding the look-back period to all available data in the assessment of covariates.
Nakasian, Sonja S; Rassen, Jeremy A; Franklin, Jessica M
2017-08-01
A fixed baseline period has been a common covariate assessment approach in pharmacoepidemiological studies from claims but may lead to high levels of covariate misclassification. Simulation studies have recommended expanding the look-back approach to all available data (AAD) for binary indicators of diagnoses, procedures, and medications, but there have been few real data analyses using this approach. The objective of the study is to explore the impact on treatment effect estimates and covariate prevalence of expanding the look-back period within five validated studies in the Aetion system, a rapid cycle analytics platform. We reran the five studies and assessed covariates using (i) a fixed window approach (usually 180 days before treatment initiation), (ii) AAD prior to treatment initiation, and (iii) AAD with a categorized by recency approach, where the most recent occurrence of a covariate was labeled as recent (occurring within the fixed window) or past (before the start of the fixed window). For each covariate assessment approach, we adjusted for covariates via propensity score matching. All studies had at least one covariate that had an increase in prevalence of 15% or higher from the fixed window to the AAD approach. However, there was little change in treatment effect estimates resulting from differing covariate assessment approaches. For example, in a study of acute coronary syndrome in high-intensity versus low-intensity statin users, the estimated hazard ratio from the fixed window approach was 1.11 (95% confidence interval 0.98, 1.25) versus 1.21 (1.07, 1.37) when using AAD and 1.19 (1.05, 1.35) using categorized by recency. Expanding the baseline period to AAD improved covariate sensitivity by capturing data that would otherwise be missed yet did not meaningfully change the overall treatment effect estimates compared with the fixed window approach. Copyright © 2017 John Wiley & Sons, Ltd.
Diallo, Thierno M O; Morin, Alexandre J S; Lu, HuiZhong
2017-03-01
This article evaluates the impact of partial or total covariate inclusion or exclusion on the class enumeration performance of growth mixture models (GMMs). Study 1 examines the effect of including an inactive covariate when the population model is specified without covariates. Study 2 examines the case in which the population model is specified with 2 covariates influencing only the class membership. Study 3 examines a population model including 2 covariates influencing the class membership and the growth factors. In all studies, we contrast the accuracy of various indicators to correctly identify the number of latent classes as a function of different design conditions (sample size, mixing ratio, invariance or noninvariance of the variance-covariance matrix, class separation, and correlations between the covariates in Studies 2 and 3) and covariate specification (exclusion, partial or total inclusion as influencing class membership, partial or total inclusion as influencing class membership, and the growth factors in a class-invariant or class-varying manner). The accuracy of the indicators shows important variation across studies, indicators, design conditions, and specification of the covariates effects. However, the results suggest that the GMM class enumeration process should be conducted without covariates, and should rely mostly on the Bayesian information criterion (BIC) and consistent Akaike information criterion (CAIC) as the most reliable indicators under conditions of high class separation (as indicated by higher entropy), versus the sample size adjusted BIC or CAIC (SBIC, SCAIC) and bootstrapped likelihood ratio test (BLRT) under conditions of low class separation (indicated by lower entropy).
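The enumeration logic itself (fit candidate models, compare information criteria) is generic. As an illustration with a plain Gaussian mixture rather than a growth mixture model, assuming scikit-learn and synthetic data:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (100, 2))])

# Fit 1..5 classes and record the BIC of each candidate model.
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X)
        for k in range(1, 6)}
print(min(bics, key=bics.get))   # number of classes selected by BIC
```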
A covariance NMR toolbox for MATLAB and OCTAVE.
Short, Timothy; Alzapiedi, Leigh; Brüschweiler, Rafael; Snyder, David
2011-03-01
The Covariance NMR Toolbox is a new software suite that provides a streamlined implementation of covariance-based analysis of multi-dimensional NMR data. The Covariance NMR Toolbox uses the MATLAB or, alternatively, the freely available GNU OCTAVE computer language, providing a user-friendly environment in which to apply and explore covariance techniques. Covariance methods implemented in the toolbox described here include direct and indirect covariance processing, 4D covariance, generalized indirect covariance (GIC), and Z-matrix transform. In order to provide compatibility with a wide variety of spectrometer and spectral analysis platforms, the Covariance NMR Toolbox uses the NMRPipe format for both input and output files. Additionally, datasets small enough to fit in memory are stored as arrays that can be displayed and further manipulated in a versatile manner within MATLAB or OCTAVE.
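For orientation, direct covariance processing replaces the second Fourier transform with the matrix square root C = (F^T F)^(1/2) taken over the indirect dimension. A numpy sketch follows; normalization conventions vary, and this is not the toolbox code itself.

```python
import numpy as np

def direct_covariance(F):
    """F: (n_t1, n_f2) real spectrum, Fourier-transformed along the direct dimension.
    Returns the symmetric covariance spectrum C = (F^T F)^(1/2)."""
    C = F.T @ F / F.shape[0]
    w, V = np.linalg.eigh(C)   # C is symmetric positive semidefinite
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T
```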
Austin, Peter C
2009-01-01
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher order moments and interactions; five-number summaries; and graphical methods such as quantile–quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are plausible with the propensity score model having been correctly specified. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative. Copyright © 2009 John Wiley & Sons, Ltd. PMID:19757444
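The first diagnostic mentioned, the standardized difference, has a short closed form; the sketch below gives the usual continuous and binary versions (function names ours).

```python
import numpy as np

def std_diff_continuous(x_treated, x_control):
    """Standardized difference for a continuous covariate in a matched sample."""
    m1, m0 = np.mean(x_treated), np.mean(x_control)
    v1, v0 = np.var(x_treated, ddof=1), np.var(x_control, ddof=1)
    return (m1 - m0) / np.sqrt((v1 + v0) / 2.0)

def std_diff_binary(p1, p0):
    """Standardized difference for a binary covariate; p1, p0 are prevalences."""
    return (p1 - p0) / np.sqrt((p1 * (1 - p1) + p0 * (1 - p0)) / 2.0)
```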
Covariances of Evaluated Nuclear Cross Section Data for (232)Th, (180,182,183,184,186)W and (55)Mn
Trkov, A.; Capote, R.; Soukhovitskii, E; Leal, Luiz C; Sin, M; Kodeli, I.; Muir, D W
2011-01-01
The EMPIRE code system is a versatile package for nuclear model calculations that is often used for nuclear data evaluation. Its capabilities include random sampling of model parameters, which can be utilized to generate a full covariance matrix of all scattering cross sections, including cross-reaction correlations. The EMPIRE system was used to prepare the prior covariance matrices of reaction cross sections of (232)Th, (180,182,183,184,186)W and (55)Mn nuclei for incident neutron energies up to 60 MeV. The obtained modeling prior was fed to the GANDR system, which is a package for a global assessment of nuclear data, based on the Generalized Least-Squares method. By introducing experimental data from the EXFOR database into GANDR, the constrained covariance matrices and cross section adjustment functions were obtained. Applying the correction functions to the cross sections and formatting the covariance matrices, the final evaluations in ENDF-6 format, including covariances, were derived. In the resonance energy range, separate analyses were performed to determine the resonance parameters with their respective covariances. The data files thus obtained were then subjected to detailed testing and validation. The evaluations with covariances described here for (232)Th, (180,182,183,184,186)W and (55)Mn are included in the ENDF/B-VII.1 library release.
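The parameter-sampling step can be sketched generically as follows; `toy_cross_section` is a hypothetical placeholder for an EMPIRE model calculation, and all numbers are illustrative:

```python
# Monte Carlo prior covariance of cross sections from parameter sampling.
import numpy as np

def toy_cross_section(energies, params):
    a, b = params
    return a * np.exp(-energies / b)          # placeholder for model physics

rng = np.random.default_rng(3)
energies = np.linspace(1.0, 60.0, 30)         # MeV grid
samples = np.array([
    toy_cross_section(energies, rng.normal([1.0, 20.0], [0.1, 2.0]))
    for _ in range(1000)
])
prior_cov = np.cov(samples, rowvar=False)     # 30 x 30 covariance matrix,
print(prior_cov.shape)                        # incl. energy-energy correlations
```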
An adaptive distance measure for use with nonparametric models
Garvey, D. R.; Hines, J. W.
2006-07-01
Distance measures perform a critical task in nonparametric, locally weighted regression. Locally weighted regression (LWR) models are a form of 'lazy learning' that constructs a local model 'on the fly' by comparing a query vector to historical, exemplar vectors according to a three-step process. First, the distance of the query vector to each of the exemplar vectors is calculated. Next, these distances are passed to a kernel function, which converts the distances to similarities or weights. Finally, the model output or response is calculated by performing locally weighted polynomial regression. To date, traditional distance measures, such as the Euclidean, weighted Euclidean, and L1-norm, have been used as the first step in the prediction process. Since these measures do not take into consideration sensor failures and drift, they are inherently ill-suited for application to 'real world' systems. This paper describes one such LWR model, namely auto-associative kernel regression (AAKR), and describes a new, Adaptive Euclidean distance measure that can be used to dynamically compensate for faulty sensor inputs. In this new distance measure, the query observations that lie outside of the training range (i.e. outside the minimum and maximum input exemplars) are dropped from the distance calculation. This allows the distance calculation to be robust to sensor drifts and failures, in addition to providing a method for managing inputs that exceed the training range. In this paper, AAKR models using the standard and Adaptive Euclidean distances are developed and compared for the pressure system of an operating nuclear power plant. It is shown that when the standard Euclidean distance is used for data with failed inputs, significant errors in the AAKR predictions can result. By using the Adaptive Euclidean distance it is shown that high-fidelity predictions are possible in spite of the input failure, with prediction accuracy approaching that obtained with fault-free inputs.
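A minimal sketch of AAKR with such an adaptive Euclidean distance follows, assuming a Gaussian kernel; the bandwidth, data, and function names are illustrative:

```python
# AAKR with an adaptive Euclidean distance: query elements outside the
# training min/max are dropped from the distance calculation.
import numpy as np

def aakr_predict(X_train, query, h=1.0):
    lo, hi = X_train.min(axis=0), X_train.max(axis=0)
    valid = (query >= lo) & (query <= hi)        # mask failed/drifted inputs
    d = np.sqrt(((X_train[:, valid] - query[valid]) ** 2).sum(axis=1))
    w = np.exp(-d**2 / (2 * h**2))               # kernel: distances -> weights
    return (w[:, None] * X_train).sum(axis=0) / w.sum()   # corrected signals

rng = np.random.default_rng(4)
X_train = rng.normal(0, 1, (500, 4))
query = X_train[0].copy()
query[2] = 25.0                                  # simulated sensor failure
print(aakr_predict(X_train, query).round(2))     # prediction ignores sensor 3
```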
Wynant, Willy; Abrahamowicz, Michal
2014-08-30
Cox's proportional hazards (PH) model assumes constant-over-time covariate effects. Furthermore, most applications assume linear effects of continuous covariates on the logarithm of the hazard. Yet, many prognostic factors have time-dependent (TD) and/or nonlinear (NL) effects, that is, violate these conventional assumptions. Detection of such complex effects could affect prognosis and clinical decisions. However, assessing the effects of each of the multiple, often correlated, covariates in flexible multivariable analyses is challenging. In simulations, we investigated the impact of the approach used to build the flexible multivariable model on inference about the TD and NL covariate effects. Results demonstrate that the conclusions regarding the statistical significance of the TD/NL effects depend heavily on the strategy used to decide which effects of the other covariates should be adjusted for. Both a failure to adjust for true TD and NL effects of relevant covariates and inclusion of spurious effects of covariates that conform to the PH and linearity assumptions increase the risk of incorrect conclusions regarding other covariates. In this context, iterative backward elimination of nonsignificant NL and TD effects from the multivariable model, which initially includes all these effects, may help discriminate between true and spurious effects. The practical importance of these issues was illustrated in an example that reassessed the predictive ability of selected biomarkers for survival in advanced non-small-cell lung cancer. In conclusion, a careful model-building strategy and flexible modeling of multivariable survival data can yield new insights about predictors' roles and improve the validity of analyses.
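As a starting point for such flexible analyses, time-dependent (non-PH) effects can be screened with Schoenfeld-residual tests; the sketch below assumes the Python lifelines package and its bundled example data, and does not reproduce the authors' iterative backward-elimination procedure for TD/NL effects:

```python
# Screening covariates for time-dependent (non-PH) effects with lifelines.
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()                      # example recidivism data set
cph = CoxPHFitter().fit(df, duration_col="week", event_col="arrest")
# Schoenfeld-residual-based tests: small p-values flag covariates whose
# effects appear to change over time (PH violation).
cph.check_assumptions(df, p_value_threshold=0.05)
```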
RNA sequence analysis using covariance models.
Eddy, S R; Durbin, R
1994-01-01
We describe a general approach to several RNA sequence analysis problems using probabilistic models that flexibly describe the secondary structure and primary sequence consensus of an RNA sequence family. We call these models 'covariance models'. A covariance model of tRNA sequences is an extremely sensitive and discriminative tool for searching for additional tRNAs and tRNA-related sequences in sequence databases. A model can be built automatically from an existing sequence alignment. We also describe an algorithm for learning a model and hence a consensus secondary structure from initially unaligned example sequences and no prior structural information. Models trained on unaligned tRNA examples correctly predict tRNA secondary structure and produce high-quality multiple alignments. The approach may be applied to any family of small RNA sequences. PMID:8029015
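The covariation signal that such models exploit can be illustrated with a toy mutual-information calculation between alignment columns (a much simpler statistic than the stochastic context-free grammar machinery of covariance models); the tiny alignment below is illustrative:

```python
# Mutual information between RNA alignment columns: high MI suggests
# base-paired positions that mutate in a correlated, complementary fashion.
import numpy as np
from collections import Counter

def mutual_information(col_i, col_j):
    n = len(col_i)
    fi, fj = Counter(col_i), Counter(col_j)
    fij = Counter(zip(col_i, col_j))
    return sum((c / n) * np.log2((c / n) / (fi[a] / n * fj[b] / n))
               for (a, b), c in fij.items())

# Columns 0 and 2 covary (always complementary); column 1 is unrelated.
aln = ["GAC", "GCC", "CUG", "CAG", "GGC", "CCG"]
cols = list(zip(*aln))
print(round(mutual_information(cols[0], cols[2]), 2))  # high MI: likely pair
print(round(mutual_information(cols[0], cols[1]), 2))  # low MI
```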
Lorentz Covariant Distributions with Spectral Conditions
Zinoviev, Yury M.
2007-11-14
The properties of the vacuum expectation values of products of quantum fields are formulated in the book [1]. The vacuum expectation values of products of quantum fields are the Fourier transforms of Lorentz covariant tempered distributions with supports in the product of the closed upper light cones. Lorentz invariant distributions are studied in the papers [2]–[4], whose authors sought to describe Lorentz invariant distributions in terms of distributions given on the Lorentz group orbit space. This orbit space has a complicated structure. It is noted in [5] that a tempered distribution with support in the closed upper light cone may be represented as some power of the wave operator applied to a differentiable function with support in the closed upper light cone. For the description of Lorentz covariant differentiable functions the boundary of the closed upper light cone is not important, since it has measure zero.
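In our notation (a sketch of the representation noted in [5], not the paper's exact statement), this reads:

\[
u = \Box^{\,n} F, \qquad \operatorname{supp} F \subseteq \overline{V}_{+},
\]

where \(u\) is the tempered distribution, \(\Box\) is the wave operator, \(n\) is a sufficiently large integer, \(F\) is a differentiable function, and \(\overline{V}_{+}\) denotes the closed upper light cone.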
On covariance structure in noisy, big data
NASA Astrophysics Data System (ADS)
Paffenroth, Randy C.; Nong, Ryan; Du Toit, Philip C.
2013-09-01
Herein we describe theory and algorithms for detecting covariance structures in large, noisy data sets. Our work uses ideas from matrix completion and robust principal component analysis to detect the presence of low-rank covariance matrices, even when the data is noisy, distorted by large corruptions, and only partially observed. In fact, the ability to handle partial observations, combined with ideas from randomized algorithms for matrix decomposition, enables us to produce asymptotically fast algorithms. We provide numerical demonstrations of the methods and their convergence properties. While such methods have applicability to many problems, including mathematical finance, crime analysis, and other large-scale sensor fusion problems, our inspiration arises from applying these methods in the context of cyber network intrusion detection.
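A minimal sketch of the underlying low-rank-plus-sparse decomposition, via the standard principal component pursuit iteration rather than the authors' randomized variants; parameters and the toy data are illustrative:

```python
# Robust PCA via principal component pursuit (inexact ALM iteration):
# split M into low-rank L plus sparse corruptions S.
import numpy as np

def rpca(M, n_iter=200):
    mu = M.size / (4 * np.abs(M).sum())
    lam = 1 / np.sqrt(max(M.shape))
    S = np.zeros_like(M); Y = np.zeros_like(M)
    shrink = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0)
    for _ in range(n_iter):
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = U @ np.diag(shrink(sig, 1 / mu)) @ Vt   # singular value threshold
        S = shrink(M - L + Y / mu, lam / mu)        # sparse corruptions
        Y += mu * (M - L - S)
    return L, S

rng = np.random.default_rng(5)
low_rank = rng.normal(size=(60, 3)) @ rng.normal(size=(3, 60))
corrupt = (rng.random((60, 60)) < 0.05) * rng.normal(0, 10, (60, 60))
L, S = rpca(low_rank + corrupt)
print(np.linalg.matrix_rank(L, tol=1e-6))           # ~3: structure recovered
```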
Linear transformations of variance/covariance matrices.
Parois, Pascal; Lutz, Martin
2011-07-01
Many applications in crystallography require the use of linear transformations on parameters and their standard uncertainties. While the transformation of the parameters is textbook knowledge, the transformation of the standard uncertainties is more complicated and needs the full variance/covariance matrix. For the transformation of second-rank tensors it is suggested that the 3 × 3 matrix is re-written into a 9 × 1 vector. The transformation of the corresponding variance/covariance matrix is then straightforward and easily implemented into computer software. This method is applied in the transformation of anisotropic displacement parameters, the calculation of equivalent isotropic displacement parameters, the comparison of refinements in different space-group settings and the calculation of standard uncertainties of eigenvalues.
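A minimal sketch of this procedure: vectorizing a second-rank tensor U so that the transformation U → TUTᵀ becomes the 9 × 9 matrix T ⊗ T acting on the 9 × 1 vector, after which the variance/covariance matrix transforms in the usual sandwich form (all values below are illustrative):

```python
# Propagating a variance/covariance matrix through a linear tensor transform.
import numpy as np

T = np.array([[0., 1, 0], [1, 0, 0], [0, 0, 1]])   # e.g. an axis permutation
A = np.kron(T, T)                                  # acts on vec(U), 9 x 1

rng = np.random.default_rng(6)
V = rng.normal(size=(9, 9)); V = V @ V.T           # variance/covariance of vec(U)
V_new = A @ V @ A.T                                # transformed covariance

U = np.diag([1.0, 2.0, 3.0])                       # anisotropic tensor
assert np.allclose((A @ U.reshape(9)).reshape(3, 3), T @ U @ T.T)
print(np.sqrt(np.diag(V_new)).round(3))            # new standard uncertainties
```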
Covariant approach to parametrized cosmological perturbations
NASA Astrophysics Data System (ADS)
Tattersall, Oliver J.; Lagos, Macarena; Ferreira, Pedro G.
2017-09-01
We present a covariant formulation for constructing general quadratic actions for cosmological perturbations, invariant under a given set of gauge symmetries for a given field content. This approach allows us to analyze scalar, vector, and tensor perturbations at the same time in a straightforward manner. We apply the procedure to diffeomorphism invariant single-tensor, scalar-tensor, and vector-tensor theories and show explicitly the full covariant form of the quadratic actions in such cases, in addition to the actions determining the evolution of vector and tensor perturbations. We also discuss the role of the symmetry of the background in identifying the set of cosmologically relevant free parameters describing these classes of theories, including calculating the relevant free parameters for an axisymmetric Bianchi-I vacuum universe.
Chiral four-dimensional heterotic covariant lattices
NASA Astrophysics Data System (ADS)
Beye, Florian
2014-11-01
In the covariant lattice formalism, chiral four-dimensional heterotic string vacua are obtained from certain even self-dual lattices which completely decompose into a left-mover and a right-mover lattice. The main purpose of this work is to classify all right-mover lattices that can appear in such a chiral model, and to study the corresponding left-mover lattices using the theory of lattice genera. In particular, the Smith-Minkowski-Siegel mass formula is employed to calculate a lower bound on the number of left-mover lattices. Also, the known relationship between asymmetric orbifolds and covariant lattices is considered in the context of our classification.
Construction of Covariance Functions with Variable Length Fields
NASA Technical Reports Server (NTRS)
Gaspari, Gregory; Cohn, Stephen E.; Guo, Jing; Pawson, Steven
2005-01-01
This article focuses on construction, directly in physical space, of three-dimensional covariance functions parametrized by a tunable length field, and on an application of this theory to reproduce the Quasi-Biennial Oscillation (QBO) in the Goddard Earth Observing System, Version 4 (GEOS-4) data assimilation system. These covariance models are referred to as multi-level or nonseparable, to associate them with the application where a multi-level covariance with a large troposphere-to-stratosphere length field gradient is used to reproduce the QBO from sparse radiosonde observations in the tropical lower stratosphere. The multi-level covariance functions extend well-known single-level covariance functions depending only on a length scale. Generalizations of the first- and third-order autoregressive covariances in three dimensions are given, providing multi-level covariances with zero and three derivatives at zero separation, respectively. Multi-level piecewise rational covariances with two continuous derivatives at zero separation are also provided. Multi-level power-law covariances are constructed with continuous derivatives of all orders. Additional multi-level covariance functions are constructed using the Schur product of single- and multi-level covariance functions. A multi-level power-law covariance used to reproduce the QBO in GEOS-4 is described along with details of the assimilation experiments. The new covariance model is shown to represent the vertical wind shear associated with the QBO much more effectively than the baseline GEOS-4 system.
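A minimal sketch of a covariance function with a variable length field, using the Gibbs/Paciorek construction as a generic stand-in for the paper's multi-level covariance models; the length-field profile and grid are illustrative:

```python
# A valid covariance with a spatially varying length scale (Gibbs/Paciorek).
import numpy as np

def varying_length_cov(z, length):
    """Covariance on grid z with per-point length scales length(z)."""
    L2 = length[:, None] ** 2 + length[None, :] ** 2
    pref = np.sqrt(2 * np.outer(length, length) / L2)
    return pref * np.exp(-2 * (z[:, None] - z[None, :]) ** 2 / L2)

z = np.linspace(0, 30, 61)              # e.g. model levels, km
length = np.where(z < 15, 1.0, 4.0)     # troposphere-to-stratosphere jump
C = varying_length_cov(z, length)
eigs = np.linalg.eigvalsh(C)
print(eigs.min() > -1e-8)               # positive semidefinite: True
```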
Inverse covariance simplification for efficient uncertainty management
NASA Astrophysics Data System (ADS)
Jalobeanu, A.; Gutiérrez, J. A.
2007-11-01
When it comes to manipulating uncertain knowledge such as noisy observations of physical quantities, one may ask how to do it in a simple way. Processing corrupted signals or images always propagates the uncertainties from the data to the final results, whether these errors are explicitly computed or not. When such error estimates are provided, it is crucial to handle them in such a way that their interpretation, or their use in subsequent processing steps, remain user-friendly and computationally tractable. A few authors follow a Bayesian approach and provide uncertainties as an inverse covariance matrix. Despite its apparent sparsity, this matrix contains many small terms that carry little information. Methods have been developed to select the most significant entries, through the use of information-theoretic tools for instance. One has to find a Gaussian pdf that is close enough to the posterior pdf, and with a small number of non-zero coefficients in the inverse covariance matrix. We propose to restrict the search space to Markovian models (where only neighbors can interact), well-suited to signals or images. The originality of our approach is in conserving the covariances between neighbors while setting to zero the entries of the inverse covariance matrix for all other variables. This fully constrains the solution, and the computation is performed via a fast, alternate minimization scheme involving quadratic forms. The Markovian structure advantageously reduces the complexity of Bayesian updating (where the simplified pdf is used as a prior). Moreover, uncertainties exhibit the same temporal or spatial structure as the data.
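A minimal sketch of this Markov simplification on a chain graph, via classical iterative proportional fitting, which is one standard way to preserve the covariances between neighbors while zeroing the remaining inverse-covariance entries (the dense input covariance below is illustrative):

```python
# Fit a chain-Markov (tridiagonal-precision) Gaussian that preserves the
# variances and neighbor covariances of a dense covariance S.
import numpy as np

def markov_chain_precision(S, sweeps=50):
    n = S.shape[0]
    K = np.eye(n)
    for _ in range(sweeps):
        for i in range(n - 1):                     # cliques {i, i+1}
            C = [i, i + 1]
            Sigma = np.linalg.inv(K)
            K[np.ix_(C, C)] += (np.linalg.inv(S[np.ix_(C, C)])
                                - np.linalg.inv(Sigma[np.ix_(C, C)]))
    return K                                       # stays tridiagonal

rng = np.random.default_rng(7)
B = rng.normal(size=(8, 8)); S = B @ B.T + 8 * np.eye(8)
Sigma = np.linalg.inv(markov_chain_precision(S))
print(np.allclose(np.diag(Sigma), np.diag(S)))     # variances preserved
print(np.allclose(np.diag(Sigma, 1), np.diag(S, 1)))  # neighbor covs preserved
```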
Covariance expressions for eigenvalue and eigenvector problems
NASA Astrophysics Data System (ADS)
Liounis, Andrew J.
There are a number of important scientific and engineering problems whose solutions take the form of an eigenvalue–eigenvector problem. Some notable examples include solutions to linear systems of ordinary differential equations, controllability of linear systems, finite element analysis, chemical kinetics, fitting ellipses to noisy data, and optimal estimation of attitude from unit vectors. In many of these problems, having knowledge of the eigenvalue and eigenvector Jacobians is either necessary or is nearly as important as having the solution itself. For instance, Jacobians are necessary to find the uncertainty in a computed eigenvalue or eigenvector estimate. This uncertainty, which is usually represented as a covariance matrix, has been well studied for problems similar to the eigenvalue and eigenvector problem, such as singular value decomposition. There has been substantially less research on the covariance of an optimal estimate originating from an eigenvalue–eigenvector problem. In this thesis we develop two general expressions for the Jacobians of eigenvalues and eigenvectors with respect to the elements of their parent matrix. The expressions developed make use of only the parent matrix and the eigenvalue and eigenvector pair under consideration. In addition, they are applicable to any general matrix (including complex valued matrices, eigenvalues, and eigenvectors) as long as the eigenvalues are simple. Alongside this, we develop expressions that determine the uncertainty in a vector estimate obtained from an eigenvalue-eigenvector problem given the uncertainty of the terms of the matrix. The Jacobian expressions developed are numerically validated with forward finite differencing, and the covariance expressions are validated using Monte Carlo analysis. Finally, the results from this work are used to determine covariance expressions for a variety of estimation problem examples and are also applied to the design of a dynamical system.
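A minimal sketch of the eigenvalue Jacobian for a simple eigenvalue, validated against forward finite differencing in the same spirit as the thesis; the matrix, indices, and step size are illustrative:

```python
# Eigenvalue Jacobian: for a simple eigenvalue lam with right eigenvector v
# and left eigenvector w, d(lam)/dA = w v^T / (w^T v).
import numpy as np

rng = np.random.default_rng(8)
A = rng.normal(size=(4, 4))

lams, V = np.linalg.eig(A)
mus, W = np.linalg.eig(A.T)                    # left eigenvectors of A
k = 0
lam, v = lams[k], V[:, k]
w = W[:, np.argmin(np.abs(mus - lam))]         # match the same eigenvalue
J = np.outer(w, v) / (w @ v)                   # J[i, j] = d(lam)/dA[i, j]

# Forward finite-difference check on one matrix element.
eps, i, j = 1e-6, 1, 2
Ap = A.copy(); Ap[i, j] += eps
lp = np.linalg.eig(Ap)[0]
lam_p = lp[np.argmin(np.abs(lp - lam))]
print(np.isclose((lam_p - lam) / eps, J[i, j], atol=1e-4))   # True
```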
A covariant Lagrangian for stable nonsingular bounce
NASA Astrophysics Data System (ADS)
Cai, Yong; Piao, Yun-Song
2017-09-01
The nonsingular bounce models usually suffer from ghost or gradient instabilities, as has been proved recently. In this paper, we propose a covariant effective theory for a stable nonsingular bounce, which includes operators quadratic in the second derivative of the field ϕ while the background is set only by P(ϕ, X). With it, we explicitly construct a fully stable nonsingular bounce model for the ekpyrotic scenario.
Covariant quantization of the CBS superparticle
NASA Astrophysics Data System (ADS)
Grassi, P. A.; Policastro, G.; Porrati, M.
2001-07-01
The quantization of the Casalbuoni-Brink-Schwarz superparticle is performed in an explicitly covariant way using the antibracket formalism. Since an infinite number of ghost fields are required, within a suitable off-shell twistor-like formalism, we are able to fix the gauge of each ghost sector without modifying the physical content of the theory. The computation reveals that the antibracket cohomology contains only the physical degrees of freedom.
Linear Covariance Analysis for a Lunar Lander
NASA Technical Reports Server (NTRS)
Jang, Jiann-Woei; Bhatt, Sagar; Fritz, Matthew; Woffinden, David; May, Darryl; Braden, Ellen; Hannan, Michael
2017-01-01
A next-generation lunar lander Guidance, Navigation, and Control (GNC) system, which includes a state-of-the-art optical sensor suite, is proposed in a concept design cycle. The design goal is to allow the lander to land softly within the prescribed landing precision. The achievement of this precision landing requirement depends on proper selection of the sensor suite. In this paper, a robust sensor selection procedure is demonstrated using a Linear Covariance (LinCov) analysis tool developed by Draper.
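The LinCov idea can be sketched generically: propagate an error covariance through linear dynamics with Kalman measurement updates and compare terminal dispersions across candidate sensor noise levels. The toy model and all numbers below are illustrative and do not represent Draper's tool:

```python
# Linear covariance analysis for a sensor trade study.
import numpy as np

dt = 1.0
F = np.array([[1, dt], [0, 1]])            # position/velocity dynamics
Q = np.diag([1e-4, 1e-4])                  # process noise
H = np.array([[1.0, 0.0]])                 # position-type measurement

def terminal_sigma(r_sensor, steps=100):
    P = np.diag([100.0, 1.0])              # initial dispersion covariance
    R = np.array([[r_sensor]])
    for _ in range(steps):
        P = F @ P @ F.T + Q                # propagate covariance
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        P = (np.eye(2) - K @ H) @ P        # measurement update
    return np.sqrt(P[0, 0])                # 1-sigma terminal position error

print(terminal_sigma(25.0))                # coarser candidate sensor
print(terminal_sigma(1.0))                 # finer sensor: tighter landing
```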