Science.gov

Sample records for nonparametric covariate adjustment

  1. Nonparametric Covariate-Adjusted Association Tests Based on the Generalized Kendall’s Tau*

    PubMed Central

    Zhu, Wensheng; Jiang, Yuan; Zhang, Heping

    2012-01-01

    Identifying the risk factors for comorbidity is important in psychiatric research. Empirically, studies have shown that testing multiple, correlated traits simultaneously is more powerful than testing a single trait at a time in association analysis. Furthermore, for complex diseases, especially mental illnesses and behavioral disorders, the traits are often recorded in different scales such as dichotomous, ordinal and quantitative. In the absence of covariates, nonparametric association tests have been developed for multiple complex traits to study comorbidity. However, genetic studies generally contain measurements of some covariates that may affect the relationship between the risk factors of major interest (such as genes) and the outcomes. While it is relatively easy to adjust these covariates in a parametric model for quantitative traits, it is challenging for multiple complex traits with possibly different scales. In this article, we propose a nonparametric test for multiple complex traits that can adjust for covariate effects. The test aims to achieve an optimal scheme of adjustment by using a maximum statistic calculated from multiple adjusted test statistics. We derive the asymptotic null distribution of the maximum test statistic, and also propose a resampling approach, both of which can be used to assess the significance of our test. Simulations are conducted to compare the type I error and power of the nonparametric adjusted test to the unadjusted test and other existing adjusted tests. The empirical results suggest that our proposed test increases the power through adjustment for covariates when there exist environmental effects, and is more robust to model misspecifications than some existing parametric adjusted tests. We further demonstrate the advantage of our test by analyzing a data set on genetics of alcoholism. PMID:22745516
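
    The maximum-statistic idea lends itself to a small illustration. The sketch below is a toy, not the paper's method: polynomial adjustments of a single quantitative trait stand in for the generalized Kendall's tau statistics, and a genotype permutation stands in for the proposed resampling scheme; all names and parameters are made up.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 300
        x = rng.normal(size=n)                      # environmental covariate
        g = rng.integers(0, 3, size=n)              # genotype coded 0/1/2
        y = 0.5 * x + 0.3 * g + rng.normal(size=n)  # one quantitative trait

        def max_stat(y, g, x):
            """Maximum over several covariate-adjusted association statistics."""
            stats = []
            for deg in (0, 1, 2):                   # increasing degrees of adjustment
                resid = y - np.polyval(np.polyfit(x, y, deg), x)
                stats.append(abs(np.corrcoef(resid, g)[0, 1]))
            return max(stats)

        obs = max_stat(y, g, x)
        null = np.array([max_stat(y, rng.permutation(g), x) for _ in range(999)])
        print((1 + (null >= obs).sum()) / (1 + null.size))  # resampling p-value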

  2. Partial covariate adjusted regression

    PubMed Central

    Şentürk, Damla; Nguyen, Danh V.

    2008-01-01

    Covariate adjusted regression (CAR) is a recently proposed adjustment method for regression analysis where both the response and predictors are not directly observed (Şentürk and Müller, 2005). The available data have been distorted by unknown functions of an observable confounding covariate. CAR provides consistent estimators for the coefficients of the regression between the variables of interest, adjusted for the confounder. We develop a broader class of partial covariate adjusted regression (PCAR) models to accommodate both distorted and undistorted (adjusted/unadjusted) predictors. The PCAR model allows for unadjusted predictors, such as age, gender and demographic variables, which are common in the analysis of biomedical and epidemiological data. The available estimation and inference procedures for CAR are shown to be invalid for the proposed PCAR model. We propose new estimators and develop new inference tools for the more general PCAR setting. In particular, we establish the asymptotic normality of the proposed estimators and propose consistent estimators of their asymptotic variances. Finite sample properties of the proposed estimators are investigated using simulation studies and the method is also illustrated with a Pima Indians diabetes data set. PMID:20126296
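
    In symbols, the CAR setting can be sketched as follows (LaTeX notation; an illustrative rendering of the abstract, not necessarily the authors' exact model). The observable confounder is U, the multiplicative distortions \psi and \phi_r are unknown smooth functions, and PCAR additionally allows predictors Z_s observed without distortion:

        \tilde{Y} = \psi(U)\,Y, \qquad \tilde{X}_r = \phi_r(U)\,X_r, \quad r = 1, \dots, p,

        Y = \gamma_0 + \sum_{r=1}^{p} \gamma_r X_r + \sum_{s=1}^{q} \eta_s Z_s + e,

    with the goal of consistently estimating the coefficients \gamma_r and \eta_s from the observed (\tilde{Y}, \tilde{X}_r, Z_s, U).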

  3. Effect on Prediction when Modeling Covariates in Bayesian Nonparametric Models.

    PubMed

    Cruz-Marcelo, Alejandro; Rosner, Gary L; Müller, Peter; Stewart, Clinton F

    2013-04-01

    In biomedical research, it is often of interest to characterize the biologic processes giving rise to observations and to make predictions of future observations. Bayesian nonparametric methods provide a means for carrying out Bayesian inference while making as few restrictive parametric assumptions as possible. There are several proposals in the literature for extending Bayesian nonparametric models to include dependence on covariates, but two aspects of this extension have received limited attention. In this article, we examine the effect on fitting and predictive performance of incorporating covariates in a class of Bayesian nonparametric models in one of two primary ways: either in the weights or in the locations of a discrete random probability measure. We show that different strategies for incorporating continuous covariates in Bayesian nonparametric models can result in large differences when used for prediction, even though they lead to otherwise similar posterior inferences. When one needs the predictive density, as in optimal design, and this density is a mixture, it is better to make the weights depend on the covariates. We demonstrate these points via a simulated data example and in an application in which one wants to determine the optimal dose of an anticancer drug used in pediatric oncology. PMID:23687472

  4. Covariate-Adjusted Linear Mixed Effects Model with an Application to Longitudinal Data

    PubMed Central

    Nguyen, Danh V.; Şentürk, Damla; Carroll, Raymond J.

    2009-01-01

    Linear mixed effects (LME) models are useful for longitudinal data/repeated measurements. We propose a new class of covariate-adjusted LME models for longitudinal data that nonparametrically adjusts for a normalizing covariate. The proposed approach involves fitting a parametric LME model to the data after adjusting for the nonparametric effects of a baseline confounding covariate. In particular, the effect of the observable covariate on the response and predictors of the LME model is modeled nonparametrically via smooth unknown functions. In addition to covariate-adjusted estimation of fixed/population parameters and random effects, an estimation procedure for the variance components is also developed. Numerical properties of the proposed estimators are investigated with simulation studies. The consistency and convergence rates of the proposed estimators are also established. An application to a longitudinal data set on calcium absorption, accounting for baseline distortion from body mass index, illustrates the proposed methodology. PMID:19266053

  5. A Simulation-Based Comparison of Covariate Adjustment Methods for the Analysis of Randomized Controlled Trials

    PubMed Central

    Chaussé, Pierre; Liu, Jin; Luta, George

    2016-01-01

    Covariate adjustment methods are frequently used when baseline covariate information is available for randomized controlled trials. Using a simulation study, we compared the analysis of covariance (ANCOVA) with three nonparametric covariate adjustment methods with respect to point and interval estimation for the difference between means. The three alternative methods were based on important members of the generalized empirical likelihood (GEL) family, specifically on the empirical likelihood (EL) method, the exponential tilting (ET) method, and the continuous updated estimator (CUE) method. Two criteria were considered for the comparison of the four statistical methods: the root mean squared error and the empirical coverage of the nominal 95% confidence intervals for the difference between means. Based on the results of the simulation study, for sensitivity analysis purposes, we recommend the use of ANCOVA (with robust standard errors when heteroscedasticity is present) together with the CUE-based covariate adjustment method. PMID:27077870
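
    The ANCOVA arm of such a comparison is straightforward to reproduce. Below is a minimal Python sketch on made-up trial data using statsmodels with the robust (HC3) standard errors recommended under heteroscedasticity; the GEL-based alternatives (EL, ET, CUE) require specialized optimization and are not shown.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 200
        df = pd.DataFrame({
            "treat": rng.integers(0, 2, size=n),   # randomized arm
            "x": rng.normal(size=n),               # baseline covariate
        })
        noise = rng.normal(size=n) * (1 + 0.5 * (df["x"] > 0))  # heteroscedastic errors
        df["y"] = 1.0 * df["treat"] + 0.8 * df["x"] + noise

        fit = smf.ols("y ~ treat + x", data=df).fit(cov_type="HC3")
        print(fit.params["treat"])                 # adjusted difference between means
        print(fit.conf_int().loc["treat"])         # robust 95% confidence interval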

  6. A Review of Nonparametric Alternatives to Analysis of Covariance.

    ERIC Educational Resources Information Center

    Olejnik, Stephen F.; Algina, James

    1985-01-01

    Five distribution-free alternatives to parametric analysis of covariance are presented and demonstrated using a specific data example: Quade's (1967) distribution-free test, Puri and Sen's (1969) solution, McSweeney and Porter's (1971) rank transformation, Burnett and Barr's (1978) rank difference scores, and Shirley's (1981) general linear model solution. The results of simulation studies regarding Type I…

  7. An Investigation into the Dimensionality of TOEFL Using Conditional Covariance-Based Nonparametric Approach

    ERIC Educational Resources Information Center

    Jang, Eunice Eunhee; Roussos, Louis

    2007-01-01

    This article reports two studies to illustrate methodologies for conducting a conditional covariance-based nonparametric dimensionality assessment using data from two forms of the Test of English as a Foreign Language (TOEFL). Study 1 illustrates how to assess overall dimensionality of the TOEFL including all three subtests. Study 2 is aimed at…

  8. Covariate Adjusted Correlation Analysis with Application to FMR1 Premutation Female Carrier Data

    PubMed Central

    Şentürk, Damla; Nguyen, Danh V.; Tassone, Flora; Hagerman, Randi J.; Carroll, Raymond J.; Hagerman, Paul J.

    2009-01-01

    Motivated by molecular data on female premutation carriers of the fragile X mental retardation 1 (FMR1) gene, we present a new method of covariate adjusted correlation analysis to examine the association between messenger RNA (mRNA) and CGG repeat expansion in the FMR1 gene. The association between the molecular variables in female carriers needs to adjust for activation ratio (ActRatio), a measure that accounts for the protective effects of one normal X chromosome in female carriers. However, there are inherent uncertainties in the exact effects of ActRatio on the molecular measures of interest. To account for these uncertainties, we develop a flexible adjustment that accommodates both additive and multiplicative effects of ActRatio nonparametrically. The proposed adjusted correlation uses local conditional correlations, which are local method of moments estimators, to estimate the Pearson correlation between two variables adjusted for a third observable covariate. The local method of moments estimators are averaged to arrive at the final covariate adjusted correlation estimator, which is shown to be consistent. We also develop a test to check the nonparametric joint additive and multiplicative adjustment form. Simulation studies illustrate the efficacy of the proposed method. Application to FMR1 premutation data on 165 female carriers indicates that the association between mRNA and CGG repeat is stronger after adjusting for ActRatio. Finally, the results provide independent support for a specific jointly additive and multiplicative adjustment form for ActRatio previously proposed in the literature. PMID:19173699

  9. The covariate-adjusted frequency plot.

    PubMed

    Holling, Heinz; Böhning, Walailuck; Böhning, Dankmar; Formann, Anton K

    2016-04-01

    Count data arise in numerous fields of interest. Analysis of these data frequently requires distributional assumptions. Although the graphical display of a fitted model is straightforward in the univariate scenario, this becomes more complex if covariate information needs to be included in the model. Stratification is one way to proceed, but has its limitations if the covariate has many levels or the number of covariates is large. The article suggests a marginal method which works even in the case that all possible covariate combinations are different (i.e. no covariate combination occurs more than once). For each covariate combination the fitted model value is computed and then summed over the entire data set. The technique is quite general and works with all count distributional models as well as with all forms of covariate modelling. The article provides illustrations of the method for various situations and also shows that the proposed estimator as well as the empirical count frequency are consistent with respect to the same parameter. PMID:23376964
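
    The marginal construction is simple to implement. A minimal Python sketch, assuming a Poisson model with a single continuous covariate (all names illustrative): fit the model, evaluate the fitted probability of each count value for every observed covariate combination, and sum these probabilities over the data set.

        import numpy as np
        import statsmodels.api as sm
        from scipy.stats import poisson

        rng = np.random.default_rng(2)
        n = 500
        x = rng.normal(size=n)
        y = rng.poisson(np.exp(0.2 + 0.6 * x))     # counts driven by a covariate

        fit = sm.GLM(y, sm.add_constant(x), family=sm.families.Poisson()).fit()
        mu = fit.mu                                # fitted mean per covariate combination

        counts = np.arange(y.max() + 1)
        adjusted = np.array([poisson.pmf(k, mu).sum() for k in counts])
        observed = np.bincount(y, minlength=counts.size)
        print(np.column_stack([counts, observed, adjusted.round(1)]))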

  10. On variance estimate for covariate adjustment by propensity score analysis.

    PubMed

    Zou, Baiming; Zou, Fei; Shuster, Jonathan J; Tighe, Patrick J; Koch, Gary G; Zhou, Haibo

    2016-09-10

    Propensity score (PS) methods have been used extensively to adjust for confounding factors in the statistical analysis of observational data in comparative effectiveness research. There are four major PS-based adjustment approaches: PS matching, PS stratification, covariate adjustment by PS, and PS-based inverse probability weighting. Though covariate adjustment by PS is one of the most frequently used PS-based methods in clinical research, the conventional variance estimation of the treatment effects estimate under covariate adjustment by PS is biased. As Stampf et al. have shown, this bias in variance estimation is likely to lead to invalid statistical inference and could result in erroneous public health conclusions (e.g., food and drug safety and adverse events surveillance). To address this issue, we propose a two-stage analytic procedure to develop a valid variance estimator for the covariate adjustment by PS analysis strategy. We also carry out a simple empirical bootstrap resampling scheme. Both proposed procedures are implemented in an R function for public use. Extensive simulation results demonstrate the bias in the conventional variance estimator and show that both proposed variance estimators offer valid estimates for the true variance, and they are robust to complex confounding structures. The proposed methods are illustrated for a post-surgery pain study. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26999553
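
    The authors provide an R function; the two-stage logic can also be sketched in Python under assumed variable names: estimate the propensity score, adjust for it in the outcome model, and bootstrap both stages together so that the variance reflects the estimation of the PS itself.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(3)
        n = 400
        X = rng.normal(size=(n, 2))                        # confounders
        p_true = 1 / (1 + np.exp(-(X[:, 0] - 0.5 * X[:, 1])))
        z = rng.binomial(1, p_true)                        # treatment indicator
        y = 1.0 * z + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

        def ps_adjusted_effect(y, z, X):
            ps = sm.Logit(z, sm.add_constant(X)).fit(disp=0).predict()
            design = sm.add_constant(np.column_stack([z, ps]))
            return sm.OLS(y, design).fit().params[1]       # treatment coefficient

        est = ps_adjusted_effect(y, z, X)
        boot = []
        for _ in range(500):                               # re-run BOTH stages
            i = rng.integers(0, n, size=n)                 # resample with replacement
            boot.append(ps_adjusted_effect(y[i], z[i], X[i]))
        print(est, np.std(boot, ddof=1))                   # bootstrap standard error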

  11. Inference on treatment-covariate interaction based on a nonparametric measure of treatment effects and censored survival data.

    PubMed

    Jiang, Shan; Chen, Bingshu; Tu, Dongsheng

    2016-07-20

    The investigation of the treatment-covariate interaction is of considerable interest in the design and analysis of clinical trials. With potentially censored data observed, non-parametric and semi-parametric estimates and associated confidence intervals are proposed in this paper to quantify the interactions between the treatment and a binary covariate. In addition, comparisons of interactions between the treatment and two covariates are also considered. The proposed approaches are evaluated and compared by Monte Carlo simulations and applied to a real data set from a cancer clinical trial. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26887976

  12. Role of Experiment Covariance in Cross Section Adjustments

    SciTech Connect

    Giuseppe Palmiotti; M. Salvatores

    2014-06-01

    This paper is dedicated to the memory of R. D. McKnight, who made a seminal contribution to establishing the methodology and rigorous approach used in evaluating the covariance of reactor physics integral experiments. His original assessment of the ZPPR experiment uncertainties and correlations has made nuclear data adjustments based on these experiments much more robust and reliable. In the present paper, numerical examples show the actual impact on an adjustment of accounting for, or neglecting, such correlations.

  13. Sample Size for Confidence Interval of Covariate-Adjusted Mean Difference

    ERIC Educational Resources Information Center

    Liu, Xiaofeng Steven

    2010-01-01

    This article provides a way to determine adequate sample size for the confidence interval of covariate-adjusted mean difference in randomized experiments. The standard error of adjusted mean difference depends on covariate variance and balance, which are two unknown quantities at the stage of planning sample size. If covariate observations are…

  14. Nonparametric Estimation of Standard Errors in Covariance Analysis Using the Infinitesimal Jackknife

    ERIC Educational Resources Information Center

    Jennrich, Robert I.

    2008-01-01

    The infinitesimal jackknife provides a simple general method for estimating standard errors in covariance structure analysis. Beyond its simplicity and generality what makes the infinitesimal jackknife method attractive is that essentially no assumptions are required to produce consistent standard error estimates, not even the requirement that the…

  15. Development and Validation of a Brief Version of the Dyadic Adjustment Scale With a Nonparametric Item Analysis Model

    ERIC Educational Resources Information Center

    Sabourin, Stephane; Valois, Pierre; Lussier, Yvan

    2005-01-01

    The main purpose of the current research was to develop an abbreviated form of the Dyadic Adjustment Scale (DAS) with nonparametric item response theory. The authors conducted 5 studies, with a total participation of 8,256 married or cohabiting individuals. Results showed that the item characteristic curves behaved in a monotonically increasing…

  16. Covariate-adjusted confidence interval for the intraclass correlation coefficient.

    PubMed

    Shoukri, Mohamed M; Donner, Allan; El-Dali, Abdelmoneim

    2013-09-01

    A crucial step in designing a new study is to estimate the required sample size. For a design involving cluster sampling, the appropriate sample size depends on the so-called design effect, which is a function of the average cluster size and the intracluster correlation coefficient (ICC). It is well known that, under the framework of hierarchical and generalized linear models, a reduction in residual error may be achieved by including risk factors as covariates. In this paper we show that the covariate design, indicating whether the covariates are measured at the cluster level or at the within-cluster subject level, affects the estimation of the ICC, and hence the design effect. Therefore, the distinction between these two types of covariates should be made at the design stage. We use the nested-bootstrap method to assess the accuracy of the estimated ICC for continuous and binary response variables under different covariate structures. The codes of two SAS macros are made available by the authors for interested readers to facilitate the construction of confidence intervals for the ICC. Moreover, using Monte Carlo simulations we evaluate the relative efficiency of the estimators and the accuracy of the coverage probabilities of a 95% confidence interval on the population ICC. The methodology is illustrated using a published data set of blood pressure measurements taken on family members. PMID:23871746
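
    For reference, the design effect mentioned here takes the standard form (in LaTeX notation)

        \mathrm{DEFF} = 1 + (\bar{m} - 1)\,\rho,

    where \bar{m} is the average cluster size and \rho is the ICC, so the sample size required under simple random sampling is inflated by this factor under cluster sampling.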

  17. Validity of a Residualized Dependent Variable after Pretest Covariance Adjustments: Still the Same Variable?

    ERIC Educational Resources Information Center

    Nimon, Kim; Henson, Robin K.

    2015-01-01

    The authors empirically examined whether the validity of a residualized dependent variable after covariance adjustment is comparable to that of the original variable of interest. When variance of a dependent variable is removed as a result of one or more covariates, the residual variance may not reflect the same meaning. Using the pretest-posttest…

  18. Covariate Adjustment Strategy Increases Power in the Randomized Controlled Trial With Discrete-Time Survival Endpoints

    ERIC Educational Resources Information Center

    Safarkhani, Maryam; Moerbeek, Mirjam

    2013-01-01

    In a randomized controlled trial, a decision needs to be made about the total number of subjects for adequate statistical power. One way to increase the power of a trial is by including a predictive covariate in the model. In this article, the effects of various covariate adjustment strategies on increasing the power is studied for discrete-time…

  19. Adjusting for matching and covariates in linear discriminant analysis

    PubMed Central

    Asafu-Adjei, Josephine K.; Sampson, Allan R.; Sweet, Robert A.; Lewis, David A.

    2013-01-01

    In studies that compare several diagnostic or treatment groups, subjects may not only be measured on a certain set of feature variables, but also be matched on a number of demographic characteristics and measured on additional covariates. Linear discriminant analysis (LDA) is sometimes used to identify which feature variables best discriminate among groups, while accounting for the dependencies among the feature variables. We present a new approach to LDA for multivariate normal data that accounts for the subject matching used in a particular study design, as well as covariates not used in the matching. Applications are given for post-mortem tissue data with the aim of comparing neurobiological characteristics of subjects with schizophrenia with those of normal controls, and for a post-mortem tissue primate study comparing brain biomarker measurements across three treatment groups. We also investigate the performance of our approach using a simulation study. PMID:23640791

  20. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study

    PubMed Central

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

    Background. The univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to a loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least squares (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches, including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC), on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard model coefficients in a simulation study. Results. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches under all of the above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest the use of the MMC procedure to overcome the lack of information needed for a complete covariance matrix of the coefficients. PMID:26413142

  21. Slope Estimation of Covariates that Influence Renal Outcome following Renal Transplant Adjusting for Informative Right Censoring

    PubMed Central

    Jaffa, Miran A.; Jaffa, Ayad A; Lipsitz, Stuart R.

    2015-01-01

    A new statistical model is proposed to estimate population and individual slopes that are adjusted for covariates and informative right censoring. Individual slopes are assumed to have a mean that depends on the population slope for the covariates. The number of observations for each individual is modeled as a truncated discrete distribution with mean dependent on the individual subject's slope. Our simulation study results indicated that the associated bias and mean squared errors for the proposed model were comparable to those associated with the model that only adjusts for informative right censoring. The proposed model was illustrated using a renal transplant data set to estimate population slopes for covariates that could impact the outcome of renal function following renal transplantation. PMID:25729124

  22. On the Importance of Reliable Covariate Measurement in Selection Bias Adjustments Using Propensity Scores

    ERIC Educational Resources Information Center

    Steiner, Peter M.; Cook, Thomas D.; Shadish, William R.

    2011-01-01

    The effect of unreliability of measurement on propensity score (PS) adjusted treatment effects has not been previously studied. The authors report on a study simulating different degrees of unreliability in the multiple covariates that were used to estimate the PS. The simulation uses the same data as two prior studies. Shadish, Clark, and Steiner…

  23. Taking correlations in GPS least squares adjustments into account with a diagonal covariance matrix

    NASA Astrophysics Data System (ADS)

    Kermarrec, Gaël; Schön, Steffen

    2016-05-01

    Based on the results of Luati and Proietti (Ann Inst Stat Math 63:673-686, 2011) on an equivalence, for a certain class of polynomial regressions, between the diagonally weighted least squares (DWLS) and the generalized least squares (GLS) estimator, an alternative way to take correlations into account by means of a diagonal covariance matrix is presented. The equivalent covariance matrix is much easier to compute than a diagonalization of the covariance matrix via eigenvalue decomposition, which also implies a change of the least squares equations. This condensed matrix, for use in the least squares adjustment, can be seen as a diagonal or reduced version of the original matrix, its elements being simply the sums of the row elements of the weighting matrix. The least squares results obtained with the equivalent diagonal matrices and those given by the fully populated covariance matrix are mathematically strictly equivalent for the mean estimator in terms of the estimate and its a priori cofactor matrix. It is shown that this equivalence can be empirically extended to further classes of design matrices such as those used in GPS positioning (single point positioning, precise point positioning or relative positioning with double differences). Applying this new model to simulated time series of correlated observations, a significant reduction of the coordinate differences compared with the solutions computed with the commonly used diagonal elevation-dependent model was reached for the GPS relative positioning with double differences, single point positioning as well as precise point positioning cases. The estimate differences between the equivalent and classical model with fully populated covariance matrix were below the mm for all simulated GPS cases and below the sub-mm for the relative positioning with double differences. These results were confirmed by analyzing real data. Consequently, the equivalent diagonal covariance matrices, compared with the often used elevation
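
    The claimed equivalence for the mean estimator is easy to verify numerically. A minimal Python sketch, assuming a symmetric positive-definite weight matrix W (the inverse of the fully populated covariance matrix): the condensed weights are the row sums of W, and the diagonally weighted mean then matches the GLS mean exactly.

        import numpy as np

        rng = np.random.default_rng(4)
        n = 50
        A = rng.normal(size=(n, n))
        C = A @ A.T + n * np.eye(n)     # fully populated covariance matrix
        W = np.linalg.inv(C)            # corresponding weight matrix
        y = rng.normal(size=n)          # observations
        ones = np.ones(n)               # design matrix of the mean estimator

        d = W.sum(axis=1)               # row sums: the equivalent diagonal weights
        gls = (ones @ W @ y) / (ones @ W @ ones)
        dwls = (d @ y) / d.sum()
        print(np.isclose(gls, dwls))    # True: strictly equivalent for the mean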

  24. A consistent local linear estimator of the covariate adjusted correlation coefficient

    PubMed Central

    Nguyen, Danh V.; Şentürk, Damla

    2009-01-01

    Consider the correlation between two random variables (X, Y), both not directly observed. One only observes X̃ = φ1(U)X + φ2(U) and Ỹ = ψ1(U)Y + ψ2(U), where all four functions {φl(·),ψl(·), l = 1, 2} are unknown/unspecified smooth functions of an observable covariate U. We consider consistent estimation of the correlation between the unobserved variables X and Y, adjusted for the above general dual additive and multiplicative effects of U, based on the observed data (X̃, Ỹ, U). PMID:21720454

  25. Covariate-adjusted response-adaptive designs for longitudinal treatment responses: PEMF trial revisited.

    PubMed

    Biswas, Atanu; Park, Eunsik; Bhattacharya, Rahul

    2012-08-01

    Response-adaptive designs have become popular for allocation of the entering patients among two or more competing treatments in a phase III clinical trial. Although there are many designs for binary treatment responses, the number of designs involving covariates is very small. Sometimes the patients give repeated responses. The only available response-adaptive allocation design for repeated binary responses is the urn design by Biswas and Dewanji [Biswas A and Dewanji A. A randomized longitudinal play-the-winner design for repeated binary data. ANZJS 2004; 46: 675-684; Biswas A and Dewanji A. Inference for a RPW-type clinical trial with repeated monitoring for the treatment of rheumatoid arthritis. Biometr J 2004; 46: 769-779.], although it does not take care of the covariates of the patients in the allocation design. In this article, a covariate-adjusted response-adaptive randomisation procedure is developed using the log-odds ratio within the Bayesian framework for longitudinal binary responses. The small sample performance of the proposed allocation procedure is assessed through a simulation study. The proposed procedure is illustrated using a real data set. PMID:20974667

  26. Comparison of covariate adjustment methods using space-time scan statistics for food animal syndromic surveillance

    PubMed Central

    2013-01-01

    Background Abattoir condemnation data show promise as a rich source of data for syndromic surveillance of both animal and zoonotic diseases. However, inherent characteristics of abattoir condemnation data can bias results from space-time cluster detection methods for disease surveillance, and may need to be accounted for using various adjustment methods. The objective of this study was to compare the space-time scan statistics with different abilities to control for covariates and to assess their suitability for food animal syndromic surveillance. Four space-time scan statistic models were used including: animal class adjusted Poisson, space-time permutation, multi-level model adjusted Poisson, and a weighted normal scan statistic using model residuals. The scan statistics were applied to monthly bovine pneumonic lung and “parasitic liver” condemnation data from Ontario provincial abattoirs from 2001–2007. Results The number and space-time characteristics of identified clusters often varied between space-time scan tests for both “parasitic liver” and pneumonic lung condemnation data. While there were some similarities between isolated clusters in space, time and/or space-time, overall the results from space-time scan statistics differed substantially depending on the covariate adjustment approach used. Conclusions Variability in results among methods suggests that caution should be used in selecting space-time scan methods for abattoir surveillance. Furthermore, validation of different approaches with simulated or real outbreaks is required before conclusive decisions can be made concerning the best approach for conducting surveillance with these data. PMID:24246040

  27. A nonparametric stochastic method for generating daily climate-adjusted streamflows

    NASA Astrophysics Data System (ADS)

    Stagge, J. H.; Moglen, G. E.

    2013-10-01

    A daily stochastic streamflow generation model is presented, which successfully replicates statistics of the historical streamflow record and can produce climate-adjusted daily time series. A monthly climate model relates general circulation model (GCM)-scale climate indicators to discrete climate-streamflow states, which in turn control parameters in a daily streamflow generation model. Daily flow is generated by a two-state (increasing/decreasing) Markov chain, with rising limb increments randomly sampled from a Weibull distribution and the falling limb modeled as exponential recession. When applied to the Potomac River, a 38,000 km2 basin in the Mid-Atlantic United States, the model reproduces the daily, monthly, and annual distribution and dynamics of the historical streamflow record, including extreme low flows. This method can be used as part of water resources planning, vulnerability, and adaptation studies and offers the advantage of a parsimonious model, requiring only a sufficiently long historical streamflow record and large-scale climate data. Simulation of Potomac streamflows subject to the Special Report on Emissions Scenarios (SRES) A1b, A2, and B1 emission scenarios predict a slight increase in mean annual flows over the next century, with the majority of this increase occurring during the winter and early spring. Conversely, mean summer flows are projected to decrease due to climate change, caused by a shift to shorter, more sporadic rain events. Date of the minimum annual flow is projected to shift 2-5 days earlier by the 2070-2099 period.
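
    The daily generator itself is compact. The Python sketch below uses made-up transition probabilities and Weibull parameters; in the paper these quantities are conditioned on the monthly climate-streamflow states.

        import numpy as np

        rng = np.random.default_rng(5)

        def generate_flow(n_days, q0=100.0, p_rise_after_fall=0.3,
                          p_rise_after_rise=0.5, weib_shape=0.8,
                          weib_scale=20.0, recession=0.95):
            """Two-state (rising/falling) Markov chain for daily streamflow."""
            q, rising = [q0], False
            for _ in range(n_days - 1):
                p = p_rise_after_rise if rising else p_rise_after_fall
                rising = rng.random() < p
                if rising:   # rising limb: random Weibull increment
                    q.append(q[-1] + weib_scale * rng.weibull(weib_shape))
                else:        # falling limb: exponential recession
                    q.append(q[-1] * recession)
            return np.array(q)

        flow = generate_flow(365)
        print(flow.min(), flow.mean(), flow.max())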

  28. Directional Variance Adjustment: Bias Reduction in Covariance Matrices Based on Factor Analysis with an Application to Portfolio Optimization

    PubMed Central

    Bartz, Daniel; Hatrick, Kerr; Hesse, Christian W.; Müller, Klaus-Robert; Lemm, Steven

    2013-01-01

    Robust and reliable covariance estimates play a decisive role in financial and many other applications. An important class of estimators is based on factor models. Here, we show by extensive Monte Carlo simulations that covariance matrices derived from the statistical Factor Analysis model exhibit a systematic error, which is similar to the well-known systematic error of the spectrum of the sample covariance matrix. Moreover, we introduce the Directional Variance Adjustment (DVA) algorithm, which diminishes the systematic error. In a thorough empirical study for the US, European, and Hong Kong stock market we show that our proposed method leads to improved portfolio allocation. PMID:23844016

  29. A Simulation Study on the Performance of the Simple Difference and Covariance-Adjusted Scores in Randomized Experimental Designs

    ERIC Educational Resources Information Center

    Petscher, Yaacov; Schatschneider, Christopher

    2011-01-01

    Research by Huck and McLean (1975) demonstrated that the covariance-adjusted score is more powerful than the simple difference score, yet recent reviews indicate researchers are equally likely to use either score type in two-wave randomized experimental designs. A Monte Carlo simulation was conducted to examine the conditions under which the…

  30. Use and Impact of Covariance Data in the Japanese Latest Adjusted Library ADJ2010 Based on JENDL-4.0

    SciTech Connect

    Yokoyama, K.; Ishikawa, M.

    2015-01-15

    The current status of covariance applications to fast reactor analysis and design in Japan is summarized. In Japan, the covariance data are mainly used for three purposes: (1) to quantify the uncertainty of nuclear core parameters, (2) to identify important nuclides, reactions and energy ranges which are dominant to the uncertainty of core parameters, and (3) to improve the accuracy of core design values by adopting integral data such as the critical experiments and the power reactor operation data. For the last purpose, cross section adjustment based on the Bayesian theorem is used. After the release of JENDL-4.0, a development project for the new adjusted group-constant set ADJ2010 was started in 2010 and completed in 2013. In the present paper, the final results of ADJ2010 are briefly summarized. In addition, the adjustment results of ADJ2010 are discussed from the viewpoint of the use and impact of nuclear data covariances, focusing on ²³⁹Pu capture cross section alterations. For this purpose three kinds of indices, called “degree of mobility,” “adjustment motive force,” and “adjustment potential,” are proposed.

  31. A-Priori and A-Posteriori Covariance Data in Nuclear Cross Section Adjustments: Issues and Challenges

    SciTech Connect

    Palmiotti, Giuseppe; Salvatores, Massimo; Aliberti, G.

    2015-01-01

    In order to provide useful feedback to evaluators, a set of criteria is established for assessing the robustness and reliability of the cross section adjustments that make use of integral experiment information. Criteria are also provided for accepting the “a posteriori” cross sections, both as new “nominal” values and as “trends”. Some indications on the use of the “a posteriori” covariance matrix are given, even though more investigation is needed to settle this complex subject.

  32. Effects of Participation in a Post-Secondary Honors Program with Covariate Adjustment Using Propensity Score

    ERIC Educational Resources Information Center

    Furtwengler, Scott R.

    2015-01-01

    The present study sought to determine the extent to which participation in a post-secondary honors program affected academic achievement. Archival data were collected on three cohorts of high-achieving students at a large public university. Propensity scores were calculated on factors predicting participation in honors and used as the covariate…

  33. Adjusting head circumference for covariates in autism: clinical correlates of a highly heritable continuous trait

    PubMed Central

    Chaste, Pauline; Klei, Lambertus; Sanders, Stephan J.; Murtha, Michael T.; Hus, Vanessa; Lowe, Jennifer K.; Willsey, A. Jeremy; Moreno-De-Luca, Daniel; Yu, Timothy W.; Fombonne, Eric; Geschwind, Daniel; Grice, Dorothy E.; Ledbetter, David H.; Lord, Catherine; Mane, Shrikant M.; Martin, Christa Lese; Martin, Donna M.; Morrow, Eric M.; Walsh, Christopher A.; Sutcliffe, James S.; State, Matthew W.; Devlin, Bernie; Cook, Edwin H.; Kim, Soo-Jeong

    2013-01-01

    BACKGROUND Brain development follows a different trajectory in children with Autism Spectrum Disorders (ASD) than in typically developing children. A proxy for neurodevelopment could be head circumference (HC), but studies assessing HC and its clinical correlates in ASD have been inconsistent. This study investigates HC and clinical correlates in the Simons Simplex Collection cohort. METHODS We used a mixed linear model to estimate effects of covariates and the deviation from the expected HC given parental HC (genetic deviation). After excluding individuals with incomplete data, 7225 individuals in 1891 families remained for analysis. We examined the relationship between HC/genetic deviation of HC and clinical parameters. RESULTS Gender, age, height, weight, genetic ancestry and ASD status were significant predictors of HC (estimate of the ASD effect = 0.2 cm). HC was approximately normally distributed in probands and unaffected relatives, with only a few outliers. Genetic deviation of HC was also normally distributed, consistent with a random sampling of parental genes. Whereas larger HC than expected was associated with ASD symptom severity and regression, IQ decreased with the absolute value of the genetic deviation of HC. CONCLUSIONS Measured against expected values derived from covariates of ASD subjects, statistical outliers for HC were uncommon. HC is a strongly heritable trait and population norms for HC would be far more accurate if covariates including genetic ancestry, height and age were taken into account. The association of diminishing IQ with absolute deviation from predicted HC values suggests HC could reflect subtle underlying brain development and warrants further investigation. PMID:23746936

  34. Covariate adjustment of cumulative incidence functions for competing risks data using inverse probability of treatment weighting.

    PubMed

    Neumann, Anke; Billionnet, Cécile

    2016-06-01

    In observational studies without random assignment of the treatment, the unadjusted comparison between treatment groups may be misleading due to confounding. One method to adjust for measured confounders is inverse probability of treatment weighting. This method can also be used in the analysis of time to event data with competing risks. Competing risks arise if for some individuals the event of interest is precluded by a different type of event occurring before, or if only the earliest of several times to event, corresponding to different event types, is observed or is of interest. In the presence of competing risks, time to event data are often characterized by cumulative incidence functions, one for each event type of interest. We describe the use of inverse probability of treatment weighting to create adjusted cumulative incidence functions. This method is equivalent to direct standardization when the weight model is saturated. No assumptions about the form of the cumulative incidence functions are required. The method allows studying associations between treatment and the different types of event under study, while focusing on the earliest event only. We present a SAS macro implementing this method and we provide a worked example. PMID:27084321
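
    The paper supplies a SAS macro; the weighting step and a weighted cumulative incidence estimate can also be sketched directly. Below is a minimal Python sketch on simulated data, in weighted Aalen-Johansen form and ignoring ties between event times; all variable names are assumptions.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(6)
        n = 1000
        x = rng.normal(size=n)                        # measured confounder
        z = rng.binomial(1, 1 / (1 + np.exp(-x)))     # non-randomized treatment
        t = rng.exponential(np.exp(0.5 * z - 0.3 * x))
        cause = rng.integers(1, 3, size=n)            # competing event types 1 and 2
        cause[rng.random(n) < 0.2] = 0                # 0 marks censoring

        ps = sm.Logit(z, sm.add_constant(x)).fit(disp=0).predict()
        w = z / ps + (1 - z) / (1 - ps)               # IPT weights

        def weighted_cif(t, cause, w, k, grid):
            """IPT-weighted cumulative incidence for event type k."""
            order = np.argsort(t)
            t, cause, w = t[order], cause[order], w[order]
            at_risk = w[::-1].cumsum()[::-1]          # weighted size of the risk set
            h_any = w * (cause > 0) / at_risk         # all-cause hazard contribution
            surv = np.concatenate([[1.0], np.cumprod(1 - h_any)[:-1]])
            jumps = surv * w * (cause == k) / at_risk
            return np.array([jumps[t <= s].sum() for s in grid])

        grid = np.linspace(0.0, 3.0, 7)
        m = z == 1                                    # adjusted CIF in the treated arm
        print(weighted_cif(t[m], cause[m], w[m], k=1, grid=grid))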

  35. Sequential BART for imputation of missing covariates.

    PubMed

    Xu, Dandan; Daniels, Michael J; Winterstein, Almut G

    2016-07-01

    To conduct comparative effectiveness research using electronic health records (EHR), many covariates are typically needed to adjust for selection and confounding biases. Unfortunately, it is typical to have missingness in these covariates. Just using cases with complete covariates will result in considerable efficiency losses and likely bias. Here, we consider covariates that are missing at random, with the missing data mechanism either depending on the response or not. Standard methods for multiple imputation can either fail to capture nonlinear relationships or suffer from incompatibility and uncongeniality issues. We explore a flexible Bayesian nonparametric approach to impute the missing covariates, which involves factoring the joint distribution of the covariates with missingness into a set of sequential conditionals and applying Bayesian additive regression trees to model each of these univariate conditionals. Using data augmentation, the posterior for each conditional can be sampled simultaneously. We provide details on the computational algorithm and make comparisons to other methods, including parametric sequential imputation and two versions of multiple imputation by chained equations. We illustrate the proposed approach on EHR data from an affiliated tertiary care institution to examine factors related to hyperglycemia. PMID:26980459

  36. Nonparametric identification experiment

    NASA Technical Reports Server (NTRS)

    Yam, Yeung

    1988-01-01

    The following constitutes a summary of this paper: on-orbit identification methodology starts with nonparametric techniques for a priori system identification; development of the nonparametric identification and model determination experiment software has been completed; the validation experiments to be performed on the JPL Control and Identification Technology Validation Laboratory have been designed.

  37. Nonparametric Methods in Reliability

    PubMed Central

    Hollander, Myles; Peña, Edsel A.

    2005-01-01

    Probabilistic and statistical models for the occurrence of a recurrent event over time are described. These models have applicability in the reliability, engineering, biomedical and other areas where a series of events occurs for an experimental unit as time progresses. Nonparametric inference methods, in particular, the estimation of a relevant distribution function, are described. PMID:16710444

  38. Covariant Transform

    NASA Astrophysics Data System (ADS)

    Kisil, Vladimir V.

    2011-03-01

    Dedicated to the memory of Cora Sadosky. The paper develops the theory of the covariant transform, which is inspired by the wavelet construction. It was observed that many interesting types of wavelets (or coherent states) arise from group representations which are not square integrable, or from vacuum vectors which are not admissible. The covariant transform extends the applicability of the popular wavelet construction to classic examples like the Hardy space H2, Banach spaces, covariant functional calculus and many others.

  39. Nonparametric conditional estimation

    SciTech Connect

    Owen, A.B.

    1987-01-01

    Many nonparametric regression techniques (such as kernels, nearest neighbors, and smoothing splines) estimate the conditional mean of Y given X = x by a weighted sum of observed Y values, where observations with X values near x tend to have larger weights. In this report the weights are taken to represent a finite signed measure on the space of Y values. This measure is studied as an estimate of the conditional distribution of Y given X = x. From estimates of the conditional distribution, estimates of conditional means, standard deviations, quantiles and other statistical functionals may be computed. Chapter 1 illustrates the computation of conditional quantiles and conditional survival probabilities on the Stanford Heart Transplant data. Chapter 2 contains a survey of nonparametric regression methods and introduces statistical metrics and von Mises' method for later use. Chapter 3 proves some consistency results. Chapter 4 provides conditions under which the suitably normalized errors in estimating the conditional distribution of Y have a Brownian limit. Using von Mises' method, asymptotic normality is obtained for nonparametric conditional estimates of compactly differentiable statistical functionals.

  40. Identifying Genetic Variants for Addiction via Propensity Score Adjusted Generalized Kendall’s Tau

    PubMed Central

    Jiang, Yuan; Li, Ni; Zhang, Heping

    2014-01-01

    Identifying replicable genetic variants for addiction has been extremely challenging. Besides the common difficulties with genome-wide association studies (GWAS), environmental factors are known to be critical to addiction, and comorbidity is widely observed. Despite the importance of environmental factors and comorbidity for addiction study, few GWAS analyses adequately considered them due to the limitations of the existing statistical methods. Although parametric methods have been developed to adjust for covariates in association analysis, difficulties arise when the traits are multivariate because there is no ready-to-use model for them. Recent nonparametric development includes U-statistics to measure the phenotype-genotype association weighted by a similarity score of covariates. However, it is not clear how to optimize the similarity score. Therefore, we propose a semiparametric method to measure the association adjusted by covariates. In our approach, the nonparametric U-statistic is adjusted by parametric estimates of propensity scores using the idea of inverse probability weighting. The new measurement is shown to be asymptotically unbiased under our null hypothesis while the previous non-weighted and weighted ones are not. Simulation results show that our test improves power as opposed to the non-weighted and two other weighted U-statistic methods, and it is particularly powerful for detecting gene-environment interactions. Finally, we apply our proposed test to the Study of Addiction: Genetics and Environment (SAGE) to identify genetic variants for addiction. Novel genetic variants are found from our analysis, which warrant further investigation in the future. PMID:25382885
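
    The weighting idea can be shown with a toy example: a binary marker, one quantitative trait, and a Kendall-tau-type U-statistic in which each pair is weighted by estimated inverse propensities. This Python sketch illustrates the mechanism only, not the paper's multivariate statistic, and every name in it is an assumption.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(7)
        n = 300
        x = rng.normal(size=n)                        # environmental covariate
        g = rng.binomial(1, 1 / (1 + np.exp(-x)))     # marker associated with x
        y = 0.6 * x + rng.normal(size=n)              # trait with no direct marker effect

        p1 = sm.Logit(g, sm.add_constant(x)).fit(disp=0).predict()
        w = 1 / np.where(g == 1, p1, 1 - p1)          # inverse propensity weights

        i, j = np.triu_indices(n, k=1)                # all pairs i < j
        kernel = np.sign(g[i] - g[j]) * np.sign(y[i] - y[j])
        tau_unadj = kernel.mean()
        tau_w = (w[i] * w[j] * kernel).sum() / (w[i] * w[j]).sum()
        print(tau_unadj, tau_w)           # weighting shrinks the covariate-driven association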

  41. Effect modification by time-varying covariates.

    PubMed

    Robins, James M; Hernán, Miguel A; Rotnitzky, Andrea

    2007-11-01

    Marginal structural models (MSMs) allow estimation of effect modification by baseline covariates, but they are less useful for estimating effect modification by evolving time-varying covariates. Rather, structural nested models (SNMs) were specifically designed to estimate effect modification by time-varying covariates. In their paper, Petersen et al. (Am J Epidemiol 2007;166:985-993) describe history-adjusted MSMs as a generalized form of MSM and argue that history-adjusted MSMs allow a researcher to easily estimate effect modification by time-varying covariates. However, history-adjusted MSMs can result in logically incompatible parameter estimates and hence in contradictory substantive conclusions. Here the authors propose a more restrictive definition of history-adjusted MSMs than the one provided by Petersen et al. and compare the advantages and disadvantages of using history-adjusted MSMs, as opposed to SNMs, to examine effect modification by time-dependent covariates. PMID:17875581

  42. Nonparametric Bayes analysis of social science data

    NASA Astrophysics Data System (ADS)

    Kunihama, Tsuyoshi

    Social science data often contain complex characteristics that standard statistical methods fail to capture. Social surveys pose many questions to respondents, and the resulting variables are often of mixed scale. Each variable can follow a complex distribution outside parametric families, and associations among variables may have more complicated structures than standard linear dependence. Therefore, it is not straightforward to develop a statistical model that approximates these structures well. In addition, many social surveys have collected data over time, so dynamic dependence needs to be incorporated into the models. Also, it is standard to observe massive numbers of missing values in social science data. To address these challenging problems, this thesis develops flexible nonparametric Bayesian methods for the analysis of social science data. Chapter 1 briefly explains the backgrounds and motivations of the projects in the following chapters. Chapter 2 develops a nonparametric Bayesian model of temporal dependence in large sparse contingency tables, relying on a probabilistic factorization of the joint pmf. Chapter 3 proposes nonparametric Bayes inference on conditional independence, with conditional mutual information used as a measure of the strength of conditional dependence. Chapter 4 proposes a novel Bayesian density estimation method for social surveys with complex designs where there is a gap between sample and population; we correct for the bias by adjusting mixture weights in Bayesian mixture models. Chapter 5 develops a nonparametric model for mixed-scale longitudinal surveys, in which various types of variables can be induced through latent continuous variables and dynamic latent factors lead to flexibly time-varying associations among variables.

  43. Bayesian Nonparametric Models for Multiway Data Analysis.

    PubMed

    Xu, Zenglin; Yan, Feng; Qi, Yuan

    2015-02-01

    Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g., missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models for multiway data analysis. We name these models InfTucker. These new models essentially conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or t processes with nonlinear covariance functions. Moreover, on network data, our models reduce to nonparametric stochastic blockmodels and can be used to discover latent groups and predict missing interactions. To learn the models efficiently from data, we develop a variational inference technique and explore properties of the Kronecker product for computational efficiency. Compared with a classical variational implementation, this technique reduces both time and space complexities by several orders of magnitude. On real multiway and network data, our new models achieved significantly higher prediction accuracy than state-of-the-art tensor decomposition methods and blockmodels. PMID:26353255

  44. Marginally specified priors for non-parametric Bayesian estimation

    PubMed Central

    Kessler, David C.; Hoff, Peter D.; Dunson, David B.

    2014-01-01

    Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813

  45. Addiction Severity Index Recent and Lifetime Summary Indexes Based on Nonparametric Item Response Theory Methods

    ERIC Educational Resources Information Center

    Alterman, Arthur I.; Cacciola, John S.; Habing, Brian; Lynch, Kevin G.

    2007-01-01

    Baseline Addiction Severity Index (5th ed.; ASI-5) data of 2,142 substance abuse patients were analyzed with two nonparametric item response theory (NIRT) methods: Mokken scaling and conditional covariance techniques. Nine reliable and dimensionally homogeneous Recent Problem indexes emerged in the ASI-5's seven areas, including two each in the…

  46. Conditional Covariance-based Representation of Multidimensional Test Structure.

    ERIC Educational Resources Information Center

    Bolt, Daniel M.

    2001-01-01

    Presents a new nonparametric method for constructing a spatial representation of multidimensional test structure, the Conditional Covariance-based SCALing (CCSCAL) method. Describes an index to measure the accuracy of the representation. Uses simulation and real-life data analyses to show that the method provides a suitable approximation to…

  10. Nonparametric Independence Screening in Sparse Ultra-High Dimensional Varying Coefficient Models.

    PubMed

    Fan, Jianqing; Ma, Yunbei; Dai, Wei

    2014-01-01

    The varying-coefficient model is an important class of nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection arises. In this paper, we propose and investigate marginal nonparametric screening methods to screen variables in sparse ultra-high dimensional varying-coefficient models. The proposed nonparametric independence screening (NIS) selects variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable. The sure independent screening property is established under some mild technical conditions when the dimensionality is of nonpolynomial order, and the dimensionality reduction of NIS is quantified. To enhance the practical utility and finite sample performance, two data-driven iterative NIS methods are proposed for selecting thresholding parameters and variables: conditional permutation and greedy methods, resulting in Conditional-INIS and Greedy-INIS. The effectiveness and flexibility of the proposed methods are further illustrated by simulation studies and real data applications. PMID:25309009
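
    The ranking step of NIS can be illustrated with a minimal sketch: for each covariate, fit a marginal varying-coefficient model given the exposure and score the drop in residual sum of squares. A polynomial basis in the exposure stands in for the paper's spline bases, and the names nis_screen and the utility measure shown are hypothetical simplifications, not the Conditional-INIS/Greedy-INIS procedures.

    ```python
    import numpy as np

    def _rss(A, y):
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ beta
        return float(r @ r)

    def nis_screen(X, y, u, keep, degree=3):
        """Rank covariates by their marginal contribution given exposure u."""
        P = np.vander(u, degree + 1)           # basis for functions of u alone
        base = _rss(P, y)                      # exposure-only fit
        utility = np.array([base - _rss(np.hstack([P, P * X[:, [j]]]), y)
                            for j in range(X.shape[1])])
        return np.argsort(utility)[::-1][:keep]

    # Toy example: only covariate 0 enters, with a u-varying slope.
    rng = np.random.default_rng(0)
    n, p = 200, 50
    u = rng.uniform(0, 1, n)
    X = rng.standard_normal((n, p))
    y = np.sin(2 * np.pi * u) + (1 + u) * X[:, 0] + 0.1 * rng.standard_normal(n)
    print(nis_screen(X, y, u, keep=5))         # covariate 0 should rank first
    ```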

  11. Nonparametric Independence Screening in Sparse Ultra-High Dimensional Varying Coefficient Models

    PubMed Central

    Fan, Jianqing; Ma, Yunbei; Dai, Wei

    2014-01-01

    The varying-coefficient model is an important class of nonparametric statistical model that allows us to examine how the effects of covariates vary with exposure variables. When the number of covariates is large, the issue of variable selection arises. In this paper, we propose and investigate marginal nonparametric screening methods to screen variables in sparse ultra-high dimensional varying-coefficient models. The proposed nonparametric independence screening (NIS) selects variables by ranking a measure of the nonparametric marginal contributions of each covariate given the exposure variable. The sure independent screening property is established under some mild technical conditions when the dimensionality is of nonpolynomial order, and the dimensionality reduction of NIS is quantified. To enhance the practical utility and finite sample performance, two data-driven iterative NIS methods are proposed for selecting thresholding parameters and variables: conditional permutation and greedy methods, resulting in Conditional-INIS and Greedy-INIS. The effectiveness and flexibility of the proposed methods are further illustrated by simulation studies and real data applications. PMID:25309009

  12. Bayesian non-parametrics and the probabilistic approach to modelling

    PubMed Central

    Ghahramani, Zoubin

    2013-01-01

    Modelling is fundamental to many fields of science and engineering. A model can be thought of as a representation of possible data one could predict from a system. The probabilistic approach to modelling uses probability theory to express all aspects of uncertainty in the model. The probabilistic approach is synonymous with Bayesian modelling, which simply uses the rules of probability theory in order to make predictions, compare alternative models, and learn model parameters and structure from data. This simple and elegant framework is most powerful when coupled with flexible probabilistic models. Flexibility is achieved through the use of Bayesian non-parametrics. This article provides an overview of probabilistic modelling and an accessible survey of some of the main tools in Bayesian non-parametrics. The survey covers the use of Bayesian non-parametrics for modelling unknown functions, density estimation, clustering, time-series modelling, and representing sparsity, hierarchies, and covariance structure. More specifically, it gives brief non-technical overviews of Gaussian processes, Dirichlet processes, infinite hidden Markov models, Indian buffet processes, Kingman’s coalescent, Dirichlet diffusion trees and Wishart processes. PMID:23277609
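
    Among the tools the survey covers, the Gaussian process is the simplest to show end to end. The following is a minimal sketch of GP regression with a squared-exponential kernel, assuming 1-D inputs; the hyperparameter values are illustrative.

    ```python
    import numpy as np

    def gp_posterior(X, y, Xs, ell=0.2, sf=1.0, noise=0.1):
        """GP regression posterior mean/covariance, squared-exponential kernel."""
        def k(A, B):
            return sf**2 * np.exp(-0.5 * (A[:, None] - B[None, :])**2 / ell**2)
        K = k(X, X) + noise**2 * np.eye(len(X))
        L = np.linalg.cholesky(K)
        alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
        Ks = k(Xs, X)
        mean = Ks @ alpha                      # posterior mean at test inputs
        v = np.linalg.solve(L, Ks.T)
        cov = k(Xs, Xs) - v.T @ v              # posterior covariance
        return mean, cov

    X = np.linspace(0, 1, 10)
    y = np.sin(2 * np.pi * X)
    mean, cov = gp_posterior(X, y, np.linspace(0, 1, 50))
    ```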

  13. FINE: fisher information nonparametric embedding.

    PubMed

    Carter, Kevin M; Raich, Raviv; Finn, William G; Hero, Alfred O

    2009-11-01

    We consider the problems of clustering, classification, and visualization of high-dimensional data when no straightforward euclidean representation exists. In this paper, we propose using the properties of information geometry and statistical manifolds in order to define similarities between data sets using the Fisher information distance. We will show that this metric can be approximated using entirely nonparametric methods, as the parameterization and geometry of the manifold is generally unknown. Furthermore, by using multidimensional scaling methods, we are able to reconstruct the statistical manifold in a low-dimensional euclidean space; enabling effective learning on the data. As a whole, we refer to our framework as Fisher Information Nonparametric Embedding (FINE) and illustrate its uses on practical problems, including a biomedical application and document classification. PMID:19762935
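
    A rough sketch of the FINE pipeline under simplifying assumptions: densities are estimated by shared-bin histograms, the Fisher information distance is approximated by twice the Hellinger distance (which agrees with the Fisher metric locally), and the embedding uses classical MDS. Function names are hypothetical, and this is not the authors' exact approximation scheme.

    ```python
    import numpy as np

    def fine_embed(datasets, bins=20, dim=2):
        """Histogram densities -> approximate Fisher distances -> classical MDS."""
        lo = min(d.min() for d in datasets)
        hi = max(d.max() for d in datasets)
        edges = np.linspace(lo, hi, bins + 1)
        pmfs = [np.histogram(d, bins=edges)[0] / len(d) for d in datasets]
        n = len(pmfs)
        D = np.zeros((n, n))
        for i in range(n):
            for j in range(i + 1, n):
                # Hellinger distance; doubled, it approximates the Fisher distance.
                h = np.sqrt(0.5 * np.sum((np.sqrt(pmfs[i]) - np.sqrt(pmfs[j]))**2))
                D[i, j] = D[j, i] = 2.0 * h
        # Classical MDS on the squared-distance matrix.
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D**2) @ J
        w, V = np.linalg.eigh(B)
        idx = np.argsort(w)[::-1][:dim]
        return V[:, idx] * np.sqrt(np.maximum(w[idx], 0))

    # Embed Gaussian samples whose means drift; the embedding recovers the drift.
    rng = np.random.default_rng(1)
    coords = fine_embed([rng.normal(mu, 1.0, 500) for mu in np.linspace(0, 3, 8)])
    ```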

  14. Two general methods for population pharmacokinetic modeling: non-parametric adaptive grid and non-parametric Bayesian

    PubMed Central

    Neely, Michael; Bartroff, Jay; van Guilder, Michael; Yamada, Walter; Bayard, David; Jelliffe, Roger; Leary, Robert; Chubatiuk, Alyona; Schumitzky, Alan

    2013-01-01

    Population pharmacokinetic (PK) modeling methods can be statistically classified as either parametric or nonparametric (NP). Each classification can be divided into maximum likelihood (ML) or Bayesian (B) approaches. In this paper we discuss the nonparametric case using both maximum likelihood and Bayesian approaches. We present two nonparametric methods for estimating the unknown joint population distribution of model parameter values in a pharmacokinetic/pharmacodynamic (PK/PD) dataset. The first method is the NP Adaptive Grid (NPAG). The second is the NP Bayesian (NPB) algorithm with a stick-breaking process to construct a Dirichlet prior. Our objective is to compare the performance of these two methods using a simulated PK/PD dataset. Our results showed excellent performance of NPAG and NPB in a realistically simulated PK study. This simulation allowed us to have benchmarks in the form of the true population parameters to compare with the estimates produced by the two methods, while incorporating challenges like unbalanced sample times and sample numbers as well as the ability to include the covariate of patient weight. We conclude that both NPML and NPB can be used in realistic PK/PD population analysis problems. The advantages of one versus the other are discussed in the paper. NPAG and NPB are implemented in R and freely available for download within the Pmetrics package from www.lapk.org. PMID:23404393
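
    The Dirichlet prior used by NPB can be sketched via its truncated stick-breaking construction, as below. The base measure shown (log-normal "clearance-like" values) and all names are illustrative, not the paper's implementation.

    ```python
    import numpy as np

    def stick_breaking(alpha, n_atoms, base_sampler, rng):
        """Truncated stick-breaking draw from a Dirichlet process prior."""
        betas = rng.beta(1.0, alpha, n_atoms)
        remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
        weights = betas * remaining
        atoms = base_sampler(n_atoms)          # draws from the base measure
        return atoms, weights

    rng = np.random.default_rng(0)
    # Illustrative base measure: log-normal "clearance-like" parameter values.
    atoms, w = stick_breaking(2.0, 100, lambda k: rng.lognormal(0.0, 0.5, k), rng)
    print("truncated weights sum to ~1:", w.sum())
    ```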

  15. Statistical sirens: the allure of nonparametrics

    USGS Publications Warehouse

    Johnson, D.H.

    1995-01-01

    Although nonparametric statistical methods have a role to play in the analysis of data, often their virtues are overstated and their deficiencies overlooked. A recent Special Feature in Ecology advocated nonparametric methods because of an erroneously stated advantage that they require no assumptions regarding the distribution underlying the observations. The present paper points out some often-ignored features of nonparametric tests comparing two means, and advocates parameter estimation as a preferred alternative to hypothesis testing in many situations.

  16. A Survey of Non-Exchangeable Priors for Bayesian Nonparametric Models.

    PubMed

    Foti, Nicholas J; Williamson, Sinead A

    2015-02-01

    Dependent nonparametric processes extend distributions over measures, such as the Dirichlet process and the beta process, to give distributions over collections of measures, typically indexed by values in some covariate space. Such models are appropriate priors when exchangeability assumptions do not hold, and instead we want our model to vary fluidly with some set of covariates. Since the concept of dependent nonparametric processes was formalized by MacEachern, there have been a number of models proposed and used in the statistics and machine learning literatures. Many of these models exhibit underlying similarities, an understanding of which, we hope, will help in selecting an appropriate prior, developing new models, and leveraging inference techniques. PMID:26353247

  17. Multiatlas segmentation as nonparametric regression.

    PubMed

    Awate, Suyash P; Whitaker, Ross T

    2014-09-01

    This paper proposes a novel theoretical framework to model and analyze the statistical characteristics of a wide range of segmentation methods that incorporate a database of label maps or atlases; such methods are termed as label fusion or multiatlas segmentation. We model these multiatlas segmentation problems as nonparametric regression problems in the high-dimensional space of image patches. We analyze the nonparametric estimator's convergence behavior that characterizes expected segmentation error as a function of the size of the multiatlas database. We show that this error has an analytic form involving several parameters that are fundamental to the specific segmentation problem (determined by the chosen anatomical structure, imaging modality, registration algorithm, and label-fusion algorithm). We describe how to estimate these parameters and show that several human anatomical structures exhibit the trends modeled analytically. We use these parameter estimates to optimize the regression estimator. We show that the expected error for large database sizes is well predicted by models learned on small databases. Thus, a few expert segmentations can help predict the database sizes required to keep the expected error below a specified tolerance level. Such cost-benefit analysis is crucial for deploying clinical multiatlas segmentation systems. PMID:24802528
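
    Viewing label fusion as nonparametric regression can be made concrete with a Nadaraya-Watson sketch in patch space: atlas patches vote for the target label with kernel weights based on intensity distance. The patch size, bandwidth h, and function names are illustrative, not the paper's estimator.

    ```python
    import numpy as np

    def label_fusion(target_patch, atlas_patches, atlas_labels, h=1.0):
        """Nadaraya-Watson label fusion: kernel-weighted vote in patch space."""
        d2 = np.sum((atlas_patches - target_patch)**2, axis=1)
        wts = np.exp(-0.5 * d2 / h**2)
        return float(np.sum(wts * atlas_labels) / np.sum(wts)) > 0.5

    # Toy patches: 3x3x3 neighbourhoods flattened to length-27 vectors.
    rng = np.random.default_rng(0)
    atlas_patches = rng.standard_normal((50, 27))
    atlas_labels = (atlas_patches.mean(axis=1) > 0).astype(float)
    target = atlas_patches[0] + 0.1 * rng.standard_normal(27)
    print(label_fusion(target, atlas_patches, atlas_labels))
    ```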

  18. Robust location and spread measures for nonparametric probability density function estimation.

    PubMed

    López-Rubio, Ezequiel

    2009-10-01

    Robustness against outliers is a desirable property of any unsupervised learning scheme. In particular, probability density estimators benefit from incorporating this feature. A possible strategy to achieve this goal is to substitute the sample mean and the sample covariance matrix by more robust location and spread estimators. Here we use the L1-median to develop a nonparametric probability density function (PDF) estimator. We prove its most relevant properties, and we show its performance in density estimation and classification applications. PMID:19885963
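
    A minimal sketch of the L1-median via the classical Weiszfeld iteration, the robust location estimate the paper builds on; the iteration counts and tolerances are illustrative.

    ```python
    import numpy as np

    def l1_median(X, n_iter=100, eps=1e-8):
        """Weiszfeld iteration for the L1-median (geometric median)."""
        m = X.mean(axis=0)
        for _ in range(n_iter):
            d = np.maximum(np.linalg.norm(X - m, axis=1), eps)
            w = 1.0 / d
            m_new = (w[:, None] * X).sum(axis=0) / w.sum()
            if np.linalg.norm(m_new - m) < eps:
                break
            m = m_new
        return m

    # Outliers barely move the L1-median, unlike the sample mean.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(50, 1, (5, 2))])
    print("mean:", X.mean(axis=0), "L1-median:", l1_median(X))
    ```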

  19. Computerized Adaptive Testing under Nonparametric IRT Models

    ERIC Educational Resources Information Center

    Xu, Xueli; Douglas, Jeff

    2006-01-01

    Nonparametric item response models have been developed as alternatives to the relatively inflexible parametric item response models. An open question is whether it is possible and practical to administer computerized adaptive testing with nonparametric models. This paper explores the possibility of computerized adaptive testing when using…

  20. Non-Parametric Collision Probability for Low-Velocity Encounters

    NASA Technical Reports Server (NTRS)

    Carpenter, J. Russell

    2007-01-01

    An implicit, but not necessarily obvious, assumption in all of the current techniques for assessing satellite collision probability is that the relative position uncertainty is perfectly correlated in time. If there is any mis-modeling of the dynamics in the propagation of the relative position error covariance matrix, time-wise de-correlation of the uncertainty will increase the probability of collision over a given time interval. The paper gives some examples that illustrate this point. This paper argues that, for the present, Monte Carlo analysis is the best available tool for handling low-velocity encounters, and suggests some techniques for addressing the issues just described. One proposal is for the use of a non-parametric technique that is widely used in actuarial and medical studies. The other suggestion is that accurate process noise models be used in the Monte Carlo trials to which the non-parametric estimate is applied. A further contribution of this paper is a description of how the time-wise decorrelation of uncertainty increases the probability of collision.

  1. Auto covariance computer

    NASA Technical Reports Server (NTRS)

    Hepner, T. E.; Meyers, J. F. (Inventor)

    1985-01-01

    A laser velocimeter covariance processor which calculates the auto covariance and cross covariance functions for a turbulent flow field based on Poisson sampled measurements in time from a laser velocimeter is described. The device will process a block of data that is up to 4096 data points in length and return a 512 point covariance function with 48-bit resolution along with a 512 point histogram of the interarrival times which is used to normalize the covariance function. The device is designed to interface and be controlled by a minicomputer from which the data is received and the results returned. A typical 4096 point computation takes approximately 1.5 seconds to receive the data, compute the covariance function, and return the results to the computer.
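
    A software analogue of the processor's computation is the "slotting" estimator for randomly sampled data: lagged products are accumulated into lag bins and normalized by the histogram of interarrival counts, mirroring the covariance/histogram pair the device returns. The sketch below assumes Poisson-like sample times; the function name and slot counts are illustrative.

    ```python
    import numpy as np

    def slotted_autocovariance(t, v, max_lag, n_slots):
        """Accumulate lagged products into lag bins; normalize by bin counts."""
        v = v - v.mean()
        cov = np.zeros(n_slots)
        counts = np.zeros(n_slots, dtype=int)
        width = max_lag / n_slots
        for i in range(len(t)):
            lags = t[i + 1:] - t[i]
            keep = lags < max_lag
            slots = (lags[keep] / width).astype(int)
            np.add.at(cov, slots, v[i] * v[i + 1:][keep])
            np.add.at(counts, slots, 1)
        return cov / np.maximum(counts, 1), counts

    rng = np.random.default_rng(0)
    t = np.sort(rng.uniform(0, 100, 2000))    # random (Poisson-like) sample times
    v = np.sin(0.5 * t) + 0.2 * rng.standard_normal(t.size)
    acov, hist = slotted_autocovariance(t, v, max_lag=20.0, n_slots=64)
    ```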

  2. Nonparametric Methods in Molecular Biology

    PubMed Central

    Wittkowski, Knut M.; Song, Tingting

    2010-01-01

    In 2003, the completion of the Human Genome Project[1] together with advances in computational resources[2] were expected to launch an era where the genetic and genomic contributions to many common diseases would be found. In the years following, however, researchers became increasingly frustrated as most reported ‘findings’ could not be replicated in independent studies[3]. To improve the signal/noise ratio, it was suggested to increase the number of cases to be included to tens of thousands[4], a requirement that would dramatically restrict the scope of personalized medicine. Similarly, there was little success in elucidating the gene–gene interactions involved in complex diseases or even in developing criteria for assessing their phenotypes. As a partial solution to these enigmata, we here introduce a class of statistical methods as the ‘missing link’ between advances in genetics and informatics. As a first step, we provide a unifying view of a plethora of non-parametric tests developed mainly in the 1940s, all of which can be expressed as u-statistics. Then, we will extend this approach to reflect categorical and ordinal relationships between variables, resulting in a flexible and powerful approach to deal with the impact of (1) multi-allelic genetic loci, (2) poly-locus genetic regions, and (3) oligo-genetic and oligo-genomic collaborative interactions on complex phenotypes. PMID:20652502
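
    As a concrete instance of the u-statistic view, the Mann-Whitney statistic is simply the average of a pairwise kernel over all sample pairs; a minimal sketch:

    ```python
    import numpy as np

    def mann_whitney_u(x, y):
        """Mann-Whitney as a u-statistic: mean of a kernel over all pairs."""
        diff = x[:, None] - y[None, :]
        h = (diff > 0) + 0.5 * (diff == 0)    # kernel h(xi, yj)
        return h.mean()

    rng = np.random.default_rng(0)
    print(mann_whitney_u(rng.normal(0.5, 1, 40), rng.normal(0.0, 1, 40)))
    ```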

  3. An Empirical Investigation of Four Tests for Interaction in the Context of Factorial Analysis of Covariance.

    ERIC Educational Resources Information Center

    Headrick, Todd C.; Vineyard, George

    The Type I error and power properties of the parametric F test and three nonparametric competitors were compared in terms of a 3 x 4 factorial analysis of covariance layout. The focus of the study was on the test for interaction in either the presence or absence of main effects. A variety of conditional distributions, sample sizes, levels of variate…

  4. A Review of DIMPACK Version 1.0: Conditional Covariance-Based Test Dimensionality Analysis Package

    ERIC Educational Resources Information Center

    Deng, Nina; Han, Kyung T.; Hambleton, Ronald K.

    2013-01-01

    DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional covariance approach is reviewed. This software was originally distributed by Assessment Systems Corporation and now can be freely accessed online. The software consists of Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC, which…

  5. Galilean covariant harmonic oscillator

    NASA Technical Reports Server (NTRS)

    Horzela, Andrzej; Kapuscik, Edward

    1993-01-01

    A Galilean covariant approach to classical mechanics of a single particle is described. Within the proposed formalism, all non-covariant force laws are rejected; acting forces are instead defined covariantly by differential equations. Such an approach leads out of standard classical mechanics and gives an example of non-Newtonian mechanics. It is shown that the exactly solvable linear system of differential equations defining forces contains the Galilean covariant description of the harmonic oscillator as a particular case. Additionally, it is demonstrated that in Galilean covariant classical mechanics the validity of the second Newton law of dynamics implies the Hooke law and vice versa. It is shown that the kinetic and total energies transform differently with respect to the Galilean transformations.

  6. Nonparametric survival analysis using Bayesian Additive Regression Trees (BART).

    PubMed

    Sparapani, Rodney A; Logan, Brent R; McCulloch, Robert E; Laud, Purushottam W

    2016-07-20

    Bayesian additive regression trees (BART) provide a framework for flexible nonparametric modeling of relationships of covariates to outcomes. Recently, BART models have been shown to provide excellent predictive performance, for both continuous and binary outcomes, and exceeding that of its competitors. Software is also readily available for such outcomes. In this article, we introduce modeling that extends the usefulness of BART in medical applications by addressing needs arising in survival analysis. Simulation studies of one-sample and two-sample scenarios, in comparison with long-standing traditional methods, establish face validity of the new approach. We then demonstrate the model's ability to accommodate data from complex regression models with a simulation study of a nonproportional hazards scenario with crossing survival functions and survival function estimation in a scenario where hazards are multiplicatively modified by a highly nonlinear function of the covariates. Using data from a recently published study of patients undergoing hematopoietic stem cell transplantation, we illustrate the use and some advantages of the proposed method in medical investigations. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26854022

  7. Covariant mutually unbiased bases

    NASA Astrophysics Data System (ADS)

    Carmeli, Claudio; Schultz, Jussi; Toigo, Alessandro

    2016-06-01

    The connection between maximal sets of mutually unbiased bases (MUBs) in a prime-power dimensional Hilbert space and finite phase-space geometries is well known. In this article, we classify MUBs according to their degree of covariance with respect to the natural symmetries of a finite phase-space, which are the group of its affine symplectic transformations. We prove that there exist maximal sets of MUBs that are covariant with respect to the full group only in odd prime-power dimensional spaces, and in this case, their equivalence class is actually unique. Despite this limitation, we show that in dimension 2^r covariance can still be achieved by restricting to proper subgroups of the symplectic group, that constitute the finite analogues of the oscillator group. For these subgroups, we explicitly construct the unitary operators yielding the covariance.

  8. Comparing Smoothing Techniques for Fitting the Nonlinear Effect of Covariate in Cox Models

    PubMed Central

    Roshani, Daem; Ghaderi, Ebrahim

    2016-01-01

    Background and Objective: The Cox model is popular in survival analysis and assumes a linear effect of each covariate on the log hazard function. Continuous covariates, however, can affect the hazard through more complicated nonlinear functional forms, so Cox models with continuous covariates are prone to misspecification when the correct functional form is not fitted. In this study, a smooth nonlinear covariate effect was approximated by different spline functions. Material and Methods: We applied three flexible nonparametric smoothing techniques for nonlinear covariate effects in Cox models: penalized splines, restricted cubic splines and natural splines. The Akaike information criterion (AIC) and degrees of freedom were used for smoothing parameter selection in the penalized spline model. The ability of the nonparametric methods to recover the true functional form of linear, quadratic and nonlinear functions was evaluated using different simulated sample sizes. Data analysis was carried out using R 2.11.0 software and the significance level was set at 0.05. Results: Based on AIC, the penalized spline method had consistently lower mean square error than the others in selection of the smoothing parameter. The same result was obtained with real data. Conclusion: Penalized spline smoothing, with AIC for smoothing parameter selection, was more accurate in evaluating the relation between a covariate and the log hazard function than the other methods. PMID:27041809
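
    A minimal member of the P-spline family can be written down directly: a truncated-line basis with a ridge penalty on the knot coefficients. This differs in detail from the B-spline P-splines compared in the paper, and the lam value and function name are illustrative.

    ```python
    import numpy as np

    def penalized_spline_fit(x, y, n_knots=20, lam=1.0):
        """Truncated-line spline basis with a ridge penalty on knot terms."""
        knots = np.quantile(x, np.linspace(0, 1, n_knots + 2)[1:-1])
        def basis(z):
            return np.column_stack([np.ones_like(z), z] +
                                   [np.maximum(z - k, 0.0) for k in knots])
        B = basis(x)
        # Penalize only the knot coefficients, not the intercept and slope.
        D = np.diag([0.0, 0.0] + [1.0] * n_knots)
        beta = np.linalg.solve(B.T @ B + lam * D, B.T @ y)
        return lambda z: basis(z) @ beta

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 200)
    y = np.sin(2 * np.pi * x) + 0.2 * rng.standard_normal(200)
    fit = penalized_spline_fit(x, y, lam=0.5)   # fit(x_new) evaluates the curve
    ```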

  9. AFCI-2.0 Library of Neutron Cross Section Covariances

    SciTech Connect

    Herman, M.; Oblozinsky, P.; Mattoon, C.; Pigni, M.; Hoblit, S.; Mughabghab, S.F.; Sonzogni, A.; Talou, P.; Chadwick, M.B.; Hale, G.M.; Kahler, A.C.; Kawano, T.; Little, R.C.; Young, P.G.

    2011-06-26

    A neutron cross section covariance library has been under development through a BNL-LANL collaborative effort over the last three years. The primary purpose of the library is to provide covariances for the Advanced Fuel Cycle Initiative (AFCI) data adjustment project, which is focusing on the needs of fast advanced burner reactors. The covariances refer to central values given in the 2006 release of the U.S. neutron evaluated library ENDF/B-VII. The preliminary version (AFCI-2.0beta) was completed in October 2010 and made available to the users for comments. In the final 2.0 release, covariances for a few materials were updated; in particular, new LANL evaluations for 238,240Pu and 241Am were adopted. BNL was responsible for covariances for structural materials and fission products, management of the library, and coordination of the work, while LANL was in charge of covariances for light nuclei and for actinides.

  10. A Comparison of Bias Correction Adjustments for the DETECT Procedure

    ERIC Educational Resources Information Center

    Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei

    2011-01-01

    DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, D_max, denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite theta_TT…

  11. Covariance mapping techniques

    NASA Astrophysics Data System (ADS)

    Frasinski, Leszek J.

    2016-08-01

    Recent technological advances in the generation of intense femtosecond pulses have made covariance mapping an attractive analytical technique. The laser pulses available are so intense that often thousands of ionisation and Coulomb explosion events will occur within each pulse. To understand the physics of these processes the photoelectrons and photoions need to be correlated, and covariance mapping is well suited for operating at the high counting rates of these laser sources. Partial covariance is particularly useful in experiments with x-ray free electron lasers, because it is capable of suppressing pulse fluctuation effects. A variety of covariance mapping methods is described: simple, partial (single- and multi-parameter), sliced, contingent and multi-dimensional. The relationship to coincidence techniques is discussed. Covariance mapping has been used in many areas of science and technology: inner-shell excitation and Auger decay, multiphoton and multielectron ionisation, time-of-flight and angle-resolved spectrometry, infrared spectroscopy, nuclear magnetic resonance imaging, stimulated Raman scattering, directional gamma ray sensing, welding diagnostics and brain connectivity studies (connectomics). This review gives practical advice for implementing the technique and interpreting the results, including its limitations and instrumental constraints. It also summarises recent theoretical studies, highlights unsolved problems and outlines a personal view on the most promising research directions.
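
    The simple and single-parameter partial covariance maps can be sketched in a few lines: X[s, i] is the signal in channel i on shot s, and I is a fluctuating parameter such as pulse intensity whose influence the partial map removes. Names are illustrative.

    ```python
    import numpy as np

    def covariance_map(X):
        """Covariance map over shots: X[s, i] is channel i on shot s."""
        Xc = X - X.mean(axis=0)
        return Xc.T @ Xc / (len(X) - 1)

    def partial_covariance_map(X, I):
        """Partial covariance map correcting for a fluctuating parameter I."""
        C = covariance_map(np.column_stack([X, I]))
        cXX, cXI, cII = C[:-1, :-1], C[:-1, -1], C[-1, -1]
        return cXX - np.outer(cXI, cXI) / cII

    rng = np.random.default_rng(0)
    I = rng.gamma(5.0, 1.0, 10000)             # pulse intensity fluctuations
    X = np.outer(I, np.ones(8)) + rng.standard_normal((10000, 8))
    pc = partial_covariance_map(X, I)          # intensity-driven structure removed
    ```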

  12. Misunderstanding analysis of covariance.

    PubMed

    Miller, G A; Chapman, J P

    2001-02-01

    Despite numerous technical treatments in many venues, analysis of covariance (ANCOVA) remains a widely misused approach to dealing with substantive group differences on potential covariates, particularly in psychopathology research. Published articles reach unfounded conclusions, and some statistics texts neglect the issue. The problem with ANCOVA in such cases is reviewed. In many cases, there is no means of achieving the superficially appealing goal of "correcting" or "controlling for" real group differences on a potential covariate. In hopes of curtailing misuse of ANCOVA and promoting appropriate use, a nontechnical discussion is provided, emphasizing a substantive confound rarely articulated in textbooks and other general presentations, to complement the mathematical critiques already available. Some alternatives are discussed for contexts in which ANCOVA is inappropriate or questionable. PMID:11261398

  13. A Comparison of Parametric versus Nonparametric Statistics.

    ERIC Educational Resources Information Center

    Royeen, Charlotte Brasic

    In order to examine the possible effects of violation of assumptions using parametric procedures, this study is an exploratory investigation into the use of parametric versus nonparametric procedures using a multiple case study design. The case study investigation guidelines outlined by Yin served as the methodology. The following univariate…

  14. Nonparametric analysis of high wind speed data

    NASA Astrophysics Data System (ADS)

    Francisco-Fernández, Mario; Quintela-del-Río, Alejandro

    2013-01-01

    In this paper, nonparametric curve estimation methods are applied to analyze time series of wind speeds, focusing on the extreme events exceeding a chosen threshold. Classical parametric statistical approaches in this context consist of fitting a generalized Pareto distribution (GPD) to the tail of the empirical cumulative distribution, using maximum likelihood or the method of moments to estimate the parameters of this distribution. Additionally, confidence intervals are usually computed to assess the uncertainty of the estimates. Nonparametric methods to estimate directly some quantities of interest, such as the probability of exceedance, the quantiles or return levels, or the return periods, are proposed. Moreover, bootstrap techniques are used to develop pointwise and simultaneous confidence intervals for these functions. The proposed models are applied to wind speed data on the US Gulf Coast, comparing the results with those using the GPD approach, by means of a split-sample test. Results show that nonparametric methods are competitive with respect to the standard GPD approximations. The study is completed by generating synthetic data sets and comparing the behavior of the parametric and the nonparametric estimates in this framework.
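
    A sketch of the nonparametric exceedance estimate with a bootstrap interval, assuming a Gaussian kernel and a Silverman-style bandwidth; this mirrors the idea rather than the paper's exact estimators, and all names and thresholds are illustrative.

    ```python
    import numpy as np
    from math import erf

    def kernel_exceedance(data, threshold, h=None):
        """Gaussian-kernel estimate of P(X > threshold) via a smoothed CDF."""
        data = np.asarray(data, float)
        if h is None:                          # Silverman-style bandwidth
            h = 1.06 * data.std() * len(data) ** (-0.2)
        z = (threshold - data) / (h * np.sqrt(2.0))
        return 1.0 - float(np.mean(0.5 * (1.0 + np.vectorize(erf)(z))))

    def bootstrap_ci(data, threshold, n_boot=1000, level=0.95, seed=0):
        """Percentile bootstrap interval for the exceedance probability."""
        rng = np.random.default_rng(seed)
        stats = [kernel_exceedance(rng.choice(data, len(data), replace=True),
                                   threshold) for _ in range(n_boot)]
        return tuple(np.percentile(stats, [100 * (1 - level) / 2,
                                           100 * (1 + level) / 2]))

    wind = np.random.default_rng(1).weibull(2.0, 2000) * 10.0  # synthetic speeds
    print(kernel_exceedance(wind, 20.0), bootstrap_ci(wind, 20.0))
    ```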

  15. Nonparametric Methods Instruction in Quantitative Geology.

    ERIC Educational Resources Information Center

    Kemmerly, Phillip Randall

    1990-01-01

    Presented is an approach to introducing upper division, undergraduate geology students to nonparametric statistics and their application to geologic data. Discussed are the use of the Mann-Whitney U and the Kolmogorov-Smirnov tests and a class assignment which illustrates their use. (CW)

  16. How Are Teachers Teaching? A Nonparametric Approach

    ERIC Educational Resources Information Center

    De Witte, Kristof; Van Klaveren, Chris

    2014-01-01

    This paper examines which configuration of teaching activities maximizes student performance. For this purpose a nonparametric efficiency model is formulated that accounts for (1) self-selection of students and teachers in better schools and (2) complementary teaching activities. The analysis distinguishes both individual teaching (i.e., a…

  17. The covariant chiral ring

    NASA Astrophysics Data System (ADS)

    Bourget, Antoine; Troost, Jan

    2016-03-01

    We construct a covariant generating function for the spectrum of chiral primaries of symmetric orbifold conformal field theories with N = (4 , 4) supersymmetry in two dimensions. For seed target spaces K3 and T 4, the generating functions capture the SO(21) and SO(5) representation theoretic content of the chiral ring respectively. Via string dualities, we relate the transformation properties of the chiral ring under these isometries of the moduli space to the Lorentz covariance of perturbative string partition functions in flat space.

  18. A Basic Computer Program for Calculating Simultaneous Pairwise Comparisons in Analysis of Covariance.

    ERIC Educational Resources Information Center

    Powers, Stephen; Jones, Patricia

    1986-01-01

    This paper describes a computer program which tests all pairwise comparisons of adjusted means in analysis of covariance by using Tukey-Kramer Test. The program contains: means of covariate, adjusted means of the criterion measure, sample size, mean square error, and the desired percentile point on the Studentized range distribution. (JAZ)

  19. Generalized Linear Covariance Analysis

    NASA Technical Reports Server (NTRS)

    Carpenter, James R.; Markley, F. Landis

    2014-01-01

    This talk presents a comprehensive approach to filter modeling for generalized covariance analysis of both batch least-squares and sequential estimators. We review and extend in two directions the results of prior work that allowed for partitioning of the state space into "solve-for" and "consider" parameters, accounted for differences between the formal values and the true values of the measurement noise, process noise, and a priori solve-for and consider covariances, and explicitly partitioned the errors into subspaces containing only the influence of the measurement noise, process noise, and solve-for and consider covariances. In this work, we explicitly add sensitivity analysis to this prior work, and relax an implicit assumption that the batch estimator's epoch time occurs prior to the definitive span. We also apply the method to an integrated orbit and attitude problem, in which gyro and accelerometer errors, though not estimated, influence the orbit determination performance. We illustrate our results using two graphical presentations, which we call the "variance sandpile" and the "sensitivity mosaic," and we compare the linear covariance results to confidence intervals associated with ensemble statistics from a Monte Carlo analysis.

  20. Generalized Linear Covariance Analysis

    NASA Technical Reports Server (NTRS)

    Carpenter, J. Russell; Markley, F. Landis

    2008-01-01

    We review and extend in two directions the results of prior work on generalized covariance analysis methods. This prior work allowed for partitioning of the state space into "solve-for" and "consider" parameters, allowed for differences between the formal values and the true values of the measurement noise, process noise, and a priori solve-for and consider covariances, and explicitly partitioned the errors into subspaces containing only the influence of the measurement noise, process noise, and a priori solve-for and consider covariances. In this work, we explicitly add sensitivity analysis to this prior work, and relax an implicit assumption that the batch estimator's anchor time occurs prior to the definitive span. We also apply the method to an integrated orbit and attitude problem, in which gyro and accelerometer errors, though not estimated, influence the orbit determination performance. We illustrate our results using two graphical presentations, which we call the "variance sandpile" and the "sensitivity mosaic," and we compare the linear covariance results to confidence intervals associated with ensemble statistics from a Monte Carlo analysis.

  1. An Evaluation of Parametric and Nonparametric Models of Fish Population Response.

    SciTech Connect

    Haas, Timothy C.; Peterson, James T.; Lee, Danny C.

    1999-11-01

    Predicting the distribution or status of animal populations at large scales often requires the use of broad-scale information describing landforms, climate, vegetation, etc. These data, however, often consist of mixtures of continuous and categorical covariates and nonmultiplicative interactions among covariates, complicating statistical analyses. Using data from the interior Columbia River Basin, USA, we compared four methods for predicting the distribution of seven salmonid taxa using landscape information. Subwatersheds (mean size, 7800 ha) were characterized using a set of 12 covariates describing physiography, vegetation, and current land-use. The techniques included generalized logit modeling, classification trees, a nearest neighbor technique, and a modular neural network. We evaluated model performance using out-of-sample prediction accuracy via leave-one-out cross-validation and introduce a computer-intensive Monte Carlo hypothesis testing approach for examining the statistical significance of landscape covariates with the non-parametric methods. We found the modular neural network and the nearest-neighbor techniques to be the most accurate, but they were difficult to summarize in ways that provided ecological insight. The modular neural network also required the most extensive computer resources for model fitting and hypothesis testing. The generalized logit models were readily interpretable, but were the least accurate, possibly due to nonlinear relationships and nonmultiplicative interactions among covariates. Substantial overlap among the statistically significant (P<0.05) covariates for each method suggested that each is capable of detecting similar relationships between responses and covariates. Consequently, we believe that employing one or more methods may provide greater biological insight without sacrificing prediction accuracy.

  2. Using Analysis of Covariance (ANCOVA) with Fallible Covariates

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew; Aguinis, Herman

    2011-01-01

    Analysis of covariance (ANCOVA) is used widely in psychological research implementing nonexperimental designs. However, when covariates are fallible (i.e., measured with error), which is the norm, researchers must choose from among 3 inadequate courses of action: (a) know that the assumption that covariates are perfectly reliable is violated but…

  3. Nonparametric Bayesian Variable Selection With Applications to Multiple Quantitative Trait Loci Mapping With Epistasis and Gene–Environment Interaction

    PubMed Central

    Zou, Fei; Huang, Hanwen; Lee, Seunggeun; Hoeschele, Ina

    2010-01-01

    The joint action of multiple genes is an important source of variation for complex traits and human diseases. However, mapping genes with epistatic effects and gene–environment interactions is a difficult problem because of relatively small sample sizes and very large parameter spaces for quantitative trait locus models that include such interactions. Here we present a nonparametric Bayesian method to map multiple quantitative trait loci (QTL) by considering epistatic and gene–environment interactions. The proposed method is not restricted to pairwise interactions among genes, as is typically done in parametric QTL analysis. Rather than modeling each main and interaction term explicitly, our nonparametric Bayesian method measures the importance of each QTL, irrespective of whether it is mostly due to a main effect or due to some interaction effect(s), via an unspecified function of the genotypes at all candidate QTL. A Gaussian process prior is assigned to this unknown function. In addition to the candidate QTL, nongenetic factors and covariates, such as age, gender, and environmental conditions, can also be included in the unspecified function. The importance of each genetic factor (QTL) and each nongenetic factor/covariate included in the function is estimated by a single hyperparameter, which enters the covariance function and captures any main or interaction effect associated with a given factor/covariate. An initial evaluation of the performance of the proposed method is obtained via analysis of simulated and real data. PMID:20551445
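
    The single-hyperparameter-per-factor mechanism can be illustrated with an ARD (automatic relevance determination) covariance function: each factor gets its own lengthscale, and an irrelevant factor is effectively switched off by a large one. The standard ARD squared-exponential form below is assumed for illustration; it is not claimed to be the paper's exact covariance.

    ```python
    import numpy as np

    def ard_kernel(X1, X2, lengthscales, sf=1.0):
        """ARD squared-exponential: one lengthscale per factor/covariate."""
        Z1, Z2 = X1 / lengthscales, X2 / lengthscales
        d2 = ((Z1[:, None, :] - Z2[None, :, :])**2).sum(-1)
        return sf**2 * np.exp(-0.5 * d2)

    # Two relevant factors (small lengthscales) among ten candidates: the
    # other eight are effectively removed from the covariance function.
    ells = np.array([0.5, 0.5] + [50.0] * 8)
    X = np.random.default_rng(0).standard_normal((20, 10))
    K = ard_kernel(X, X, ells)
    ```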

  4. Adaptive Confidence Bands for Nonparametric Regression Functions

    PubMed Central

    Cai, T. Tony; Low, Mark; Ma, Zongming

    2014-01-01

    A new formulation for the construction of adaptive confidence bands in non-parametric function estimation problems is proposed. Confidence bands are constructed which have size that adapts to the smoothness of the function while guaranteeing that both the relative excess mass of the function lying outside the band and the measure of the set of points where the function lies outside the band are small. It is shown that the bands adapt over a maximum range of Lipschitz classes. The adaptive confidence band can be easily implemented in standard statistical software with wavelet support. Numerical performance of the procedure is investigated using both simulated and real datasets. The numerical results agree well with the theoretical analysis. The procedure can be easily modified and used for other nonparametric function estimation models. PMID:26269661

  5. Nonparametric estimation with recurrent competing risks data

    PubMed Central

    Peña, Edsel A.

    2014-01-01

    Nonparametric estimators of component and system life distributions are developed and presented for situations where recurrent competing risks data from series systems are available. The use of recurrences of components’ failures leads to improved efficiencies in statistical inference, thereby leading to resource-efficient experimental or study designs or improved inferences about the distributions governing the event times. Finite and asymptotic properties of the estimators are obtained through simulation studies and analytically. The detrimental impact of parametric model misspecification is also vividly demonstrated, lending credence to the virtue of adopting nonparametric or semiparametric models, especially in biomedical settings. The estimators are illustrated by applying them to a data set pertaining to car repairs for vehicles that were under warranty. PMID:24072583

  6. A nonparametric software reliability growth model

    NASA Technical Reports Server (NTRS)

    Miller, Douglas R.; Sofer, Ariela

    1988-01-01

    Miller and Sofer have presented a nonparametric method for estimating the failure rate of a software program. The method is based on the complete monotonicity property of the failure rate function, and uses a regression approach to obtain estimates of the current software failure rate. This completely monotone software model is extended. It is shown how it can also provide long-range predictions of future reliability growth. Preliminary testing indicates that the method is competitive with parametric approaches, while being more robust.

  7. Covariant approximation averaging

    NASA Astrophysics Data System (ADS)

    Shintani, Eigo; Arthur, Rudy; Blum, Thomas; Izubuchi, Taku; Jung, Chulwoo; Lehner, Christoph

    2015-06-01

    We present a new class of statistical error reduction techniques for Monte Carlo simulations. Using covariant symmetries, we show that correlation functions can be constructed from inexpensive approximations without introducing any systematic bias in the final result. We introduce a new class of covariant approximation averaging techniques, known as all-mode averaging (AMA), in which the approximation takes account of contributions of all eigenmodes through the inverse of the Dirac operator computed from the conjugate gradient method with a relaxed stopping condition. In this paper we compare the performance and computational cost of our new method with traditional methods using correlation functions and masses of the pion, nucleon, and vector meson in Nf=2 +1 lattice QCD using domain-wall fermions. This comparison indicates that AMA significantly reduces statistical errors in Monte Carlo calculations over conventional methods for the same cost.

  8. Covariant deformed oscillator algebras

    NASA Technical Reports Server (NTRS)

    Quesne, Christiane

    1995-01-01

    The general form and associativity conditions of deformed oscillator algebras are reviewed. It is shown how the latter can be fulfilled in terms of a solution of the Yang-Baxter equation when this solution has three distinct eigenvalues and satisfies a Birman-Wenzl-Murakami condition. As an example, an SU(sub q)(n) x SU(sub q)(m)-covariant q-bosonic algebra is discussed in some detail.

  9. The Bayesian Covariance Lasso

    PubMed Central

    Khondker, Zakaria S; Zhu, Hongtu; Chu, Haitao; Lin, Weili; Ibrahim, Joseph G.

    2012-01-01

    Estimation of sparse covariance matrices and their inverse subject to positive definiteness constraints has drawn a lot of attention in recent years. The abundance of high-dimensional data, where the sample size (n) is less than the dimension (d), requires shrinkage estimation methods since the maximum likelihood estimator is not positive definite in this case. Furthermore, when n is larger than d but not sufficiently so, shrinkage estimation is more stable than maximum likelihood as it reduces the condition number of the precision matrix. Frequentist methods have utilized penalized likelihood methods, whereas Bayesian approaches rely on matrix decompositions or Wishart priors for shrinkage. In this paper we propose a new method, called the Bayesian Covariance Lasso (BCLASSO), for the shrinkage estimation of a precision (covariance) matrix. We consider a class of priors for the precision matrix that leads to the popular frequentist penalties as special cases, develop a Bayes estimator for the precision matrix, and propose an efficient sampling scheme that does not precalculate boundaries for positive definiteness. The proposed method is permutation invariant and performs shrinkage and estimation simultaneously for non-full rank data. Simulations show that the proposed BCLASSO performs similarly to frequentist methods for non-full rank data. PMID:24551316
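
    For orientation, the frequentist penalty that BCLASSO's priors recover as a special case is available as the graphical lasso; a minimal scikit-learn sketch in the non-full-rank (n < d) setting follows. This is the frequentist analogue, not the paper's Bayesian sampler, and the alpha value is illustrative.

    ```python
    import numpy as np
    from sklearn.covariance import GraphicalLasso

    rng = np.random.default_rng(0)
    d, n = 30, 20                        # non-full-rank setting: n < d
    X = rng.standard_normal((n, d))
    model = GraphicalLasso(alpha=0.5).fit(X)
    precision = model.precision_         # sparse, positive definite estimate
    ```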

  10. Estimating the extreme low-temperature event using nonparametric methods

    NASA Astrophysics Data System (ADS)

    D'Silva, Anisha

    This thesis presents a new method of estimating the one-in-N low temperature threshold using a non-parametric statistical method called kernel density estimation applied to daily average wind-adjusted temperatures. We apply our One-in-N Algorithm to local gas distribution companies (LDCs), as they have to forecast the daily natural gas needs of their consumers. In winter, demand for natural gas is high. Extreme low temperature events are not directly related to an LDC's gas demand forecasting, but knowledge of extreme low temperatures is important to ensure that an LDC has enough capacity to meet customer demands when extreme low temperatures are experienced. We present a detailed explanation of our One-in-N Algorithm and compare it to the methods using the generalized extreme value distribution, the normal distribution, and the variance-weighted composite distribution. We show that our One-in-N Algorithm estimates the one-in-N low temperature threshold more accurately than the methods using the generalized extreme value distribution, the normal distribution, and the variance-weighted composite distribution according to the root mean square error (RMSE) measure at a 5% level of significance. The One-in-N Algorithm is tested by counting the number of times the daily average wind-adjusted temperature is less than or equal to the one-in-N low temperature threshold.
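
    The core of the idea can be sketched as a kernel-smoothed CDF inverted at the target probability of one day in N years. This is a simplified reading of the One-in-N idea, not the thesis's exact algorithm; the Gaussian kernel, bandwidth rule, days_per_year default, and bisection are all illustrative choices.

    ```python
    import numpy as np
    from math import erf

    def one_in_n_threshold(temps, n_years, days_per_year=365, h=None):
        """Temperature t with smoothed P(T <= t) = 1 / (n_years * days_per_year)."""
        temps = np.asarray(temps, float)
        if h is None:
            h = 1.06 * temps.std() * len(temps) ** (-0.2)
        target = 1.0 / (n_years * days_per_year)
        def cdf(t):
            z = (t - temps) / (h * np.sqrt(2.0))
            return float(np.mean(0.5 * (1.0 + np.vectorize(erf)(z))))
        lo, hi = temps.min() - 10 * h, temps.max()
        for _ in range(60):                   # bisection on the smoothed CDF
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if cdf(mid) < target else (lo, mid)
        return 0.5 * (lo + hi)

    daily = np.random.default_rng(0).normal(-5.0, 8.0, 20 * 365)  # 20 winters, degC
    print(one_in_n_threshold(daily, n_years=30))
    ```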

  11. Earth Observing System Covariance Realism

    NASA Technical Reports Server (NTRS)

    Zaidi, Waqar H.; Hejduk, Matthew D.

    2016-01-01

    The purpose of covariance realism is to properly size a primary object's covariance in order to add validity to the calculation of the probability of collision. The covariance realism technique in this paper consists of three parts: collection/calculation of definitive state estimates through orbit determination, calculation of covariance realism test statistics at each covariance propagation point, and proper assessment of those test statistics. An empirical cumulative distribution function (ECDF) Goodness-of-Fit (GOF) method is employed to determine if a covariance is properly sized by comparing the empirical distribution of Mahalanobis distance calculations to the hypothesized parent 3-DoF chi-squared distribution. To realistically size a covariance for collision probability calculations, this study uses a state noise compensation algorithm that adds process noise to the definitive epoch covariance to account for uncertainty in the force model. Process noise is added until the GOF tests pass a group significance level threshold. The results of this study indicate that when outliers attributed to persistently high or extreme levels of solar activity are removed, the aforementioned covariance realism compensation method produces a tuned covariance with up to 80 to 90% of the covariance propagation timespan passing the GOF tests (against a 60% minimum passing threshold), a quite satisfactory and useful result.
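
    A minimal sketch of the test-statistic and GOF steps: squared Mahalanobis distances of position errors against their covariances, compared with the 3-DoF chi-squared law via an ECDF distance. The KS-style distance shown is a stand-in for the paper's GOF battery, and the names are illustrative.

    ```python
    import numpy as np
    from scipy.stats import chi2

    def mahalanobis_sq(errors, covariances):
        """Squared Mahalanobis distance of each error against its covariance."""
        return np.array([e @ np.linalg.solve(P, e)
                         for e, P in zip(errors, covariances)])

    def ecdf_gof(m2, df=3):
        """KS-style distance between the ECDF of m2 and the chi-squared(df) CDF."""
        x = np.sort(m2)
        ecdf = np.arange(1, len(x) + 1) / len(x)
        return float(np.max(np.abs(ecdf - chi2.cdf(x, df))))

    rng = np.random.default_rng(0)
    covs = [np.diag(rng.uniform(1.0, 4.0, 3)) for _ in range(500)]
    errs = [rng.multivariate_normal(np.zeros(3), P) for P in covs]
    print(ecdf_gof(mahalanobis_sq(errs, covs)))  # small when covariances are realistic
    ```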

  12. Low-Fidelity Covariances: Neutron Cross Section Covariance Estimates for 387 Materials

    DOE Data Explorer

    The Low-fidelity Covariance Project (Low-Fi) was funded in FY07-08 by DOE's Nuclear Criticality Safety Program (NCSP). The project was a collaboration among ANL, BNL, LANL, and ORNL. The motivation for the Low-Fi project stemmed from an imbalance in supply and demand of covariance data. The interest in, and demand for, covariance data has been in a continual uptrend over the past few years. Requirements to understand application-dependent uncertainties in simulated quantities of interest have led to the development of sensitivity / uncertainty and data adjustment software such as TSUNAMI [1] at Oak Ridge. To take full advantage of the capabilities of TSUNAMI requires general availability of covariance data. However, the supply of covariance data has not been able to keep up with the demand. This fact is highlighted by the observation that the recent release of the much-heralded ENDF/B-VII.0 included covariance data for only 26 of the 393 neutron evaluations (which is, in fact, considerably less covariance data than was included in the final ENDF/B-VI release). [Copied from R.C. Little et al., "Low-Fidelity Covariance Project", Nuclear Data Sheets 109 (2008) 2828-2833] The Low-Fi covariance data are now available at the National Nuclear Data Center. They are separate from ENDF/B-VII.0 and the NNDC warns that this information is not approved by CSEWG. NNDC describes the contents of this collection as: "Covariance data are provided for radiative capture (or (n,ch.p.) for light nuclei), elastic scattering (or total for some actinides), inelastic scattering, (n,2n) reactions, fission and nubars over the energy range from 10^-5 eV to 20 MeV. The library contains 387 files including almost all (383 out of 393) materials of the ENDF/B-VII.0. Absent are data for 7Li, 232Th, 233,235,238U and 239Pu as well as 223,224,225,226Ra, while natZn is replaced by 64,66,67,68,70Zn."

  13. Covariant magnetic connection hypersurfaces

    NASA Astrophysics Data System (ADS)

    Pegoraro, F.

    2016-04-01

    In the single fluid, non-relativistic, ideal magnetohydrodynamic (MHD) plasma description, magnetic field lines play a fundamental role by defining dynamically preserved 'magnetic connections' between plasma elements. Here we show how the concept of magnetic connection needs to be generalized in the case of a relativistic MHD description where we require covariance under arbitrary Lorentz transformations. This is performed by defining 2-D magnetic connection hypersurfaces in the 4-D Minkowski space. This generalization accounts for the loss of simultaneity between spatially separated events in different frames and is expected to provide a powerful insight into the 4-D geometry of electromagnetic fields.

  14. Flexible estimation of covariance function by penalized spline with application to longitudinal family data

    PubMed Central

    Wang, Yuanjia

    2011-01-01

    Longitudinal data are routinely collected in biomedical research studies. A natural model describing longitudinal data decomposes an individual’s outcome as the sum of a population mean function and random subject-specific deviations. When parametric assumptions are too restrictive, methods modeling the population mean function and the random subject-specific functions nonparametrically are in demand. In some applications, it is desirable to estimate a covariance function of random subject-specific deviations. In this work, flexible yet computationally efficient methods are developed for a general class of semiparametric mixed effects models, where the functional forms of the population mean and the subject-specific curves are unspecified. We estimate nonparametric components of the model by penalized spline (P-spline, [1]), and reparametrize the random curve covariance function by a modified Cholesky decomposition [2] which allows for unconstrained estimation of a positive semidefinite matrix. To provide smooth estimates, we penalize roughness of fitted curves and derive closed form solutions in the maximization step of an EM algorithm. In addition, we present models and methods for longitudinal family data where subjects in a family are correlated and we decompose the covariance function into a subject-level source and observation-level source. We apply these methods to the multi-level Framingham Heart Study data to estimate age-specific heritability of systolic blood pressure (SBP) nonparametrically. PMID:21491474
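
    The unconstrained covariance parametrization can be shown directly: with a unit lower-triangular L and positive diagonal D, Sigma = L D L^T is positive definite for any real parameter values, which is the device the paper uses for the random-curve covariance. A minimal sketch, with illustrative names:

    ```python
    import numpy as np

    def covariance_from_cholesky_params(log_d, phi):
        """Sigma = L D L^T with unit lower-triangular L and D = exp(log_d) > 0."""
        q = len(log_d)
        L = np.eye(q)
        L[np.tril_indices(q, -1)] = phi
        return L @ np.diag(np.exp(log_d)) @ L.T

    rng = np.random.default_rng(0)
    q = 4
    Sigma = covariance_from_cholesky_params(rng.normal(size=q),
                                            rng.normal(size=q * (q - 1) // 2))
    print(np.all(np.linalg.eigvalsh(Sigma) > 0))  # positive definite by construction
    ```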

  15. Nonparametric, nonnegative deconvolution of large time series

    NASA Astrophysics Data System (ADS)

    Cirpka, O. A.

    2006-12-01

    There is a long tradition of characterizing hydrologic systems by linear models, in which the response of the system to a time-varying stimulus is computed by convolution of a system-specific transfer function with the input signal. Despite its limitations, the transfer-function concept has been shown valuable for many situations such as the precipitation/run-off relationships of catchments and solute transport in agricultural soils and aquifers. A practical difficulty lies in the identification of the transfer function. A common approach is to fit a parametric function, enforcing a particular shape of the transfer function, which may be in contradiction to the real behavior (e.g., multimodal transfer functions, long tails, etc.). In our nonparametric deconvolution, the transfer function is assumed an auto-correlated random time function, which is conditioned on the data by a Bayesian approach. Nonnegativity, which is a vital constraint for solute-transport applications, is enforced by the method of Lagrange multipliers. This makes the inverse problem nonlinear. In nonparametric deconvolution, identifying the auto-correlation parameters is crucial. Enforcing too much smoothness prohibits the identification of important features, whereas insufficient smoothing leads to physically meaningless transfer functions, mapping noise components in the two data series onto each other. We identify optimal smoothness parameters by the expectation-maximization method, which requires the repeated generation of many conditional realizations. The overall approach, however, is still significantly faster than Markov-Chain Monte-Carlo methods presented recently. We apply our approach to electric-conductivity time series measured in a river and monitoring wells in the adjacent aquifer. The data cover 1.5 years with a temporal resolution of 1h. The identified transfer functions have lengths of up to 60 days, making up 1440 parameters. We believe that nonparametric deconvolution is an
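
    Stripped of the Bayesian machinery, nonnegative deconvolution can be sketched as nonnegative least squares against a convolution (Toeplitz) matrix; scipy's nnls enforces the nonnegativity constraint. This plain NNLS stand-in omits the smoothness prior and the expectation-maximization tuning described in the abstract, and all names are illustrative.

    ```python
    import numpy as np
    from scipy.linalg import toeplitz
    from scipy.optimize import nnls

    def nonneg_deconvolve(stimulus, response, n_lags):
        """Recover a nonnegative transfer function g from response ~ stimulus * g."""
        s = np.asarray(stimulus, float)
        row = np.zeros(n_lags)
        row[0] = s[0]
        A = toeplitz(s, row)             # A @ g == np.convolve(s, g)[:len(s)]
        g, _ = nnls(A, np.asarray(response, float))
        return g

    rng = np.random.default_rng(0)
    g_true = np.exp(-np.arange(30) / 5.0)
    g_true /= g_true.sum()
    s = rng.random(200)
    r = np.convolve(s, g_true)[:200] + 0.01 * rng.standard_normal(200)
    g_hat = nonneg_deconvolve(s, r, n_lags=30)   # close to g_true, and >= 0
    ```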

  16. Bayesian nonparametric models for ranked set sampling.

    PubMed

    Gemayel, Nader; Stasny, Elizabeth A; Wolfe, Douglas A

    2015-04-01

    Ranked set sampling (RSS) is a data collection technique that combines measurement with judgment ranking for statistical inference. This paper lays out a formal and natural Bayesian framework for RSS that is analogous to its frequentist justification, and that does not require the assumption of perfect ranking or use of any imperfect ranking models. Prior beliefs about the judgment order statistic distributions and their interdependence are embodied by a nonparametric prior distribution. Posterior inference is carried out by means of Markov chain Monte Carlo techniques, and yields estimators of the judgment order statistic distributions (and of functionals of those distributions). PMID:25326663

  17. Lottery Spending: A Non-Parametric Analysis

    PubMed Central

    Garibaldi, Skip; Frisoli, Kayla; Ke, Li; Lim, Melody

    2015-01-01

    We analyze the spending of individuals in the United States on lottery tickets in an average month, as reported in surveys. We view these surveys as sampling from an unknown distribution, and we use non-parametric methods to compare properties of this distribution for various demographic groups, as well as claims that some properties of this distribution are constant across surveys. We find that the observed higher spending by Hispanic lottery players can be attributed to differences in education levels, and we dispute previous claims that the top 10% of lottery players consistently account for 50% of lottery sales. PMID:25642699

  18. Lottery spending: a non-parametric analysis.

    PubMed

    Garibaldi, Skip; Frisoli, Kayla; Ke, Li; Lim, Melody

    2015-01-01

    We analyze the spending of individuals in the United States on lottery tickets in an average month, as reported in surveys. We view these surveys as sampling from an unknown distribution, and we use non-parametric methods to compare properties of this distribution for various demographic groups, as well as claims that some properties of this distribution are constant across surveys. We find that the observed higher spending by Hispanic lottery players can be attributed to differences in education levels, and we dispute previous claims that the top 10% of lottery players consistently account for 50% of lottery sales. PMID:25642699

  19. Bayesian Nonparametric Inference – Why and How

    PubMed Central

    Müller, Peter; Mitra, Riten

    2013-01-01

    We review inference under models with nonparametric Bayesian (BNP) priors. The discussion follows a set of examples for some common inference problems. The examples are chosen to highlight problems that are challenging for standard parametric inference. We discuss inference for density estimation, clustering, regression and for mixed effects models with random effects distributions. While we focus on arguing for the need for the flexibility of BNP models, we also review some of the more commonly used BNP models, thus hopefully answering a bit of both questions, why and how to use BNP. PMID:24368932

  20. A nonparametric software-reliability growth model

    NASA Technical Reports Server (NTRS)

    Sofer, Ariela; Miller, Douglas R.

    1991-01-01

    The authors (1985) previously introduced a nonparametric model for software-reliability growth which is based on complete monotonicity of the failure rate. The authors extend the completely monotone software model by developing a method for providing long-range predictions of reliability growth, based on the model. They derive upper and lower bounds on extrapolation of the failure rate and the mean function. These are then used to obtain estimates for the future software failure rate and the mean future number of failures. Preliminary evaluation indicates that the method is competitive with parametric approaches, while being more robust.

  1. Correcting eddy-covariance flux underestimates over a grassland.

    SciTech Connect

    Twine, T. E.; Kustas, W. P.; Norman, J. M.; Cook, D. R.; Houser, P. R.; Meyers, T. P.; Prueger, J. H.; Starks, P. J.; Wesely, M. L.; Environmental Research; Univ. of Wisconsin at Madison; DOE; National Aeronautics and Space Administration; National Oceanic and Atmospheric Administration

    2000-06-08

    Independent measurements of the major energy balance flux components are not often consistent with the principle of conservation of energy. This is referred to as a lack of closure of the surface energy balance. Most results in the literature have shown the sum of sensible and latent heat fluxes measured by eddy covariance to be less than the difference between net radiation and soil heat fluxes. This under-measurement of sensible and latent heat fluxes by eddy-covariance instruments has occurred in numerous field experiments and among many different manufacturers of instruments. Four eddy-covariance systems consisting of the same models of instruments were set up side-by-side during the Southern Great Plains 1997 Hydrology Experiment and all systems under-measured fluxes by similar amounts. One of these eddy-covariance systems was collocated with three other types of eddy-covariance systems at different sites; all of these systems under-measured the sensible and latent-heat fluxes. The net radiometers and soil heat flux plates used in conjunction with the eddy-covariance systems were calibrated independently and measurements of net radiation and soil heat flux showed little scatter for various sites. The 10% absolute uncertainty in available energy measurements was considerably smaller than the systematic closure problem in the surface energy budget, which varied from 10 to 30%. When available-energy measurement errors are known and modest, eddy-covariance measurements of sensible and latent heat fluxes should be adjusted for closure. Although the preferred method of energy balance closure is to maintain the Bowen-ratio, the method for obtaining closure appears to be less important than assuring that eddy-covariance measurements are consistent with conservation of energy. Based on numerous measurements over a sorghum canopy, carbon dioxide fluxes, which are measured by eddy covariance, are underestimated by the same factor as eddy covariance evaporation
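
    As a rough illustration of the closure adjustment discussed above, the sketch below rescales measured sensible and latent heat fluxes so that their sum matches the available energy while preserving the measured Bowen ratio; the function name and flux values are illustrative, not taken from the study.

      def bowen_ratio_closure(H, LE, Rn, G):
          """Force energy-balance closure, H + LE = Rn - G, while preserving
          the measured Bowen ratio (beta = H / LE).

          H, LE : eddy-covariance sensible and latent heat fluxes (W m^-2)
          Rn, G : net radiation and soil heat flux (W m^-2)
          """
          available_energy = Rn - G
          measured_sum = H + LE
          factor = available_energy / measured_sum  # same multiplier for H and LE keeps beta fixed
          return H * factor, LE * factor

      # Example: fluxes under-measured by ~20%
      H_adj, LE_adj = bowen_ratio_closure(H=100.0, LE=220.0, Rn=450.0, G=50.0)
      print(H_adj, LE_adj)  # 125.0, 275.0 -> sums to Rn - G = 400, Bowen ratio unchanged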

  2. Nonparametric discriminant analysis for face recognition.

    PubMed

    Li, Zhifeng; Lin, Dahua; Tang, Xiaoou

    2009-04-01

    In this paper, we develop a new framework for face recognition based on nonparametric discriminant analysis (NDA) and multi-classifier integration. Traditional LDA-based methods suffer from a fundamental limitation originating from the parametric nature of scatter matrices, which are based on the Gaussian distribution assumption. The performance of these methods notably degrades when the actual distribution is non-Gaussian. To address this problem, we propose a new formulation of scatter matrices to extend two-class nonparametric discriminant analysis to multi-class cases. We then develop two improved multi-class NDA-based algorithms (NSA and NFA), each with two complementary variants based on the principal space and the null space of the intra-class scatter matrix, respectively. Compared with NSA, NFA is more effective in exploiting the classification boundary information. In order to exploit the complementary nature of the two kinds of NFA (PNFA and NNFA), we finally develop a dual NFA-based multi-classifier fusion framework that employs the overcomplete Gabor representation to boost recognition performance. We show the improvements of the new algorithms over traditional subspace methods through comparative experiments on two challenging face databases, the Purdue AR database and the XM2VTS database. PMID:19229090
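
    The following simplified sketch illustrates the general idea behind nonparametric scatter matrices, in the spirit of classical two-class NDA rather than the paper's NSA/NFA algorithms: local between-class scatter is accumulated from each sample's k nearest neighbors in the other classes instead of from global class means. All names and data are illustrative.

      import numpy as np
      from scipy.linalg import eigh

      def nonparametric_between_scatter(X, y, k=3):
          """Nonparametric between-class scatter: for each sample, use the
          mean of its k nearest neighbors in each *other* class as a local
          class prototype, instead of the global class mean."""
          n, d = X.shape
          Sb = np.zeros((d, d))
          classes = np.unique(y)
          for c in classes:
              for x in X[y == c]:
                  for other in classes:
                      if other == c:
                          continue
                      Z = X[y == other]
                      idx = np.argsort(np.linalg.norm(Z - x, axis=1))[:k]
                      diff = (x - Z[idx].mean(axis=0)).reshape(-1, 1)
                      Sb += diff @ diff.T
          return Sb / n

      def within_scatter(X, y):
          Sw = np.zeros((X.shape[1],) * 2)
          for c in np.unique(y):
              Xc = X[y == c] - X[y == c].mean(axis=0)
              Sw += Xc.T @ Xc
          return Sw / len(X)

      # Discriminant directions: generalized eigenvectors of (Sb, Sw)
      rng = np.random.default_rng(0)
      X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(1.5, 1, (50, 5))])
      y = np.repeat([0, 1], 50)
      Sb, Sw = nonparametric_between_scatter(X, y), within_scatter(X, y)
      w, V = eigh(Sb, Sw + 1e-6 * np.eye(5))  # eigenvalues in ascending order
      W = V[:, ::-1][:, :1]                   # top discriminant direction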

  3. Bayesian Nonparametric Clustering for Positive Definite Matrices.

    PubMed

    Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos

    2016-05-01

    Symmetric Positive Definite (SPD) matrices emerge as data descriptors in several applications of computer vision, such as object tracking, texture recognition, and diffusion tensor imaging. Clustering these data matrices forms an integral part of these applications, for which soft-clustering algorithms (K-Means, expectation maximization, etc.) are generally used. As is well known, these algorithms need the number of clusters to be specified, which is difficult when the dataset scales. To address this issue, we resort to the classical nonparametric Bayesian framework by modeling the data as a mixture model using the Dirichlet process (DP) prior. Since these matrices do not conform to Euclidean geometry but rather lie on a curved Riemannian manifold, existing DP models cannot be directly applied. Thus, in this paper, we propose a novel DP mixture model framework for SPD matrices. Using the log-determinant divergence as the underlying dissimilarity measure to compare these matrices, and further using the connection between this measure and the Wishart distribution, we derive a novel DPM model based on the Wishart-Inverse-Wishart conjugate pair. We apply this model to several applications in computer vision. Our experiments demonstrate that our model is scalable to the dataset size and at the same time achieves superior accuracy compared to several state-of-the-art parametric and nonparametric clustering algorithms. PMID:27046838
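
    For concreteness, the log-determinant (Burg) divergence used as the dissimilarity measure above can be computed as in the sketch below; the clustering machinery itself (the DP mixture with Wishart components) is not shown.

      import numpy as np

      def logdet_divergence(X, Y):
          """Log-determinant (Burg) divergence between SPD matrices:
          D(X, Y) = tr(X Y^-1) - log det(X Y^-1) - n."""
          n = X.shape[0]
          M = np.linalg.solve(Y, X)          # Y^-1 X (same trace and det as X Y^-1)
          sign, logdet = np.linalg.slogdet(M)
          return np.trace(M) - logdet - n

      # Example on two random SPD matrices
      rng = np.random.default_rng(1)
      A = rng.normal(size=(4, 4)); X = A @ A.T + 4 * np.eye(4)
      B = rng.normal(size=(4, 4)); Y = B @ B.T + 4 * np.eye(4)
      print(logdet_divergence(X, Y), logdet_divergence(X, X))  # second value is ~0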

  4. Stardust Navigation Covariance Analysis

    NASA Astrophysics Data System (ADS)

    Menon, Premkumar R.

    2000-01-01

    The Stardust spacecraft was launched on February 7, 1999 aboard a Boeing Delta-II rocket. Mission participants include the National Aeronautics and Space Administration (NASA), the Jet Propulsion Laboratory (JPL), Lockheed Martin Astronautics (LMA) and the University of Washington. The primary objective of the mission is to collect in-situ samples of the coma of comet Wild-2 and return those samples to the Earth for analysis. Mission design and operational navigation for Stardust is performed by JPL. This paper describes the extensive JPL effort in support of the Stardust pre-launch analysis of the orbit determination component of the mission covariance study. A description of the mission and its trajectory is provided first, followed by a discussion of the covariance procedure and models. Predicted accuracies are examined as they relate to navigation delivery requirements for specific critical events during the mission. Stardust was launched into a heliocentric trajectory in early 1999. It will perform an Earth Gravity Assist (EGA) on January 15, 2001 to acquire an orbit for the eventual rendezvous with comet Wild-2. The spacecraft will fly through the coma (atmosphere) on the dayside of Wild-2 on January 2, 2004. At that time samples will be obtained using an aerogel collector. After the comet encounter Stardust will return to Earth, where the Sample Return Capsule (SRC) will separate and land at the Utah Test and Training Range (UTTR) on January 15, 2006. The spacecraft itself, however, will be deflected into a heliocentric orbit. The mission is divided into three phases for the covariance analysis: 1) launch to EGA, 2) EGA to Wild-2 encounter and 3) Wild-2 encounter to Earth reentry. Orbit determination assumptions for each phase are provided. These include estimated and consider parameters and their associated a priori uncertainties. Major perturbations to the trajectory include 19 deterministic and statistical maneuvers

  5. Nonparametric identification and maximum likelihood estimation for hidden Markov models

    PubMed Central

    Alexandrovich, G.; Holzmann, H.; Leister, A.

    2016-01-01

    Nonparametric identification and maximum likelihood estimation for finite-state hidden Markov models are investigated. We obtain identification of the parameters as well as the order of the Markov chain if the transition probability matrices have full rank and are ergodic, and if the state-dependent distributions are all distinct, but not necessarily linearly independent. Based on this identification result, we develop a nonparametric maximum likelihood estimation theory. First, we show that the asymptotic contrast, the Kullback–Leibler divergence of the hidden Markov model, also identifies the true parameter vector nonparametrically. Second, for classes of state-dependent densities which are arbitrary mixtures of a parametric family, we establish the consistency of the nonparametric maximum likelihood estimator. Here, identification of the mixing distributions need not be assumed. Numerical properties of the estimates and of nonparametric goodness of fit tests are investigated in a simulation study.

  6. Nonparametric k-nearest-neighbor entropy estimator.

    PubMed

    Lombardi, Damiano; Pant, Sanjay

    2016-01-01

    A nonparametric k-nearest-neighbor-based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering nonuniform probability densities in the region of k-nearest neighbors around each sample point. It aims to improve the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-functional relationships leading to high correlation between components of the random variable are present; and third, when the marginal variances of random variable components vary significantly with respect to each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested for a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with a classical estimator, and a significant improvement is demonstrated. PMID:26871193
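
    For reference, the classical Kozachenko-Leonenko estimator that this work improves upon can be written in a few lines; the sketch below is a standard textbook form, not the authors' proposed estimator.

      import numpy as np
      from scipy.special import digamma, gammaln
      from scipy.spatial import cKDTree

      def kl_entropy(X, k=3):
          """Classical Kozachenko-Leonenko k-NN entropy estimate (in nats)
          for an (n, d) sample, using Euclidean k-th nearest-neighbor distances."""
          n, d = X.shape
          eps = cKDTree(X).query(X, k=k + 1)[0][:, -1]  # distance to k-th neighbor (excluding self)
          log_ball = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # log volume of the unit d-ball
          return digamma(n) - digamma(k) + log_ball + d * np.mean(np.log(eps))

      # Sanity check: N(0,1) has entropy 0.5*log(2*pi*e) ~ 1.4189
      rng = np.random.default_rng(0)
      print(kl_entropy(rng.normal(size=(5000, 1)), k=3))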

  7. A nonparametric approach for European option valuation

    NASA Astrophysics Data System (ADS)

    Huang, Guanghui; Wan, Jianping

    2008-04-01

    A nonparametric approach for European option valuation is proposed in this paper, which adopts a purely jump model to describe the price dynamics of the underlying asset and uses the minimal entropy martingale measure for those jumps as the pricing measure of the market. A simple Monte Carlo simulation method is proposed to calculate the price of derivatives under this risk-neutral measure, and the volatility of the spot market is updated automatically without requiring a particular specification. The performance of the proposed method is compared with that of the Black-Scholes formula on both artificial and real-world data. The results of our investigations suggest that the proposed method is a valuable alternative.
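
    The Monte Carlo idea can be sketched as follows for a generic compound-Poisson jump model with a martingale-compensated drift. This is an illustration of risk-neutral Monte Carlo pricing only, not the paper's minimal entropy martingale measure construction, and all parameter values are hypothetical.

      import numpy as np

      def mc_european_call(S0, K, T, r, jump_rate, jump_mu, jump_sigma,
                           n_paths=200000, seed=0):
          """Monte Carlo price of a European call when log-returns follow a
          compound Poisson (pure-jump) process with normal jump sizes.
          The drift is compensated so the discounted price is a martingale."""
          rng = np.random.default_rng(seed)
          mean_jump = np.exp(jump_mu + 0.5 * jump_sigma**2)  # E[e^J], J ~ N(mu, sigma^2)
          compensator = jump_rate * (mean_jump - 1.0)        # martingale correction
          N = rng.poisson(jump_rate * T, n_paths)            # number of jumps per path
          # Sum of N iid normal jumps: mean N*mu, variance N*sigma^2
          jump_sum = jump_mu * N + jump_sigma * np.sqrt(N) * rng.normal(size=n_paths)
          ST = S0 * np.exp((r - compensator) * T + jump_sum)
          payoff = np.maximum(ST - K, 0.0)
          return np.exp(-r * T) * payoff.mean()

      print(mc_european_call(S0=100, K=100, T=1.0, r=0.02,
                             jump_rate=5.0, jump_mu=0.0, jump_sigma=0.05))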

  8. A Nonparametric Bayesian Model for Nested Clustering.

    PubMed

    Lee, Juhee; Müller, Peter; Zhu, Yitan; Ji, Yuan

    2016-01-01

    We propose a nonparametric Bayesian model for clustering where clusters of experimental units are determined by a shared pattern of clustering of another set of experimental units. The proposed model is motivated by the analysis of protein activation data, where we cluster proteins such that all proteins in one cluster give rise to the same clustering of patients. That is, we define clusters of proteins by the way that patients group with respect to the corresponding protein activations. This is in contrast to (almost) all currently available models, which use shared parameters in the sampling model to define clusters. This includes in particular model-based clustering, Dirichlet process mixtures, product partition models, and more. We show results for two typical biostatistical inference problems that give rise to clustering. PMID:26519174

  9. Nonparametric dark energy reconstruction from supernova data.

    PubMed

    Holsclaw, Tracy; Alam, Ujjaini; Sansó, Bruno; Lee, Herbert; Heitmann, Katrin; Habib, Salman; Higdon, David

    2010-12-10

    Understanding the origin of the accelerated expansion of the Universe poses one of the greatest challenges in physics today. Lacking a compelling fundamental theory to test, observational efforts are targeted at a better characterization of the underlying cause. If a new form of mass-energy, dark energy, is driving the acceleration, the redshift evolution of the equation of state parameter w(z) will hold essential clues as to its origin. To best exploit data from observations it is necessary to develop a robust and accurate reconstruction approach, with controlled errors, for w(z). We introduce a new, nonparametric method for solving the associated statistical inverse problem based on Gaussian process modeling and Markov chain Monte Carlo sampling. Applying this method to recent supernova measurements, we reconstruct the continuous history of w out to redshift z=1.5. PMID:21231517
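
    A minimal illustration of the Gaussian-process ingredient of such a reconstruction is sketched below on synthetic data (plain GP regression on noisy w(z) values via scikit-learn). The paper's actual analysis propagates w(z) through the supernova distance relation and samples with MCMC, which is not reproduced here; the data and kernel choices are hypothetical.

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import RBF, WhiteKernel

      # Synthetic stand-in for binned w(z) estimates with measurement noise
      rng = np.random.default_rng(0)
      z = np.sort(rng.uniform(0.0, 1.5, 40))[:, None]
      w_true = -1.0 + 0.2 * np.sin(2.0 * z.ravel())     # hypothetical w(z) history
      w_obs = w_true + rng.normal(0.0, 0.05, z.shape[0])

      kernel = 1.0 * RBF(length_scale=0.5) + WhiteKernel(noise_level=0.05**2)
      gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(z, w_obs)

      z_grid = np.linspace(0.0, 1.5, 200)[:, None]
      w_mean, w_std = gp.predict(z_grid, return_std=True)  # smooth reconstruction with pointwise errors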

  10. Nonparametric k -nearest-neighbor entropy estimator

    NASA Astrophysics Data System (ADS)

    Lombardi, Damiano; Pant, Sanjay

    2016-01-01

    A nonparametric k -nearest-neighbor-based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering nonuniform probability densities in the region of k -nearest neighbors around each sample point. It aims to improve the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-functional relationships leading to high correlation between components of the random variable are present; and third, when the marginal variances of random variable components vary significantly with respect to each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested for a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with a classical estimator, and a significant improvement is demonstrated.

  11. Nonparametric Bayesian evaluation of differential protein quantification

    PubMed Central

    Cansizoglu, A. Ertugrul; Käll, Lukas; Steen, Hanno

    2013-01-01

    Arbitrary cutoffs are ubiquitous in quantitative computational proteomics: maximum acceptable MS/MS PSM or peptide q-value, minimum ion intensity to calculate a fold change, the minimum number of peptides that must be available to trust the estimated protein fold change (or the minimum number of PSMs that must be available to trust the estimated peptide fold change), and the “significant” fold change cutoff. Here we introduce a novel experimental setup and nonparametric Bayesian algorithm for determining the statistical quality of a proposed differential set of proteins or peptides. By comparing putatively non-changing case-control evidence to an empirical null distribution derived from a control-control experiment, we successfully avoid some of these common parameters. We then apply our method to evaluating different fold change rules and find that, for our data, a 1.2-fold change is the most permissive of the plausible fold change rules. PMID:24024742

  12. Nonparametric inference of network structure and dynamics

    NASA Astrophysics Data System (ADS)

    Peixoto, Tiago P.

    The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among

  13. Covariance specification and estimation to improve top-down Green House Gas emission estimates

    NASA Astrophysics Data System (ADS)

    Ghosh, S.; Lopez-Coto, I.; Prasad, K.; Whetstone, J. R.

    2015-12-01

    accuracy, we perform a sensitivity study to further tune covariance parameters. Finally, we introduce a shrinkage-based sample covariance estimation technique for both prior and mismatch covariances. This technique allows us to achieve similar accuracy nonparametrically in a more efficient and automated way.

  14. Essays in applied macroeconomics: Asymmetric price adjustment, exchange rate and treatment effect

    NASA Astrophysics Data System (ADS)

    Gu, Jingping

    This dissertation consists of three essays. Chapter II examines the possible asymmetric response of gasoline prices to crude oil price changes using an error correction model with GARCH errors. Recent papers have looked at this issue. Some of these papers estimate a form of error correction model, but none of them accounts for autoregressive conditional heteroskedasticity in estimation and testing for asymmetry, and none of them takes the response of crude oil price into consideration. We find that time-varying volatility of gasoline price disturbances is an important feature of the data, and when we allow for asymmetric GARCH errors and investigate the system-wide impulse response function, we find evidence of asymmetric adjustment to crude oil price changes in weekly retail gasoline prices. Chapter III discusses the relationship between fiscal deficit and exchange rate. Economic theory predicts that fiscal deficits can significantly affect real exchange rate movements, but existing empirical evidence reports only a weak impact of fiscal deficits on exchange rates. Based on US dollar-based real exchange rates in G5 countries and a flexible varying coefficient model, we show that the previously documented weak relationship between fiscal deficits and exchange rates may be the result of additive specifications, and that the relationship is stronger if we allow fiscal deficits to impact real exchange rates non-additively as well as nonlinearly. We find that the speed of exchange rate adjustment toward equilibrium depends on the state of the fiscal deficit; a fiscal contraction in the US can lead to less persistence in the deviation of exchange rates from fundamentals, and faster mean reversion to the equilibrium. Chapter IV proposes a kernel method to deal with the nonparametric regression model with only discrete covariates as regressors. This new approach is based on the recently developed least-squares cross-validation kernel smoothing method. It can not only automatically smooth

  15. The incredible shrinking covariance estimator

    NASA Astrophysics Data System (ADS)

    Theiler, James

    2012-05-01

    Covariance estimation is a key step in many target detection algorithms. To distinguish target from background requires that the background be well-characterized. This applies to targets ranging from the precisely known chemical signatures of gaseous plumes to the wholly unspecified signals that are sought by anomaly detectors. When the background is modelled by a (global or local) Gaussian or other elliptically contoured distribution (such as Laplacian or multivariate-t), a covariance matrix must be estimated. The standard sample covariance overfits the data, and when the training sample size is small, the target detection performance suffers. Shrinkage addresses the problem of overfitting that inevitably arises when a high-dimensional model is fit from a small dataset. In place of the (overfit) sample covariance matrix, a linear combination of that covariance with a fixed matrix is employed. The fixed matrix might be the identity, the diagonal elements of the sample covariance, or some other underfit estimator. The idea is that the combination of an overfit with an underfit estimator can lead to a well-fit estimator. The coefficient that does this combining, called the shrinkage parameter, is generally estimated by some kind of cross-validation approach, but direct cross-validation can be computationally expensive. This paper extends an approach suggested by Hoffbeck and Landgrebe, and presents efficient approximations of the leave-one-out cross-validation (LOOC) estimate of the shrinkage parameter used in estimating the covariance matrix from a limited sample of data.
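
    A brute-force version of the idea, shrinking the sample covariance toward a scaled identity with the shrinkage parameter chosen by direct leave-one-out cross-validated Gaussian likelihood, is sketched below. The paper's contribution is precisely a fast approximation that avoids this expensive direct loop, so the sketch shows the baseline it accelerates, not the proposed method; all names and data are illustrative.

      import numpy as np
      from scipy.stats import multivariate_normal

      def shrinkage_covariance(X, alphas=np.linspace(0.0, 1.0, 21)):
          """Shrink the sample covariance toward a scaled identity,
          C(a) = (1 - a) * S + a * (tr(S)/p) * I, choosing the shrinkage
          parameter by leave-one-out cross-validated Gaussian log-likelihood."""
          n, p = X.shape
          best_alpha, best_ll = None, -np.inf
          for a in alphas:
              ll = 0.0
              for i in range(n):
                  Xi = np.delete(X, i, axis=0)            # leave sample i out
                  S = np.cov(Xi, rowvar=False)
                  C = (1 - a) * S + a * (np.trace(S) / p) * np.eye(p)
                  ll += multivariate_normal.logpdf(X[i], mean=Xi.mean(axis=0), cov=C)
              if ll > best_ll:
                  best_alpha, best_ll = a, ll
          S = np.cov(X, rowvar=False)
          return (1 - best_alpha) * S + best_alpha * (np.trace(S) / p) * np.eye(p), best_alpha

      rng = np.random.default_rng(0)
      X = rng.normal(size=(25, 10))    # small sample, moderate dimension
      C, alpha = shrinkage_covariance(X)
      print(alpha)                     # substantial shrinkage expected in this regime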

  16. Nonparametric regression of state occupation, entry, exit, and waiting times with multistate right-censored data.

    PubMed

    Mostajabi, Farida; Datta, Somnath

    2013-07-30

    We construct nonparametric regression estimators of a number of temporal functions in a multistate system based on a continuous univariate baseline covariate. These estimators include state occupation probabilities, and state entry, exit, and waiting (sojourn) time distribution functions of a general progressive (e.g., acyclic) multistate model. The data are subject to right censoring, and the censoring mechanism is explainable by observable covariates that may be time dependent. The resulting estimators are valid even if the multistate process is non-Markov. We study the performance of the estimators in two simulation settings and establish large sample consistency of these estimators. We illustrate our estimators using a data set on bone marrow transplant recipients. PMID:23225570

  17. Covariant Electrodynamics in Vacuum

    NASA Astrophysics Data System (ADS)

    Wilhelm, H. E.

    1990-05-01

    The generalized Galilei covariant Maxwell equations and their EM field transformations are applied to the vacuum electrodynamics of a charged particle moving with an arbitrary velocity v in an inertial frame with EM carrier (ether) of velocity w. In accordance with the Galilean relativity principle, all velocities have absolute meaning (relative to the ether frame with isotropic light propagation), and the relative velocity of two bodies is defined by the linear relation uG = v1 - v2. It is shown that the electric equipotential surfaces of a charged particle are compressed in the direction parallel to its relative velocity v - w (mechanism for physical length contraction of bodies). The magnetic field H(r, t) excited in the ether by a charge e moving uniformly with velocity v is related to its electric field E(r, t) by the equation H = ε0(v - w) × E/[1 + w · (v - w)/c0²], which shows that (i) a magnetic field is excited only if the charge moves relative to the ether, and (ii) the magnetic field is weak if |v - w| is not comparable to the velocity of light c0. It is remarkable that a charged particle can excite EM shock waves in the ether if |v - w| > c0. This condition is realizable for anti-parallel charge and ether velocities if |v| > c0 - |w|, i.e., even if |v| is subluminal. The possibility of this Cerenkov effect in the ether is discussed for terrestrial and galactic situations

  18. A semiparametric approach to simultaneous covariance estimation for bivariate sparse longitudinal data.

    PubMed

    Das, Kiranmoy; Daniels, Michael J

    2014-03-01

    Estimation of the covariance structure for irregular sparse longitudinal data has been studied by many authors in recent years but typically using fully parametric specifications. In addition, when data are collected from several groups over time, it is known that assuming the same or completely different covariance matrices over groups can lead to loss of efficiency and/or bias. Nonparametric approaches have been proposed for estimating the covariance matrix for regular univariate longitudinal data by sharing information across the groups under study. For the irregular case, with longitudinal measurements that are bivariate or multivariate, modeling becomes more difficult. In this article, to model bivariate sparse longitudinal data from several groups, we propose a flexible covariance structure via a novel matrix stick-breaking process for the residual covariance structure and a Dirichlet process mixture of normals for the random effects. Simulation studies are performed to investigate the effectiveness of the proposed approach over more traditional approaches. We also analyze a subset of Framingham Heart Study data to examine how the blood pressure trajectories and covariance structures differ for the patients from different BMI groups (high, medium, and low) at baseline. PMID:24400941

  19. A fuzzy, nonparametric segmentation framework for DTI and MRI analysis: with applications to DTI-tract extraction.

    PubMed

    Awate, Suyash P; Zhang, Hui; Gee, James C

    2007-11-01

    This paper presents a novel fuzzy-segmentation method for diffusion tensor (DT) and magnetic resonance (MR) images. Typical fuzzy-segmentation schemes, e.g., those based on fuzzy C-means (FCM), incorporate Gaussian class models that are inherently biased towards ellipsoidal clusters characterized by a mean element and a covariance matrix. Tensors in fiber bundles, however, inherently lie on specific manifolds in Riemannian spaces. Unlike FCM-based schemes, the proposed method represents these manifolds using nonparametric data-driven statistical models. The paper describes a statistically sound (consistent) technique for nonparametric modeling in Riemannian DT spaces. The proposed method produces an optimal fuzzy segmentation by maximizing a novel information-theoretic energy in a Markov-random-field framework. Results on synthetic and real DT and MR images show that the proposed method provides information about the uncertainties in the segmentation decisions, which stem from imaging artifacts including noise, partial voluming, and inhomogeneity. By enhancing the nonparametric model to capture the spatial continuity and structure of the fiber bundle, we exploit the framework to extract the cingulum fiber bundle. Typical tractography methods for tract delineation, incorporating thresholds on fractional anisotropy and fiber curvature to terminate tracking, can face serious problems arising from partial voluming and noise. For these reasons, tractography often fails to extract thin tracts with sharp changes in orientation, such as the cingulum. The results demonstrate that the proposed method extracts this structure significantly more accurately as compared to tractography. PMID:18041267

  20. Covariant Closed String Coherent States

    SciTech Connect

    Hindmarsh, Mark; Skliros, Dimitri

    2011-02-25

    We give the first construction of covariant coherent closed string states, which may be identified with fundamental cosmic strings. We outline the requirements for a string state to describe a cosmic string, and provide an explicit and simple map that relates three different descriptions: classical strings, light cone gauge quantum states, and covariant vertex operators. The resulting coherent state vertex operators have a classical interpretation and are in one-to-one correspondence with arbitrary classical closed string loops.

  1. Covariant closed string coherent states.

    PubMed

    Hindmarsh, Mark; Skliros, Dimitri

    2011-02-25

    We give the first construction of covariant coherent closed string states, which may be identified with fundamental cosmic strings. We outline the requirements for a string state to describe a cosmic string, and provide an explicit and simple map that relates three different descriptions: classical strings, light cone gauge quantum states, and covariant vertex operators. The resulting coherent state vertex operators have a classical interpretation and are in one-to-one correspondence with arbitrary classical closed string loops. PMID:21405564

  2. Nonparametric estimation of the rediscovery rate.

    PubMed

    Lee, Donghwan; Ganna, Andrea; Pawitan, Yudi; Lee, Woojoo

    2016-08-15

    Validation studies have been used to increase the reliability of the statistical conclusions for scientific discoveries; such studies improve the reproducibility of the findings and reduce the possibility of false positives. Here, one of the important roles of statistics is to quantify reproducibility rigorously. Two concepts were recently defined for this purpose: (i) rediscovery rate (RDR), which is the expected proportion of statistically significant findings in a study that can be replicated in the validation study and (ii) false discovery rate in the validation study (vFDR). In this paper, we aim to develop a nonparametric approach to estimate the RDR and vFDR and show an explicit link between the RDR and the FDR. Among other things, the link explains why reproducing statistically significant results even with low FDR level may be difficult. Two metabolomics datasets are considered to illustrate the application of the RDR and vFDR concepts in high-throughput data analysis. Copyright © 2016 John Wiley & Sons, Ltd. PMID:26910365

  3. Nonparametric methods in actigraphy: An update

    PubMed Central

    Gonçalves, Bruno S.B.; Cavalcanti, Paula R.A.; Tavares, Gracilene R.; Campos, Tania F.; Araujo, John F.

    2014-01-01

    Circadian rhythmicity in humans has been well studied using actigraphy, a method of measuring gross motor movement. As actigraphic technology continues to evolve, it is important for data analysis to keep pace with new variables and features. Our objective is to study the behavior of two variables, interdaily stability (IS) and intradaily variability (IV), used to describe the rest-activity rhythm. Simulated data and actigraphy data of humans, rats, and marmosets were used in this study. We modified the method of calculation for IV and IS by varying the time intervals of analysis. For each variable, we calculated the average value (IVm and ISm) across the time intervals. Simulated data showed that (1) synchronization analysis depends on sample size, and (2) fragmentation is independent of the amplitude of the generated noise. We were able to obtain a significant difference in the fragmentation patterns of stroke patients using the IVm variable, a difference that was not detected using IV60. Rhythmic synchronization of activity and rest was significantly higher in young subjects than in adults with Parkinson's disease when using the ISm variable; however, this difference was not seen using IS60. We propose an updated format to calculate rhythmic fragmentation, including two additional optional variables. These alternative methods of nonparametric analysis aim to more precisely detect sleep-wake cycle fragmentation and synchronization. PMID:26483921

  4. Bayesian nonparametric regression with varying residual density.

    PubMed

    Pati, Debdeep; Dunson, David B

    2014-02-01

    We consider the problem of robust Bayesian inference on the mean regression function allowing the residual density to change flexibly with predictors. The proposed class of models is based on a Gaussian process prior for the mean regression function and mixtures of Gaussians for the collection of residual densities indexed by predictors. Initially considering the homoscedastic case, we propose priors for the residual density based on probit stick-breaking (PSB) scale mixtures and symmetrized PSB (sPSB) location-scale mixtures. Both priors restrict the residual density to be symmetric about zero, with the sPSB prior more flexible in allowing multimodal densities. We provide sufficient conditions to ensure strong posterior consistency in estimating the regression function under the sPSB prior, generalizing existing theory focused on parametric residual distributions. The PSB and sPSB priors are generalized to allow residual densities to change nonparametrically with predictors through incorporating Gaussian processes in the stick-breaking components. This leads to a robust Bayesian regression procedure that automatically down-weights outliers and influential observations in a locally-adaptive manner. Posterior computation relies on an efficient data augmentation exact block Gibbs sampler. The methods are illustrated using simulated and real data applications. PMID:24465053

  5. Non-parametric estimation of morphological lopsidedness

    NASA Astrophysics Data System (ADS)

    Giese, Nadine; van der Hulst, Thijs; Serra, Paolo; Oosterloo, Tom

    2016-09-01

    Asymmetries in the neutral hydrogen gas distribution and kinematics of galaxies are thought to be indicators for both gas accretion and gas removal processes. These are of fundamental importance for galaxy formation and evolution. Upcoming large blind H I surveys will provide tens of thousands of galaxies for a study of these asymmetries in a proper statistical way. Due to the large number of expected sources and the limited resolution of the majority of objects, detailed modelling is not feasible for most detections. We need fast, automatic and sensitive methods to classify these objects in an objective way. Existing non-parametric methods suffer from effects like the dependence on signal to noise, resolution and inclination. Here we show how to correctly take these effects into account and show ways to estimate the precision of the methods. We will use existing and modelled data to give an outlook on the performance expected for galaxies observed in the various sky surveys planned for e.g. WSRT/APERTIF and ASKAP.

  6. NONPARAMETRIC BAYESIAN ESTIMATION OF PERIODIC LIGHT CURVES

    SciTech Connect

    Wang Yuyang; Khardon, Roni; Protopapas, Pavlos

    2012-09-01

    Many astronomical phenomena exhibit patterns that have periodic behavior. An important step when analyzing data from such processes is the problem of identifying the period: estimating the period of a periodic function based on noisy observations made at irregularly spaced time points. This problem is still a difficult challenge despite extensive study in different disciplines. This paper makes several contributions toward solving this problem. First, we present a nonparametric Bayesian model for period finding, based on Gaussian Processes (GPs), that does not make assumptions on the shape of the periodic function. As our experiments demonstrate, the new model leads to significantly better results in period estimation especially when the light curve does not exhibit sinusoidal shape. Second, we develop a new algorithm for parameter optimization for GP which is useful when the likelihood function is very sensitive to the parameters with numerous local minima, as in the case of period estimation. The algorithm combines gradient optimization with grid search and incorporates several mechanisms to overcome the high computational complexity of GP. Third, we develop a novel approach for using domain knowledge, in the form of a probabilistic generative model, and incorporate it into the period estimation algorithm. Experimental results validate our approach showing significant improvement over existing methods.
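
    A toy version of GP-based period finding, combining a grid over candidate periods with marginal-likelihood optimization of the remaining kernel parameters (cf. the grid-plus-gradient scheme described above), might look as follows in scikit-learn; the data, kernel, and grid are illustrative, not the paper's algorithm.

      import numpy as np
      from sklearn.gaussian_process import GaussianProcessRegressor
      from sklearn.gaussian_process.kernels import ExpSineSquared, WhiteKernel

      # Irregularly sampled, noisy, non-sinusoidal "light curve" (synthetic)
      rng = np.random.default_rng(3)
      t = np.sort(rng.uniform(0, 20, 120))[:, None]
      y = np.sin(2 * np.pi * t.ravel() / 3.7) ** 3 + rng.normal(0, 0.2, t.shape[0])

      def gp_log_marginal(period):
          # Hold the period fixed; let the GP optimize length scale and noise level
          kernel = ExpSineSquared(length_scale=1.0, periodicity=period,
                                  periodicity_bounds="fixed") + WhiteKernel(noise_level=0.04)
          gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, y)
          return gp.log_marginal_likelihood_value_

      periods = np.linspace(2.0, 6.0, 81)
      scores = [gp_log_marginal(p) for p in periods]
      print(periods[int(np.argmax(scores))])  # should land near the true period 3.7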

  7. White matter microstructure from nonparametric axon diameter distribution mapping.

    PubMed

    Benjamini, Dan; Komlosh, Michal E; Holtzclaw, Lynne A; Nevo, Uri; Basser, Peter J

    2016-07-15

    We report the development of a double diffusion encoding (DDE) MRI method to estimate and map the axon diameter distribution (ADD) within an imaging volume. A variety of biological processes, ranging from development to disease and trauma, may lead to changes in the ADD in the central and peripheral nervous systems. Unlike previously proposed methods, this ADD experimental design and estimation framework employs a more general, nonparametric approach, without a priori assumptions about the underlying form of the ADD, making it suitable to analyze abnormal tissue. In the current study, this framework was used on an ex vivo ferret spinal cord, while emphasizing the way in which the ADD can be weighted by either the number or the volume of the axons. The different weightings, which result in different spatial contrasts, were considered throughout this work. DDE data were analyzed to derive spatially resolved maps of average axon diameter, ADD variance, and extra-axonal volume fraction, along with a novel sub-micron restricted structures map. The morphological information contained in these maps was then used to segment white matter into distinct domains by using a proposed k-means clustering algorithm with spatial contiguity and left-right symmetry constraints, resulting in identifiable white matter tracks. The method was validated by comparing histological measures to the estimated ADDs using a quantitative similarity metric, resulting in good agreement. With further acquisition acceleration and experimental parameters adjustments, this ADD estimation framework could be first used preclinically, and eventually clinically, enabling a wide range of neuroimaging applications for improved understanding of neurodegenerative pathologies and assessing microstructural changes resulting from trauma. PMID:27126002

  8. Evaluation of Tungsten Nuclear Reaction Data with Covariances

    SciTech Connect

    Trkov, A. Capote, R.; Kodeli, I.; Leal, L.

    2008-12-15

    As a follow-up of the work presented at the ND-2007 conference in Nice, additional fast reactor benchmarks were analyzed. Adjustment to the cross sections in the keV region was necessary. Evaluated neutron cross section data files for 180,182,183,184,186W isotopes were produced. Covariances were generated for all isotopes except 180W. In the resonance range the retro-active method was used. Above the resolved resonance range the covariance prior was generated by the Monte Carlo technique from nuclear model calculations with the Empire-II code. Experimental data were taken into account through the GANDR system using the generalized least-squares technique. Introducing experimental data results in relatively small changes in the cross sections, but greatly constrains the uncertainties. The covariance files are currently undergoing testing.

  9. Evaluation of Tungsten Nuclear Reaction Data with Covariances

    SciTech Connect

    Trkov, A.; Capote, R.; Kodeli, I.; Leal, Luiz C.

    2008-12-01

    As a follow-up of the work presented at the ND-2007 conference in Nice, additional fast reactor benchmarks were analyzed. Adjustment to the cross sections in the keV region was necessary. Evaluated neutron cross section data files for 180,182,183,184,186W isotopes were produced. Covariances were generated for all isotopes except 180W. In the resonance range the retro-active method was used. Above the resolved resonance range the covariance prior was generated by the Monte Carlo technique from nuclear model calculations with the Empire-II code. Experimental data were taken into account through the GANDR system using the generalized least-squares technique. Introducing experimental data results in relatively small changes in the cross sections, but greatly constrains the uncertainties. The covariance files are currently undergoing testing.

  10. Shrinkage estimators for covariance matrices.

    PubMed

    Daniels, M J; Kass, R E

    2001-12-01

    Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure. We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically

  11. A nonparametric spatial model for periodontal data with non-random missingness.

    PubMed

    Reich, Brian J; Bandyopadhyay, Dipankar; Bondell, Howard D

    2013-09-01

    Periodontal disease progression is often quantified by clinical attachment level (CAL), defined as the distance down a tooth's root that is detached from the surrounding bone. Measured at 6 locations per tooth throughout the mouth (excluding the molars), it gives rise to a dependent data set-up. These data are often reduced to a one-number summary, such as the whole-mouth average or the number of observations greater than a threshold, to be used as the response in a regression to identify important covariates related to the current state of a subject's periodontal health. Rather than using a simple one-number summary, we set forward to analyze all available CAL data for each subject, exploiting the presence of spatial dependence, non-stationarity, and non-normality. Also, many subjects have a considerable proportion of missing teeth, which cannot be considered missing at random because periodontal disease is the leading cause of adult tooth loss. Under a Bayesian paradigm, we propose a nonparametric, flexible spatial (joint) model of observed CAL and the locations of missing teeth via kernel convolution methods, incorporating the aforementioned features of CAL data under a unified framework. Application of this methodology to a data set recording the periodontal health of an African-American population, as well as simulation studies, reveals the gain in model fit and inference and provides a new perspective on unraveling covariate-response relationships in the presence of the complexities posed by these data. PMID:24288421

  12. Unveiling acoustic physics of the CMB using nonparametric estimation of the temperature angular power spectrum for Planck

    SciTech Connect

    Aghamousa, Amir; Shafieloo, Arman; Arjunwadkar, Mihir; Souradeep, Tarun E-mail: shafieloo@kasi.re.kr E-mail: tarun@iucaa.ernet.in

    2015-02-01

    Estimation of the angular power spectrum is one of the important steps in Cosmic Microwave Background (CMB) data analysis. Here, we present a nonparametric estimate of the temperature angular power spectrum for the Planck 2013 CMB data. The method implemented in this work is model-independent, and allows the data, rather than the model, to dictate the fit. Since one of the main targets of our analysis is to test the consistency of the ΛCDM model with Planck 2013 data, we use the nuisance parameters associated with the best-fit ΛCDM angular power spectrum to remove foreground contributions from the data at multipoles ℓ ≥50. We thus obtain a combined angular power spectrum data set together with the full covariance matrix, appropriately weighted over frequency channels. Our subsequent nonparametric analysis resolves six peaks (and five dips) up to ℓ ∼1850 in the temperature angular power spectrum. We present uncertainties in the peak/dip locations and heights at the 95% confidence level. We further show how these reflect the harmonicity of acoustic peaks, and can be used for acoustic scale estimation. Based on this nonparametric formalism, we found the best-fit ΛCDM model to be at 36% confidence distance from the center of the nonparametric confidence set—this is considerably larger than the confidence distance (9%) derived earlier from a similar analysis of the WMAP 7-year data. Another interesting result of our analysis is that at low multipoles, the Planck data do not suggest any upturn, contrary to the expectation based on the integrated Sachs-Wolfe contribution in the best-fit ΛCDM cosmology.

  13. AFCI-2.0 Neutron Cross Section Covariance Library

    SciTech Connect

    Herman, M.; Oblozinsky, P.; Mattoon, C.M.; Pigni, M.; Hoblit, S.; Mughabghab, S.F.; Sonzogni, A.; Talou, P.; Chadwick, M.B.; Hale, G.M.; Kahler, A.C.; Kawano, T.; Little, R.C.; Yount, P.G.

    2011-03-01

    The cross section covariance library has been under development by a BNL-LANL collaborative effort over the last three years. The project builds on two covariance libraries developed earlier, with considerable input from BNL and LANL. In 2006, an international effort under WPEC Subgroup 26 produced the BOLNA covariance library by putting together data, often preliminary, from various sources for the most important materials for nuclear reactor technology. This was followed in 2007 by a collaborative effort of four US national laboratories to produce covariances, often of modest quality (hence the name low-fidelity), for a virtually complete set of materials included in ENDF/B-VII.0. The present project focuses on covariances of 4-5 major reaction channels for 110 materials of importance for power reactors. The work started under the Global Nuclear Energy Partnership (GNEP) in 2008, which changed to the Advanced Fuel Cycle Initiative (AFCI) in 2009. With the 2011 release the name has changed to the Covariance Multigroup Matrix for Advanced Reactor Applications (COMMARA) version 2.0. The primary purpose of the library is to provide covariances for the AFCI data adjustment project, which is focusing on the needs of fast advanced burner reactors. BNL's responsibility was defined as developing covariances for structural materials and fission products, management of the library, and coordination of the work; LANL's responsibility was defined as covariances for light nuclei and actinides. The COMMARA-2.0 covariance library, developed over the period 2008-2010, contains covariances for 110 materials relevant to fast reactor R&D and is to be used together with the central values of ENDF/B-VII.0, the latest official release of US evaluated neutron cross section files. COMMARA-2.0 contains neutron cross section covariances for 12 light nuclei (coolants and moderators), 78 structural

  14. Network Reconstruction Using Nonparametric Additive ODE Models

    PubMed Central

    Henderson, James; Michailidis, George

    2014-01-01

    Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly, we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactococcus lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative

  15. Adjustment disorder

    MedlinePlus

    American Psychiatric Association. Diagnostic and statistical manual of mental disorders. 5th ed. Arlington, Va: American Psychiatric Publishing. 2013. Powell AD. Grief, bereavement, and adjustment disorders. In: Stern TA, Rosenbaum ...

  16. Fully Bayesian inference under ignorable missingness in the presence of auxiliary covariates

    PubMed Central

    Daniels, M.J.; Wang, C.; Marcus, B.H.

    2014-01-01

    In order to make a missing at random (MAR) or ignorability assumption realistic, auxiliary covariates are often required. However, the auxiliary covariates are not desired in the model for inference. Typical multiple imputation approaches do not assume that the imputation model marginalizes to the inference model. This has been termed ‘uncongenial’ (Meng, 1994). In order to make the two models congenial (or compatible), we would rather not assume a parametric model for the marginal distribution of the auxiliary covariates, but we typically do not have enough data to estimate the joint distribution well non-parametrically. In addition, when the imputation model uses a non-linear link function (e.g., the logistic link for a binary response), the marginalization over the auxiliary covariates to derive the inference model typically results in a difficult-to-interpret form for the effect of covariates. In this article, we propose a fully Bayesian approach to ensure that the models are compatible for incomplete longitudinal data by embedding an interpretable inference model within an imputation model, which also addresses the two complications described above. We evaluate the approach via simulations and implement it on a recent clinical trial. PMID:24571539

  17. A class of covariate-dependent spatiotemporal covariance functions

    PubMed Central

    Reich, Brian J; Eidsvik, Jo; Guindani, Michele; Nail, Amy J; Schmidt, Alexandra M.

    2014-01-01

    In geostatistics, it is common to model spatially distributed phenomena through an underlying stationary and isotropic spatial process. However, these assumptions are often untenable in practice because of the influence of local effects in the correlation structure. Therefore, it has been of prolonged interest in the literature to provide flexible and effective ways to model non-stationarity in the spatial effects. Arguably, due to the local nature of the problem, we might envision that the correlation structure would be highly dependent on local characteristics of the domain of study, namely the latitude, longitude and altitude of the observation sites, as well as other locally defined covariate information. In this work, we provide a flexible and computationally feasible way for allowing the correlation structure of the underlying processes to depend on local covariate information. We discuss the properties of the induced covariance functions and discuss methods to assess its dependence on local covariate information by means of a simulation study and the analysis of data observed at ozone-monitoring stations in the Southeast United States. PMID:24772199

  18. Rediscovery of Good-Turing estimators via Bayesian nonparametrics.

    PubMed

    Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye

    2016-03-01

    The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two-parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. PMID:26224325
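
    For orientation, the basic Good-Turing estimate of the discovery probability (the probability that the next draw is a previously unseen species) is simply the fraction of singletons in the sample. The sketch below uses hypothetical data; the Bayesian nonparametric smoothing studied in the paper is not shown.

      import numpy as np
      from collections import Counter

      def good_turing_discovery(sample):
          """Basic Good-Turing estimate from a list of observed species labels:
          the probability that the next observation is a new, unseen species
          is m1/n, where m1 is the number of species seen exactly once."""
          n = len(sample)
          freqs = Counter(sample)                       # species -> count
          m1 = sum(1 for c in freqs.values() if c == 1)
          return m1 / n

      # Example: EST-like data with many rare "genes" (heavy-tailed abundances)
      rng = np.random.default_rng(0)
      sample = rng.zipf(1.6, size=2000).tolist()
      print(good_turing_discovery(sample))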

  19. Are Eddy Covariance series stationary?

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Spectral analysis via a discrete Fourier transform is used often to examine eddy covariance series for cycles (eddies) of interest. Generally the analysis is performed on hourly or half-hourly data sets collected at 10 or 20 Hz. Each original series is often assumed to be stationary. Also automated ...

  20. Covariance Modifications to Subspace Bases

    SciTech Connect

    Harris, D B

    2008-11-19

    Adaptive signal processing algorithms that rely upon representations of signal and noise subspaces often require updates to those representations when new data become available. Subspace representations frequently are estimated from available data with singular value decompositions (SVDs). Subspace updates require modifications to these decompositions. Updates can be performed inexpensively provided they are low-rank. A substantial literature on SVD updates exists, frequently focusing on rank-1 updates (see e.g. [Karasalo, 1986; Comon and Golub, 1990; Badeau, 2004]). In these methods, data matrices are modified by addition or deletion of a row or column, or data covariance matrices are modified by addition of the outer product of a new vector. A recent paper by Brand [2006] provides a general and efficient method for arbitrary rank updates to an SVD. The purpose of this note is to describe a closely-related method for applications where right singular vectors are not required. This note also describes SVD updates in a particular scenario of interest in seismic array signal processing. The particular application involves updating the wideband subspace representation used in seismic subspace detectors [Harris, 2006]. These subspace detectors generalize waveform correlation algorithms to detect signals that lie in a subspace of waveforms of dimension d ≥ 1. They potentially are of interest because they extend the range of waveform variation over which these sensitive detectors apply. Subspace detectors operate by projecting waveform data from a detection window onto a subspace specified by a collection of orthonormal waveform basis vectors (referred to as the template). Subspace templates are constructed from a suite of normalized, aligned master event waveforms that may be acquired by a single sensor, a three-component sensor, an array of such sensors or a sensor network. The template design process entails constructing a data matrix whose columns contain the
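
    To make the detector concrete, a minimal sketch of template construction and of a detection statistic (the fraction of window energy captured by the subspace) follows. It illustrates the projection step described above, not the SVD update method that is the subject of this note, and all signals are synthetic.

      import numpy as np

      def build_template(master_waveforms, d):
          """Orthonormal subspace template: top-d left singular vectors of a
          matrix whose columns are normalized, aligned master event waveforms."""
          M = np.column_stack([w / np.linalg.norm(w) for w in master_waveforms])
          U, _, _ = np.linalg.svd(M, full_matrices=False)
          return U[:, :d]

      def subspace_statistic(U, x):
          """Fraction of window energy captured by the subspace, in [0, 1]."""
          proj = U.T @ x
          return float(proj @ proj) / float(x @ x)

      # Example: the detector scores a master-like waveform far above noise
      rng = np.random.default_rng(2)
      t = np.linspace(0, 1, 400)
      masters = [np.sin(2 * np.pi * 8 * t) * np.exp(-3 * t + rng.normal(0, 0.05))
                 for _ in range(5)]
      U = build_template(masters, d=2)
      signal = masters[0] + rng.normal(0, 0.1, t.size)
      noise = rng.normal(0, 1.0, t.size)
      print(subspace_statistic(U, signal), subspace_statistic(U, noise))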

  1. Graph embedded nonparametric mutual information for supervised dimensionality reduction.

    PubMed

    Bouzas, Dimitrios; Arvanitopoulos, Nikolaos; Tefas, Anastasios

    2015-05-01

    In this paper, we propose a novel algorithm for dimensionality reduction that uses as a criterion the mutual information (MI) between the transformed data and their corresponding class labels. The MI is a powerful criterion that can be used as a proxy to the Bayes error rate. Furthermore, recent quadratic nonparametric implementations of MI are computationally efficient and do not require any prior assumptions about the class densities. We show that the quadratic nonparametric MI can be formulated as a kernel objective in the graph embedding framework. Moreover, we propose its linear equivalent as a novel linear dimensionality reduction algorithm. The derived methods are compared against the state-of-the-art dimensionality reduction algorithms with various classifiers and on various benchmark and real-life datasets. The experimental results show that nonparametric MI as an optimization objective for dimensionality reduction gives comparable, and in most cases better, results than other dimensionality reduction methods. PMID:25881367

  2. Mathematical models for nonparametric inferences from line transect data

    USGS Publications Warehouse

    Burnham, K.P.; Anderson, D.R.

    1976-01-01

    A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right angle or sighting distances. The probability of observing a point given its right angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown there are nonparametric approaches to density estimation using the observed right angle distances. The model is then generalized to include sighting distances (r). Let f(y|r) be the conditional distribution of right angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires that we know the transformation of r given by f(0|r).
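
    For intuition, a common nonparametric route from right-angle distances to density is the estimator D = n f(0) / (2L), where f(0) is the density of right-angle distances at zero and L is the total transect length. The sketch below is illustrative only (it is not the paper's estimator) and obtains f(0) from a kernel density estimate reflected about zero:

        import numpy as np
        from scipy.stats import gaussian_kde

        def density_line_transect(y, L):
            """y: right-angle distances; L: total transect length.
            D-hat = n * f(0) / (2L), with f(0) from a reflected KDE."""
            y = np.asarray(y, dtype=float)
            kde = gaussian_kde(np.concatenate([y, -y]))  # reflect about 0
            f0 = 2.0 * kde(0.0)[0]                       # folded density at 0
            return len(y) * f0 / (2.0 * L)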

  3. Predicting Market Impact Costs Using Nonparametric Machine Learning Models.

    PubMed

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance. PMID:26926235

  4. Predicting Market Impact Costs Using Nonparametric Machine Learning Models

    PubMed Central

    Park, Saerom; Lee, Jaewook; Son, Youngdoo

    2016-01-01

    Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed the state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single transaction data of US stock market from Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives in reducing transaction costs by considerably improving in prediction performance. PMID:26926235

  5. A new approach to modeling covariate effects and individualization in population pharmacokinetics-pharmacodynamics.

    PubMed

    Lai, Tze Leung; Shih, Mei-Chiung; Wong, Samuel P

    2006-02-01

    By combining Laplace's approximation and Monte Carlo methods to evaluate multiple integrals, this paper develops a new approach to estimation in nonlinear mixed effects models that are widely used in population pharmacokinetics and pharmacodynamics. Estimation here involves not only estimating the model parameters from Phase I and II studies but also using the fitted model to estimate the concentration versus time curve or the drug effects of a subject who has covariate information but sparse measurements. Because of its computational tractability, the proposed approach can model the covariate effects nonparametrically by using (i) regression splines or neural networks as basis functions and (ii) AIC or BIC for model selection. Its computational and statistical advantages are illustrated in simulation studies and in Phase I trials. PMID:16402288
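
    The nonparametric covariate modeling described above can be sketched with regression splines and AIC-based selection. The following is a minimal illustration (hypothetical function names, a truncated power basis, and a Gaussian AIC), not the authors' code:

        import numpy as np

        def truncated_power_basis(x, knots):
            """Linear regression-spline basis: [1, x, (x - k)_+ per knot]."""
            cols = [np.ones_like(x), x] + [np.clip(x - k, 0, None) for k in knots]
            return np.column_stack(cols)

        def fit_with_aic(x, y, candidate_knots):
            """Pick the knot set minimizing AIC under a Gaussian error model."""
            best = None
            for knots in candidate_knots:
                X = truncated_power_basis(x, knots)
                beta, *_ = np.linalg.lstsq(X, y, rcond=None)
                rss = np.sum((y - X @ beta) ** 2)
                n, p = len(y), X.shape[1]
                aic = n * np.log(rss / n) + 2 * p
                if best is None or aic < best[0]:
                    best = (aic, knots, beta)
            return best   # (aic, chosen knots, fitted coefficients)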

  6. Adjustable microforceps.

    PubMed

    Bao, J Y

    1991-04-01

    The commonly used microforceps have a much greater opening distance and spring resistance than needed. A piece of plastic ring or rubber band can be used to adjust the opening distance and reduce most of the spring resistance, making the user feel more comfortable and less fatigued. PMID:2051437

  7. Neutron Cross Section Covariances for Structural Materials and Fission Products

    SciTech Connect

    Hoblit, S.; Hoblit,S.; Cho,Y.-S.; Herman,M.; Mattoon,C.M.; Mughabghab,S.F.; Oblozinsky,P.; Pigni,M.T.; Sonzogni,A.A.

    2011-12-01

    We describe neutron cross section covariances for 78 structural materials and fission products produced for the new US evaluated nuclear reaction library ENDF/B-VII.1. Neutron incident energies cover the full range from 10⁻⁵ eV to 20 MeV, and covariances are primarily provided for capture, elastic and inelastic scattering as well as (n,2n). The list of materials follows priorities defined by the Advanced Fuel Cycle Initiative, the major application being data adjustment for advanced fast reactor systems. Thus, in addition to 28 structural materials and 49 fission products, the list also includes ²³Na, which is an important fast reactor coolant. Due to the extensive number of materials, we adopted a variety of methodologies depending on the priority of a specific material. In the resolved resonance region we primarily used resonance parameter uncertainties given in the Atlas of Neutron Resonances and either applied the kernel approximation to propagate these uncertainties into cross section uncertainties or resorted to simplified estimates based on integral quantities. For several priority materials we adopted MF32 covariances produced by SAMMY at ORNL, modified by us by adding MF33 covariances to account for systematic uncertainties. In the fast neutron region we resorted to three methods. The most sophisticated was the EMPIRE-KALMAN method, which combines experimental data from the EXFOR library with nuclear reaction modeling and least-squares fitting. The two other methods used simplified estimates, either based on the propagation of nuclear reaction model parameter uncertainties or on a dispersion analysis of central cross section values in recent evaluated data files. All covariances were subject to quality assurance procedures adopted recently by CSEWG. In addition, tools were developed to allow inspection of processed covariances and computed integral quantities, and to compare these values to data from the Atlas and the astrophysics database KADoNiS.

  8. Neutron Cross Section Covariances for Structural Materials and Fission Products

    NASA Astrophysics Data System (ADS)

    Hoblit, S.; Cho, Y.-S.; Herman, M.; Mattoon, C. M.; Mughabghab, S. F.; Obložinský, P.; Pigni, M. T.; Sonzogni, A. A.

    2011-12-01

    We describe neutron cross section covariances for 78 structural materials and fission products produced for the new US evaluated nuclear reaction library ENDF/B-VII.1. Neutron incident energies cover the full range from 10⁻⁵ eV to 20 MeV, and covariances are primarily provided for capture, elastic and inelastic scattering as well as (n,2n). The list of materials follows priorities defined by the Advanced Fuel Cycle Initiative, the major application being data adjustment for advanced fast reactor systems. Thus, in addition to 28 structural materials and 49 fission products, the list also includes ²³Na, which is an important fast reactor coolant. Due to the extensive number of materials, we adopted a variety of methodologies depending on the priority of a specific material. In the resolved resonance region we primarily used resonance parameter uncertainties given in the Atlas of Neutron Resonances and either applied the kernel approximation to propagate these uncertainties into cross section uncertainties or resorted to simplified estimates based on integral quantities. For several priority materials we adopted MF32 covariances produced by SAMMY at ORNL, modified by us by adding MF33 covariances to account for systematic uncertainties. In the fast neutron region we resorted to three methods. The most sophisticated was the EMPIRE-KALMAN method, which combines experimental data from the EXFOR library with nuclear reaction modeling and least-squares fitting. The two other methods used simplified estimates, either based on the propagation of nuclear reaction model parameter uncertainties or on a dispersion analysis of central cross section values in recent evaluated data files. All covariances were subject to quality assurance procedures adopted recently by CSEWG. In addition, tools were developed to allow inspection of processed covariances and computed integral quantities, and to compare these values to data from the Atlas and the astrophysics database KADoNiS.

  9. Covariate analysis of survival data: a small-sample study of Cox's model

    SciTech Connect

    Johnson, M.E.; Tolley, H.D.; Bryson, M.C.; Goldman, A.S.

    1982-09-01

    Cox's proportional-hazards model is frequently used to adjust for covariate effects in survival-data analysis. The small-sample performances of the maximum partial likelihood estimators of the regression parameters in a two-covariate hazard function model are evaluated with respect to bias, variance, and power in hypothesis tests. Previous Monte Carlo work on the two-sample problem is reviewed.
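
    For reference, the partial-likelihood machinery being evaluated can be written compactly. The sketch below (assuming no tied event times; hypothetical names; not the study's Monte Carlo code) fits Cox's model by Newton-Raphson on the partial log-likelihood:

        import numpy as np

        def cox_fit(times, events, X, iters=25):
            """Newton-Raphson for Cox partial likelihood (no tied event times).
            times: (n,) numpy array; events: (n,) 0/1; X: (n, p) covariates."""
            order = np.argsort(times)
            times, events, X = times[order], events[order], X[order]
            n, p = X.shape
            beta = np.zeros(p)
            for _ in range(iters):
                grad = np.zeros(p)
                hess = np.zeros((p, p))
                for i in range(n):
                    if events[i] != 1:
                        continue
                    R = slice(i, n)              # sorted, so risk set is a suffix
                    w = np.exp(X[R] @ beta)
                    xbar = (w @ X[R]) / w.sum()  # weighted mean over risk set
                    grad += X[i] - xbar
                    xc = X[R] - xbar
                    hess += (w[:, None] * xc).T @ xc / w.sum()
                beta += np.linalg.solve(hess, grad)
            return beta, np.linalg.inv(hess)     # inverse Hessian ~ covariance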

  10. Minimal unitary (covariant) scattering theory

    SciTech Connect

    Lindesay, J.V.; Markevich, A.

    1983-06-01

    In the minimal three-particle equations developed by Lindesay, the two-body input amplitude was an on-shell relativistic generalization of the non-relativistic scattering model characterized by a single mass parameter μ, which in the two-body (m + m) system looks like an s-channel bound state (μ < 2m) or virtual state (μ > 2m). Using this driving term in covariant Faddeev equations generates a rich covariant and unitary three-particle dynamics. However, the simplest way of writing the relativistic generalization of the Faddeev equations can take the on-shell Mandelstam parameter s = 4(q² + m²), in terms of which the two-particle input is expressed, to negative values in the range of integration required by the dynamics. This problem was met in the original treatment by multiplying the two-particle input amplitude by Θ(s). This paper provides what we hope to be a more direct way of meeting the problem.

  11. Determination of Resonance Parameters and their Covariances from Neutron Induced Reaction Cross Section Data

    SciTech Connect

    Schillebeeckx, P.; Becker, B.; Danon, Y.; Guber, K.; Harada, H.; Heyse, J.; Junghans, A.R.; Kopecky, S.; Massimi, C.; Moxon, M.C.; Otuka, N.; Sirakov, I.; Volev, K.

    2012-12-15

    Cross section data in the resolved and unresolved resonance region are represented by nuclear reaction formalisms using parameters which are determined by fitting them to experimental data. Therefore, the quality of evaluated cross sections in the resonance region strongly depends on the experimental data used in the adjustment process and an assessment of the experimental covariance data is of primary importance in determining the accuracy of evaluated cross section data. In this contribution, uncertainty components of experimental observables resulting from total and reaction cross section experiments are quantified by identifying the metrological parameters involved in the measurement, data reduction and analysis process. In addition, different methods that can be applied to propagate the covariance of the experimental observables (i.e. transmission and reaction yields) to the covariance of the resonance parameters are discussed and compared. The methods being discussed are: conventional uncertainty propagation, Monte Carlo sampling and marginalization. It is demonstrated that the final covariance matrix of the resonance parameters not only strongly depends on the type of experimental observables used in the adjustment process, the experimental conditions and the characteristics of the resonance structure, but also on the method that is used to propagate the covariances. Finally, a special data reduction concept and format is presented, which offers the possibility to store the full covariance information of experimental data in the EXFOR library and provides the information required to perform a full covariance evaluation.
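
    Of the three propagation methods compared, Monte Carlo sampling is the simplest to sketch: draw parameter sets from the fitted covariance, push each through the cross-section model, and form the empirical covariance of the outputs. An illustrative sketch follows (the model interface is hypothetical):

        import numpy as np

        rng = np.random.default_rng(0)

        def mc_propagate(theta, cov_theta, model, nsamp=10000):
            """Monte Carlo propagation of a parameter covariance through a model.
            theta: best-fit resonance parameters; cov_theta: their covariance;
            model: callable mapping a parameter vector to cross sections."""
            draws = rng.multivariate_normal(theta, cov_theta, size=nsamp)
            out = np.array([model(t) for t in draws])
            return out.mean(axis=0), np.cov(out, rowvar=False)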

  12. Covariant jump conditions in electromagnetism

    NASA Astrophysics Data System (ADS)

    Itin, Yakov

    2012-02-01

    A generally covariant four-dimensional representation of Maxwell's electrodynamics in a generic material medium can be achieved straightforwardly in the metric-free formulation of electromagnetism. In this setup, the electromagnetic phenomena are described by two tensor fields, which satisfy Maxwell's equations. A generic tensorial constitutive relation between these fields is an independent ingredient of the theory. By use of different constitutive relations (local and non-local, linear and non-linear, etc.), a wide area of applications can be covered. In the current paper, we present the jump conditions for the fields and for the energy-momentum tensor on an arbitrarily moving surface between two media. From the differential and integral Maxwell equations, we derive the covariant boundary conditions, which are independent of any metric and connection. These conditions include the covariantly defined surface current and are applicable to an arbitrarily moving smooth curved boundary surface. As an application of the presented jump formulas, we derive a Lorentzian type metric as a condition for existence of the wave front in isotropic media. This result holds for ordinary materials as well as for metamaterials with negative material constants.

  13. Shaft adjuster

    DOEpatents

    Harry, Herbert H.

    1989-01-01

    Apparatus and method for the adjustment and alignment of shafts in high power devices. A plurality of adjacent rotatable angled cylinders are positioned between a base and the shaft to be aligned which when rotated introduce an axial offset. The apparatus is electrically conductive and constructed of a structurally rigid material. The angled cylinders allow the shaft such as the center conductor in a pulse line machine to be offset in any desired alignment position within the range of the apparatus.

  14. Covariance matrices for use in criticality safety predictability studies

    SciTech Connect

    Derrien, H.; Larson, N.M.; Leal, L.C.

    1997-09-01

    Criticality predictability applications require as input the best available information on fissile and other nuclides. In recent years important work has been performed in the analysis of neutron transmission and cross-section data for fissile nuclei in the resonance region by using the computer code SAMMY. The code uses Bayes method (a form of generalized least squares) for sequential analyses of several sets of experimental data. Values for Reich-Moore resonance parameters, their covariances, and the derivatives with respect to the adjusted parameters (data sensitivities) are obtained. In general, the parameter file contains several thousand values and the dimension of the covariance matrices is correspondingly large. These matrices are not reported in the current evaluated data files due to their large dimensions and to the inadequacy of the file formats. The present work has two goals: the first is to calculate the covariances of group-averaged cross sections from the covariance files generated by SAMMY, because these can be more readily utilized in criticality predictability calculations. The second goal is to propose a more practical interface between SAMMY and the evaluated files. Examples are given for ²³⁵U in the popular 199- and 238-group structures, using the latest ORNL evaluation of the ²³⁵U resonance parameters.
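
    The first goal, collapsing parameter covariances to group-averaged cross-section covariances, amounts to a first-order sandwich rule: Cov_g = S Cov_p Sᵀ, with S the matrix of sensitivities of group cross sections to resonance parameters. A minimal sketch with hypothetical names:

        import numpy as np

        def group_covariance(S, cov_param):
            """First-order propagation of parameter covariances to
            group-averaged cross sections: Cov_g = S Cov_p S^T, where
            S[g, j] = d(sigma_g) / d(p_j) is the sensitivity matrix."""
            return S @ cov_param @ S.T

        def correlation(cov):
            """Covariance -> correlation matrix, for easier inspection."""
            d = np.sqrt(np.diag(cov))
            return cov / np.outer(d, d)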

  15. Surface Estimation, Variable Selection, and the Nonparametric Oracle Property

    PubMed Central

    Storlie, Curtis B.; Bondell, Howard D.; Reich, Brian J.; Zhang, Hao Helen

    2010-01-01

    Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting. PMID:21603586

  16. A Unifying Framework for Teaching Nonparametric Statistical Tests

    ERIC Educational Resources Information Center

    Bargagliotti, Anna E.; Orrison, Michael E.

    2014-01-01

    Increased importance is being placed on statistics at both the K-12 and undergraduate level. Research divulging effective methods to teach specific statistical concepts is still widely sought after. In this paper, we focus on best practices for teaching topics in nonparametric statistics at the undergraduate level. To motivate the work, we…

  17. Coefficients of Association Analogous to Pearson's r for Nonparametric Statistics.

    ERIC Educational Resources Information Center

    Stavig, Gordon; Acock, Alan C.

    1980-01-01

    Two r coefficients of association are discussed. One of the coefficients can be applied to any nonparametric test statistic (NTS) in which a normal approximation equation is appropriate. The other coefficient is applicable to any NTS in which exact probabilities are known. (Author/RL)

  18. Three Classes of Nonparametric Differential Step Functioning Effect Estimators

    ERIC Educational Resources Information Center

    Penfield, Randall D.

    2008-01-01

    The examination of measurement invariance in polytomous items is complicated by the possibility that the magnitude and sign of lack of invariance may vary across the steps underlying the set of polytomous response options, a concept referred to as differential step functioning (DSF). This article describes three classes of nonparametric DSF effect…

  19. Nonparametric identification of petrogenic and pyrogenic hydrocarbons in aquatic ecosystems.

    PubMed

    Carls, Mark G

    2006-07-01

    Novel nonparametric models developed herein discriminated between oiled and nonoiled or pyrogenic and oiled sources better than traditionally used diagnostic ratios and can outperform previously published oil identification models. These methods were compared using experimental and environmental hydrocarbon data (sediment, mussels, water, and fish) associated with the Exxon Valdez oil spill. Several nonparametric models were investigated, one designed to detect petroleum in general, one specific to Alaska North Slope crude oil (ANS), and one designed to detect pyrogenic PAH. These ideas are intended as guidance; nonparametric models can easily be adapted to fit the specific needs of a variety of petrogenic and pyrogenic sources. Oil identification was clearly difficult where composition was modified by physical or biological processes; model results differed most in these cases, suggesting that a multiple model approach to source discrimination may be useful where data interpretation is contentious. However, a combined nonparametric model best described a broad range of hydrocarbon sources, thus providing a useful new analytical assessment tool. PMID:16856740

  20. Nonparametric Item Response Curve Estimation with Correction for Measurement Error

    ERIC Educational Resources Information Center

    Guo, Hongwen; Sinharay, Sandip

    2011-01-01

    Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…

  1. A Simulation Comparison of Parametric and Nonparametric Dimensionality Detection Procedures

    ERIC Educational Resources Information Center

    Mroch, Andrew A.; Bolt, Daniel M.

    2006-01-01

    Recently, nonparametric methods have been proposed that provide a dimensionally based description of test structure for tests with dichotomous items. Because such methods are based on different notions of dimensionality than are assumed when using a psychometric model, it remains unclear whether these procedures might lead to a different…

  2. Estimation of Spatial Dynamic Nonparametric Durbin Models with Fixed Effects

    ERIC Educational Resources Information Center

    Qian, Minghui; Hu, Ridong; Chen, Jianwei

    2016-01-01

    Spatial panel data models have been widely studied and applied in both scientific and social science disciplines, especially in the analysis of spatial influence. In this paper, we consider the spatial dynamic nonparametric Durbin model (SDNDM) with fixed effects, which takes the nonlinear factors into account based on the spatial dynamic panel…

  3. Statistical Evidence in Salary Discrimination Studies: Nonparametric Inferential Conditions.

    ERIC Educational Resources Information Center

    Millsap, Roger E.; Meredith, William

    1994-01-01

    Theoretical nonparametric conditions under which evidence from salary studies using observed merit measures can provide a basis for inferences of fairness are discussed. Latent variable models as parametric special cases of the general conditions presented are illustrated with real salary data. Implications for empirical studies of salary equity…

  4. Adjusted Rasch person-fit statistics.

    PubMed

    Dimitrov, Dimiter M; Smith, Richard M

    2006-01-01

    Two frequently used parametric statistics of person-fit with the dichotomous Rasch model (RM) are adjusted and compared to each other and to their original counterparts in terms of power to detect aberrant response patterns in short tests (10, 20, and 30 items). Specifically, the cube root transformation of the mean square for the unweighted person-fit statistic, t, and the standardized likelihood-based person-fit statistic Z3 were adjusted by estimating the probability for correct item response through the use of symmetric functions in the dichotomous Rasch model. The results for simulated unidimensional Rasch data indicate that t and Z3 are consistently, yet not greatly, outperformed by their adjusted counterparts, denoted t* and Z3*, respectively. The four parametric statistics, t, Z3, t*, and Z3*, were also compared to a non-parametric statistic, HT, identified in recent research as outperforming numerous parametric and non-parametric person-fit statistics. The results show that HT substantially outperforms t, Z3, t*, and Z3* in detecting aberrant response patterns for 20-item and 30-item tests, but not for very short tests of 10 items. The detection power of t, Z3, t*, Z3*, and HT at two specific levels of Type I error, .10 and .05 (i.e., up to 10% and 5% false alarm rate, respectively), is also reported. PMID:16632900

  5. Robust estimation for partially linear models with large-dimensional covariates

    PubMed Central

    Zhu, LiPing; Li, RunZe; Cui, HengJian

    2014-01-01

    We are concerned with robust estimation procedures to estimate the parameters in partially linear models with large-dimensional covariates. To enhance the interpretability, we suggest implementing a nonconcave regularization method in the robust estimation procedure to select important covariates from the linear component. We establish the consistency for both the linear and the nonlinear components when the covariate dimension diverges at the rate of o(n), where n is the sample size. We show that the robust estimate of the linear component performs asymptotically as well as its oracle counterpart, which assumes the baseline function and the unimportant covariates were known a priori. With a consistent estimator of the linear component, we estimate the nonparametric component by a robust local linear regression. It is proved that the robust estimate of the nonlinear component performs asymptotically as well as if the linear component were known in advance. Comprehensive simulation studies are carried out and an application is presented to examine the finite-sample performance of the proposed procedures. PMID:24955087

  6. Topics in data adjustment theory and applications

    SciTech Connect

    Hwang, R.N.

    1988-01-01

    The methodologies of uncertainty analysis and data adjustment have been well developed and widely used abroad since the early 1970s. With the limited amount of covariance data on the differential cross sections and the integral experiments available at the time, their accomplishments are, indeed, astounding. The fundamental adjustment equations, however, remain qualitatively unchanged. For the past few years, extensive efforts on these subjects have also begun at ANL in order to utilize the massive amount of integral experiments accumulated over the years to provide the basis for improving the reactor parameters encountered in various design calculations. Pertinent covariance matrices and sensitivity matrices of the existing integral experiments have been evaluated and systematically compiled in the data files, along with the cross section covariance data derived from ENDF/B-V for the 21-group structure currently under consideration. A production code, GMADJ, that provides the adjusted quantities for a large number of cross section types has been developed by Poenitz for routine applications. The primary purpose of the present paper is to improve understanding of the application-oriented issues important to the data adjustment theory and the subsequent usage of the adjusted quantities in design calculations in support of these activities. 30 refs., 12 figs., 5 tabs.
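
    The fundamental adjustment equations referred to above take a generalized-least-squares (Kalman-like) form: given prior parameters p with covariance C, integral measurements m with covariance V, and a sensitivity matrix S, the adjusted parameters are p + CSᵀ(SCSᵀ + V)⁻¹(m − Sp). A minimal sketch, assuming linearized responses and hypothetical names:

        import numpy as np

        def gls_adjust(p, C, S, m, V):
            """Generalized-least-squares data adjustment.
            p: prior parameters; C: their covariance;
            S: sensitivities of integral responses to p;
            m: measured integral values; V: their covariance.
            Returns adjusted parameters and their reduced covariance."""
            K = C @ S.T @ np.linalg.inv(S @ C @ S.T + V)   # gain matrix
            p_adj = p + K @ (m - S @ p)
            C_adj = C - K @ S @ C
            return p_adj, C_adj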

  7. Covariance Evaluation Methodology for Neutron Cross Sections

    SciTech Connect

    Herman, M.; Arcilla, R.; Mattoon, C.M.; Mughabghab, S.F.; Oblozinsky, P.; Pigni, M.; Pritychenko, B.; Sonzogni, A.A.

    2008-09-01

    We present the NNDC-BNL methodology for estimating neutron cross section covariances in the thermal, resolved resonance, unresolved resonance and fast neutron regions. The three key elements of the methodology are the Atlas of Neutron Resonances, the nuclear reaction code EMPIRE, and a Bayesian code implementing the Kalman filter concept. The covariance data processing, visualization and distribution capabilities are integral components of the NNDC methodology. We illustrate its application with examples including a relatively detailed evaluation of covariances for two individual nuclei and the massive production of simple covariance estimates for 307 materials. Certain peculiarities regarding the evaluation of covariances for resolved resonances and the consistency between resonance parameter uncertainties and thermal cross section uncertainties are also discussed.

  8. Connecting Math and Motion: A Covariational Approach

    NASA Astrophysics Data System (ADS)

    Culbertson, Robert J.; Thompson, A. S.

    2006-12-01

    We define covariational reasoning as the ability to correlate changes in two connected variables. For example, the ability to describe the height of fluid in an odd-shaped vessel as a function of fluid volume requires covariational reasoning skills. Covariational reasoning ability is an essential resource for gaining a deep understanding of the physics of motion. We have developed an approach for teaching physical science to in-service math and science high school teachers that emphasizes covariational reasoning. Several examples of covariation and results from a small cohort of local teachers will be presented.

  9. A Nonparametric Simulator for Multivariate Random Variables with Differing Marginal Densities and Non-linear Dependence with Hydroclimatic Applications

    NASA Astrophysics Data System (ADS)

    Farnham, D.; Lall, U.; Devineni, N.; Rahill-Marier, B.

    2013-12-01

    Hydrologic models often require as inputs stochastic simulations of meteorological variables that are mutually consistent and spatially coherent, i.e., have marginal and joint probability densities that correspond to those estimated from physically realized states, and have the appropriate spatial structure. These inputs may come from historical meteorological data, or from relatively small ensembles of integrations of numerical climate and weather models. Often, empirical modeling or simulation of multiple hydroclimatic variables, or simulations of hydrologic variables at multiple sites that respect the spatial co-variability may also be desired. A nonparametric simulation strategy is presented that is capable of 1) addressing marginal probability density functions that are different for each variable of interest and 2) reproducing the joint probability distribution across a potentially large set of variables or spatial instances. The application of rainfall simulations developed from historic rain gauge and radar data is explored. Such simulations are useful for urban hydrological modelers seeking more spatially resolved precipitation forcings.

  10. Parametric approaches to quality-adjusted survival analysis. International Breast Cancer Study Group.

    PubMed

    Cole, B F; Gelber, R D; Anderson, K M

    1994-09-01

    We present a parametric methodology for performing quality-of-life-adjusted survival analysis using multivariate censored survival data. It represents a generalization of the nonparametric Q-TWiST method (Quality-adjusted Time without Symptoms and Toxicity). The event times correspond to transitions between states of health that differ in terms of quality of life. Each transition is governed by a competing risks model where the health states are the competing risks. Overall survival is the sum of the amount of time spent in each health state. The first step of the proposed methodology consists of defining a quality function that assigns a "score" to a life having given health state transitions. It is a composite measure of both quantity and quality of life. In general, the quality function assigns a small value to a short life with poor quality and a high value to a long life with good quality. In the second step, parametric survival models are fit to the data. This is done by repeatedly modeling the conditional cause-specific hazard functions given the previous transitions. Covariates are incorporated by accelerated failure time regression, and the model parameters are estimated by maximum likelihood. Lastly, the modeling results are used to estimate the expectation of quality functions. Standard errors and confidence intervals are computed using the bootstrap and delta methods. The results are useful for simultaneously evaluating treatments in terms of quantity and quality of life. To demonstrate the proposed methods, we perform an analysis of data from the International Breast Cancer Study Group Trial V, which compared short-duration chemotherapy versus long-duration chemotherapy in the treatment of node-positive breast cancer. The events studied are: (1) the end of treatment toxicity, (2) disease recurrence, and (3) overall survival. PMID:7981389
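
    In the unweighted Q-TWiST spirit, the quality function reduces to a utility-weighted sum of the mean times spent in each health state. The one-line sketch below uses illustrative utility weights (the paper's parametric version instead models the cause-specific hazards with covariates):

        import numpy as np

        def qtwist(tox, twist, rel, u_tox=0.5, u_rel=0.5):
            """Q-TWiST-style quality function: utility-weighted sum of mean
            durations in the TOX, TWiST, and REL health states.
            tox/twist/rel: per-patient durations; u_*: utilities in [0, 1]."""
            return u_tox * np.mean(tox) + np.mean(twist) + u_rel * np.mean(rel)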

  11. Phase-covariant quantum benchmarks

    SciTech Connect

    Calsamiglia, J.; Aspachs, M.; Munoz-Tapia, R.; Bagan, E.

    2009-05-15

    We give a quantum benchmark for teleportation and quantum storage experiments suited for pure and mixed test states. The benchmark is based on the average fidelity over a family of phase-covariant states and certifies that an experiment cannot be emulated by a classical setup, i.e., by a measure-and-prepare scheme. We give an analytical solution for qubits, which shows important differences with the standard state estimation approach, and compute the value of the benchmark for coherent and squeezed states, both pure and mixed.

  12. Nonparametric estimation receiver operating characteristic analysis for performance evaluation on combined detection and estimation tasks.

    PubMed

    Wunderlich, Adam; Goossens, Bart

    2014-10-01

    In an effort to generalize task-based assessment beyond traditional signal detection, there is a growing interest in performance evaluation for combined detection and estimation tasks, in which signal parameters, such as size, orientation, and contrast are unknown and must be estimated. One motivation for studying such tasks is their rich complexity, which offers potential advantages for imaging system optimization. To evaluate observer performance on combined detection and estimation tasks, Clarkson introduced the estimation receiver operating characteristic (EROC) curve and the area under the EROC curve as a summary figure of merit. This work provides practical tools for EROC analysis of experimental data. In particular, we propose nonparametric estimators for the EROC curve, the area under the EROC curve, and for the variance/covariance matrix of a vector of correlated EROC area estimates. In addition, we show that reliable confidence intervals can be obtained for EROC area, and we validate these intervals with Monte Carlo simulation. Application of our methodology is illustrated with an example comparing magnetic resonance imaging k-space sampling trajectories. MATLAB® software implementing the EROC analysis estimators described in this work is publicly available at http://code.google.com/p/iqmodelo/. PMID:26158044
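
    The EROC area estimator has the flavor of a utility-weighted Wilcoxon statistic: a signal-present case is credited only when its detection statistic exceeds that of a signal-absent case, weighted by the utility of the accompanying parameter estimate. The sketch below is written in that spirit only (ties ignored; consult the paper and the linked MATLAB code for the exact estimators):

        import numpy as np

        def eroc_area(lam_sig, util, lam_noise):
            """Two-sample U-statistic sketch of an EROC-area estimate.
            lam_sig:   (m,) detection statistics on signal-present cases
            util:      (m,) estimation utilities u(theta_hat_i, theta_i) in [0, 1]
            lam_noise: (n,) detection statistics on signal-absent cases"""
            wins = lam_sig[:, None] > lam_noise[None, :]   # (m, n) comparisons
            return float((util[:, None] * wins).mean())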

  13. Bayesian Nonparametric Estimation of Targeted Agent Effects on Biomarker Change to Predict Clinical Outcome

    PubMed Central

    Graziani, Rebecca; Guindani, Michele; Thall, Peter F.

    2015-01-01

    The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212

  14. Bayesian nonparametric estimation of targeted agent effects on biomarker change to predict clinical outcome.

    PubMed

    Graziani, Rebecca; Guindani, Michele; Thall, Peter F

    2015-03-01

    The effect of a targeted agent on a cancer patient's clinical outcome putatively is mediated through the agent's effect on one or more early biological events. This is motivated by pre-clinical experiments with cells or animals that identify such events, represented by binary or quantitative biomarkers. When evaluating targeted agents in humans, central questions are whether the distribution of a targeted biomarker changes following treatment, the nature and magnitude of this change, and whether it is associated with clinical outcome. Major difficulties in estimating these effects are that a biomarker's distribution may be complex, vary substantially between patients, and have complicated relationships with clinical outcomes. We present a probabilistically coherent framework for modeling and estimation in this setting, including a hierarchical Bayesian nonparametric mixture model for biomarkers that we use to define a functional profile of pre-versus-post-treatment biomarker distribution change. The functional is similar to the receiver operating characteristic used in diagnostic testing. The hierarchical model yields clusters of individual patient biomarker profile functionals, and we use the profile as a covariate in a regression model for clinical outcome. The methodology is illustrated by analysis of a dataset from a clinical trial in prostate cancer using imatinib to target platelet-derived growth factor, with the clinical aim to improve progression-free survival time. PMID:25319212

  15. Nonparametric estimation receiver operating characteristic analysis for performance evaluation on combined detection and estimation tasks

    PubMed Central

    Wunderlich, Adam; Goossens, Bart

    2014-01-01

    In an effort to generalize task-based assessment beyond traditional signal detection, there is a growing interest in performance evaluation for combined detection and estimation tasks, in which signal parameters, such as size, orientation, and contrast are unknown and must be estimated. One motivation for studying such tasks is their rich complexity, which offers potential advantages for imaging system optimization. To evaluate observer performance on combined detection and estimation tasks, Clarkson introduced the estimation receiver operating characteristic (EROC) curve and the area under the EROC curve as a summary figure of merit. This work provides practical tools for EROC analysis of experimental data. In particular, we propose nonparametric estimators for the EROC curve, the area under the EROC curve, and for the variance/covariance matrix of a vector of correlated EROC area estimates. In addition, we show that reliable confidence intervals can be obtained for EROC area, and we validate these intervals with Monte Carlo simulation. Application of our methodology is illustrated with an example comparing magnetic resonance imaging k-space sampling trajectories. MATLAB® software implementing the EROC analysis estimators described in this work is publicly available at http://code.google.com/p/iqmodelo/. PMID:26158044

  16. Nonlinear and multiresolution error covariance estimation in ensemble data assimilation

    NASA Astrophysics Data System (ADS)

    Rainwater, Sabrina

    Ensemble Kalman Filters perform data assimilation by forming a background covariance matrix from an ensemble forecast. The spread of the ensemble is intended to represent the algorithm's uncertainty about the state of the physical system that produces the data. Usually the ensemble members are evolved with the same model. The first part of my dissertation presents and tests a modified Local Ensemble Transform Kalman Filter (LETKF) that takes its background covariance from a combination of a high resolution ensemble and a low resolution ensemble. The computational time and the accuracy of this mixed-resolution LETKF are explored and compared to the standard LETKF on a high resolution ensemble, using simulated observation experiments with the Lorenz Models II and III (more complex versions of the Lorenz 96 model). The results show that, for the same computation time, mixed resolution ensemble analysis achieves higher accuracy than standard ensemble analysis. The second part of my dissertation demonstrates that it can be fruitful to rescale the ensemble spread prior to the forecast and then reverse this rescaling after the forecast. This technique, denoted "forecast spread adjustment", provides a tunable parameter that is complementary to covariance inflation, which cumulatively increases the ensemble spread to compensate for underestimation of uncertainty. As the adjustable parameter approaches zero, the filter approaches the Extended Kalman Filter when the ensemble size is sufficiently large. The improvement provided by forecast spread adjustment depends on ensemble size, observation error, and model error. The results indicate that it is most effective for smaller ensembles, smaller observation errors, and larger model error, though the effectiveness depends significantly on the type of model error.
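
    Two ingredients named above are easy to sketch: the sample background covariance with multiplicative inflation, and a rescaling of the ensemble anomalies about the mean (applied before the forecast and undone afterwards). An illustrative sketch, with hypothetical parameter names rho and inflation:

        import numpy as np

        def background_covariance(E, inflation=1.0):
            """Sample background covariance from an ensemble E of shape
            (n_state, n_members), with multiplicative covariance inflation."""
            A = E - E.mean(axis=1, keepdims=True)        # ensemble anomalies
            return inflation * (A @ A.T) / (E.shape[1] - 1)

        def rescale_spread(E, rho):
            """Forecast spread adjustment: scale anomalies about the mean by
            rho before the forecast (apply again with 1/rho to undo it)."""
            xbar = E.mean(axis=1, keepdims=True)
            return xbar + rho * (E - xbar)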

  17. On nonparametric comparison of images and regression surfaces

    PubMed Central

    Wang, Xiao-Feng; Ye, Deping

    2010-01-01

    Multivariate local regression is an important tool for image processing and analysis. In many practical biomedical problems, one is often interested in comparing a group of images or regression surfaces. In this paper, we extend the existing method of testing the equality of nonparametric curves by Dette and Neumeyer (2001) and consider a test statistic by means of an ℒ2-distance in the multi-dimensional case under a completely heteroscedastic nonparametric model. The test statistic is also extended to be used in the case of spatial correlated errors. Two bootstrap procedures are described in order to approximate the critical values of the test depending on the nature of random errors. The resulting algorithms and analyses are illustrated from both simulation studies and a real medical example. PMID:20543891

  18. A generalized framework for deriving nonparametric standardized drought indicators

    NASA Astrophysics Data System (ADS)

    Farahmand, Alireza; AghaKouchak, Amir

    2015-02-01

    This paper introduces the Standardized Drought Analysis Toolbox (SDAT) that offers a generalized framework for deriving nonparametric univariate and multivariate standardized indices. Current indicators suffer from deficiencies including temporal inconsistency and statistical incomparability. Different indicators have varying scales and ranges and their values cannot be compared with each other directly. Most drought indicators rely on a representative parametric probability distribution function that fits the data. However, a parametric distribution function may not fit the data, especially in continental/global scale studies. SDAT is based on a nonparametric framework that can be applied to different climatic variables including precipitation, soil moisture and relative humidity, without having to assume representative parametric distributions. The most attractive feature of the framework is that it leads to statistically consistent drought indicators based on different variables.
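
    The nonparametric construction underlying such indices typically maps empirical plotting positions through the inverse standard normal, avoiding any parametric fit. A minimal sketch along those lines, assuming the Gringorten plotting position (consult SDAT for the toolbox's exact choices):

        import numpy as np
        from scipy.stats import norm, rankdata

        def standardized_index(x):
            """Nonparametric standardized index: empirical (Gringorten)
            plotting positions mapped through the inverse standard normal,
            so any variable (precipitation, soil moisture, ...) lands on
            one comparable scale."""
            p = (rankdata(x) - 0.44) / (len(x) + 0.12)   # Gringorten probabilities
            return norm.ppf(p)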

  19. Nonparametric instrumental regression with non-convex constraints

    NASA Astrophysics Data System (ADS)

    Grasmair, M.; Scherzer, O.; Vanhems, A.

    2013-03-01

    This paper considers the nonparametric regression model with an additive error that is dependent on the explanatory variables. As is common in empirical studies in epidemiology and economics, it also supposes that valid instrumental variables are observed. A classical example in microeconomics considers the consumer demand function as a function of the price of goods and the income, both variables often considered as endogenous. In this framework, the economic theory also imposes shape restrictions on the demand function, such as integrability conditions. Motivated by this illustration in microeconomics, we study an estimator of a nonparametric constrained regression function using instrumental variables by means of Tikhonov regularization. We derive rates of convergence for the regularized model both in a deterministic and stochastic setting under the assumption that the true regression function satisfies a projected source condition including, because of the non-convexity of the imposed constraints, an additional smallness condition.
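
    Stripped of the shape constraints, Tikhonov regularization of a discretized ill-posed problem reduces to a ridge-type solve. The sketch below shows only this unconstrained core; the paper's contribution, the non-convex constraint set, is omitted:

        import numpy as np

        def tikhonov(A, b, alpha):
            """Tikhonov-regularized solution of the (ill-posed) linear system
            A x = b: minimize ||A x - b||^2 + alpha * ||x||^2."""
            n = A.shape[1]
            return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ b)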

  20. Nonparametric estimation of plant density by the distance method

    USGS Publications Warehouse

    Patil, S.A.; Burnham, K.P.; Kovner, J.L.

    1979-01-01

    A relation between the plant density and the probability density function of the nearest neighbor distance (squared) from a random point is established under fairly broad conditions. Based upon this relationship, a nonparametric estimator for the plant density is developed and presented in terms of order statistics. Consistency and asymptotic normality of the estimator are discussed. An interval estimator for the density is obtained. The modifications of this estimator and its variance are given when the distribution is truncated. Simulation results are presented for regular, random and aggregated populations to illustrate the nonparametric estimator and its variance. A numerical example from field data is given. Merits and deficiencies of the estimator are discussed with regard to its robustness and variance.

  1. A Bayesian Nonparametric Approach to Image Super-Resolution.

    PubMed

    Polatkan, Gungor; Zhou, Mingyuan; Carin, Lawrence; Blei, David; Daubechies, Ingrid

    2015-02-01

    Super-resolution methods form high-resolution images from low-resolution images. In this paper, we develop a new Bayesian nonparametric model for super-resolution. Our method uses a beta-Bernoulli process to learn a set of recurring visual patterns, called dictionary elements, from the data. Because it is nonparametric, the number of elements found is also determined from the data. We test the results on both benchmark and natural images, comparing with several other models from the research literature. We perform large-scale human evaluation experiments to assess the visual quality of the results. In a first implementation, we use Gibbs sampling to approximate the posterior. However, this algorithm is not feasible for large-scale data. To circumvent this, we then develop an online variational Bayes (VB) algorithm. This algorithm finds high quality dictionaries in a fraction of the time needed by the Gibbs sampler. PMID:26353246

  2. A Nonparametric Approach for Mapping Quantitative Trait Loci

    PubMed Central

    Kruglyak, L.; Lander, E. S.

    1995-01-01

    Genetic mapping of quantitative trait loci (QTLs) is typically performed by using a parametric approach, based on the assumption that the phenotype follows a normal distribution. Many traits of interest, however, are not normally distributed. In this paper, we present a nonparametric approach to QTL mapping applicable to any phenotypic distribution. The method is based on a statistic Z(w), which generalizes the nonparametric Wilcoxon rank-sum test to the situation of whole-genome search by interval mapping. We determine the appropriate significance level for the statistic Z(w) by showing that its asymptotic null distribution follows an Ornstein-Uhlenbeck process. These results provide a robust, distribution-free method for mapping QTLs. PMID:7768449
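
    The pointwise building block of such a scan is a Wilcoxon rank-sum comparison of phenotypes between genotype groups at each marker; the paper's contribution is the genome-wide significance threshold from the Ornstein-Uhlenbeck approximation, which this illustrative sketch does not include:

        import numpy as np
        from scipy.stats import ranksums

        def marker_scan(phenotype, genotypes):
            """Wilcoxon rank-sum scan across markers: at each marker, compare
            phenotype ranks between the two genotype groups (coded 0/1).
            genotypes: (n_individuals, n_markers) array of 0/1 codes."""
            stats = [ranksums(phenotype[g == 0], phenotype[g == 1]).statistic
                     for g in genotypes.T]
            return np.array(stats)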

  3. Supervised nonparametric sparse discriminant analysis for hyperspectral imagery classification

    NASA Astrophysics Data System (ADS)

    Wu, Longfei; Sun, Hao; Ji, Kefeng

    2016-03-01

    Owing to the high spectral sampling, the spectral information in hyperspectral imagery (HSI) is often highly correlated and contains redundancy. Motivated by the recent success of sparsity preserving based dimensionality reduction (DR) techniques in both the computer vision and remote sensing image analysis communities, a novel supervised nonparametric sparse discriminant analysis (NSDA) algorithm is presented for HSI classification. The objective function of NSDA aims at preserving the within-class sparse reconstructive relationship for within-class compactness characterization and maximizing the nonparametric between-class scatter simultaneously to enhance the discriminative ability of the features in the projected space. Essentially, it seeks the optimal projection matrix to identify the underlying discriminative manifold structure of a multiclass dataset. Experimental results on one visualization dataset and three recorded HSI datasets demonstrate that NSDA outperforms several state-of-the-art feature extraction methods for HSI classification.

  4. Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.

    PubMed

    Deshwar, Amit G; Vembu, Shankar; Morris, Quaid

    2015-01-01

    Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor… PMID:25592565

  5. Point matching based on non-parametric model

    NASA Astrophysics Data System (ADS)

    Liu, Renfeng; Zhang, Cong; Tian, Jinwen

    2015-12-01

    Establishing reliable feature correspondence between two images is a fundamental problem in vision analysis and it is a critical prerequisite in a wide range of applications including structure-from-motion, 3D reconstruction, tracking, image retrieval, registration, and object recognition. The feature could be point, line, curve or surface, among which the point feature is primary and is the foundation of all features. Numerous techniques related to point matching have been proposed within a rich and extensive literature, which are typically studied under rigid/affine or non-rigid motion, corresponding to parametric and non-parametric models for the underlying image relations. In this paper, we provide a review of our previous work on point matching, focusing on nonparametric models. We also make an experimental comparison of the introduced methods, and discuss their advantages and disadvantages as well.

  6. Optimum nonparametric estimation of population density based on ordered distances

    USGS Publications Warehouse

    Patil, S.A.; Kovner, J.L.; Burnham, Kenneth P.

    1982-01-01

    The asymptotic mean and error mean square are determined for the nonparametric estimator of plant density by distance sampling proposed by Patil, Burnham and Kovner (1979, Biometrics 35, 597-604). On the basis of these formulae, a bias-reduced version of this estimator is given, and its specific form is determined which gives minimum mean square error under varying assumptions about the true probability density function of the sampled data. Extension is given to line-transect sampling.

  7. Nonparametric estimation of Fisher information from real data

    NASA Astrophysics Data System (ADS)

    Har-Shemesh, Omri; Quax, Rick; Miñano, Borja; Hoekstra, Alfons G.; Sloot, Peter M. A.

    2016-02-01

    The Fisher information matrix (FIM) is a widely used measure for applications including statistical inference, information geometry, experiment design, and the study of criticality in biological systems. The FIM is defined for a parametric family of probability distributions and its estimation from data follows one of two paths: either the distribution is assumed to be known and the parameters are estimated from the data or the parameters are known and the distribution is estimated from the data. We consider the latter case which is applicable, for example, to experiments where the parameters are controlled by the experimenter and a complicated relation exists between the input parameters and the resulting distribution of the data. Since we assume that the distribution is unknown, we use a nonparametric density estimation on the data and then compute the FIM directly from that estimate using a finite-difference approximation to estimate the derivatives in its definition. The accuracy of the estimate depends on both the method of nonparametric estimation and the difference Δθ between the densities used in the finite-difference formula. We develop an approach for choosing the optimal parameter difference Δθ based on large deviations theory and compare two nonparametric density estimation methods, the Gaussian kernel density estimator and a novel density estimation using field theory method. We also compare these two methods to a recently published approach that circumvents the need for density estimation by estimating a nonparametric f-divergence and using it to approximate the FIM. We use the Fisher information of the normal distribution to validate our method and as a more involved example we compute the temperature component of the FIM in the two-dimensional Ising model and show that it obeys the expected relation to the heat capacity and therefore peaks at the phase transition at the correct critical temperature.

  8. Parameter inference with estimated covariance matrices

    NASA Astrophysics Data System (ADS)

    Sellentin, Elena; Heavens, Alan F.

    2016-02-01

    When inferring parameters from a Gaussian-distributed data set by computing a likelihood, a covariance matrix is needed that describes the data errors and their correlations. If the covariance matrix is not known a priori, it may be estimated and thereby becomes a random object with some intrinsic uncertainty itself. We show how to infer parameters in the presence of such an estimated covariance matrix, by marginalizing over the true covariance matrix, conditioned on its estimated value. This leads to a likelihood function that is no longer Gaussian, but rather an adapted version of a multivariate t-distribution, which has the same numerical complexity as the multivariate Gaussian. As expected, marginalization over the true covariance matrix improves inference when compared with Hartlap et al.'s method, which uses an unbiased estimate of the inverse covariance matrix but still assumes that the likelihood is Gaussian.
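
    Concretely, with χ² = rᵀC⁻¹r for residual r and a covariance C estimated from N_s simulations, the marginalized likelihood replaces the Gaussian kernel exp(−χ²/2) by a t-like factor (1 + χ²/(N_s − 1))^(−N_s/2). A sketch of the two log-likelihood kernels, with normalizations dropped and based on the functional form reported in the paper:

        import numpy as np

        def log_likelihoods(resid, C_hat, n_sims):
            """Gaussian vs. marginalized (multivariate-t-like) log-likelihood
            kernels for a covariance C_hat estimated from n_sims simulations.
            resid: data minus model, shape (p,)."""
            chi2 = resid @ np.linalg.solve(C_hat, resid)
            log_gauss = -0.5 * chi2
            log_t = -0.5 * n_sims * np.log1p(chi2 / (n_sims - 1.0))
            return log_gauss, log_t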

  9. A comparison of confounding adjustment methods with an application to early life determinants of childhood obesity.

    PubMed

    Li, L; Kleinman, K; Gillman, M W

    2014-12-01

    We implemented six confounding adjustment methods: (1) covariate-adjusted regression, (2) propensity score (PS) regression, (3) PS stratification, (4) PS matching with two calipers, (5) inverse probability weighting and (6) doubly robust estimation to examine the associations between the body mass index (BMI) z-score at 3 years and two separate dichotomous exposure measures: exclusive breastfeeding v. formula only (n=437) and cesarean section v. vaginal delivery (n=1236). Data were drawn from a prospective pre-birth cohort study, Project Viva. The goal is to demonstrate the necessity and usefulness of multiple confounding adjustment methods, and approaches for applying them, in the analysis of observational data. Unadjusted (univariate) and covariate-adjusted linear regression associations of breastfeeding with BMI z-score were -0.33 (95% CI -0.53, -0.13) and -0.24 (-0.46, -0.02), respectively. The other approaches resulted in smaller n (204-276) because of poor overlap of covariates, but CIs were of similar width except for inverse probability weighting (75% wider) and PS matching with a wider caliper (76% wider). Point estimates ranged widely, however, from -0.01 to -0.38. For cesarean section, because of better covariate overlap, the covariate-adjusted regression estimate (0.20) was remarkably robust to all adjustment methods, and the widths of the 95% CIs differed less than in the breastfeeding example. Choice of covariate adjustment method can matter. Lack of overlap in covariate structure between exposed and unexposed participants in observational studies can lead to erroneous covariate-adjusted estimates and confidence intervals. We recommend inspecting covariate overlap and using multiple confounding adjustment methods. Similar results bring reassurance. Contradictory results suggest issues with either the data or the analytic method. PMID:25171142
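
    As a concrete illustration of one of the listed methods, inverse probability weighting can be sketched in a few lines. Everything below (the simulated confounders, the effect size of 0.2 and the variable names) is invented for illustration; it is not the Project Viva analysis.

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 1000
    x = rng.normal(size=(n, 2))                          # confounders
    p_true = 1.0 / (1.0 + np.exp(-(x @ [0.8, -0.5])))    # true propensity
    a = rng.binomial(1, p_true)                          # binary exposure
    y = 0.2 * a + x @ [0.5, 0.3] + rng.normal(size=n)    # outcome, true effect 0.2

    ps = sm.Logit(a, sm.add_constant(x)).fit(disp=0).predict()  # estimated propensity
    w = a / ps + (1 - a) / (1 - ps)                      # inverse probability weights
    ate = (np.average(y[a == 1], weights=w[a == 1])
           - np.average(y[a == 0], weights=w[a == 0]))
    print(ate)  # approximately 0.2
    ```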

  11. COVARIANCE ASSISTED SCREENING AND ESTIMATION

    PubMed Central

    Ke, By Tracy; Jin, Jiashun; Fan, Jianqing

    2014-01-01

    Consider a linear model Y = X β + z, where X = Xn,p and z ~ N(0, In). The vector β is unknown and it is of interest to separate its nonzero coordinates from the zero ones (i.e., variable selection). Motivated by examples in long-memory time series (Fan and Yao, 2003) and the change-point problem (Bhattacharya, 1994), we are primarily interested in the case where the Gram matrix G = X′X is non-sparse but sparsifiable by a finite order linear filter. We focus on the regime where signals are both rare and weak so that successful variable selection is very challenging but is still possible. We approach this problem by a new procedure called the Covariance Assisted Screening and Estimation (CASE). CASE first uses a linear filtering to reduce the original setting to a new regression model where the corresponding Gram (covariance) matrix is sparse. The new covariance matrix induces a sparse graph, which guides us to conduct multivariate screening without visiting all the submodels. By interacting with the signal sparsity, the graph enables us to decompose the original problem into many separated small-size subproblems (if only we know where they are!). Linear filtering also induces a so-called problem of information leakage, which can be overcome by the newly introduced patching technique. Together, these give rise to CASE, which is a two-stage Screen and Clean (Fan and Song, 2010; Wasserman and Roeder, 2009) procedure, where we first identify candidates of these submodels by patching and screening, and then re-examine each candidate to remove false positives. For any procedure β̂ for variable selection, we measure the performance by the minimax Hamming distance between the sign vectors of β̂ and β. We show that in a broad class of situations where the Gram matrix is non-sparse but sparsifiable, CASE achieves the optimal rate of convergence. The results are successfully applied to long-memory time series and the change-point model. PMID:25541567

  12. Nonparametric hemodynamic deconvolution of FMRI using homomorphic filtering.

    PubMed

    Sreenivasan, Karthik Ramakrishnan; Havlicek, Martin; Deshpande, Gopikrishna

    2015-05-01

    Functional magnetic resonance imaging (fMRI) is an indirect measure of neural activity which is modeled as a convolution of the latent neuronal response and the hemodynamic response function (HRF). Since the sources of HRF variability can be nonneural in nature, the measured fMRI signal does not faithfully represent underlying neural activity. Therefore, it is advantageous to deconvolve the HRF from the fMRI signal. However, since both the latent neural activity and the voxel-specific HRF are unknown, the deconvolution must be blind. Existing blind deconvolution approaches employ highly parameterized models, and it is unclear whether these models have an overfitting problem. In order to address these issues, we 1) present a nonparametric deconvolution method based on homomorphic filtering to obtain the latent neuronal response from the fMRI signal and 2) compare our approach to the best performing existing parametric model, based on estimation of the biophysical hemodynamic model using the Cubature Kalman Filter/Smoother. We hypothesized that if the results from nonparametric deconvolution closely resembled those obtained from parametric deconvolution, then the problem of overfitting during estimation in highly parameterized deconvolution models of fMRI could possibly be overstated. Both simulations and experimental results support our hypothesis, since the estimated latent neural responses from the parametric and nonparametric methods were highly correlated in the visual cortex. Further, simulations showed that both methods were effective in recovering the simulated ground truth of the latent neural response. PMID:25531878
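
    For readers unfamiliar with homomorphic filtering, the generic idea is that convolution becomes addition in the log-spectral (cepstral) domain, so a smooth spectral envelope such as the HRF can be separated from the neuronal component by low-quefrency liftering. The following is a simplified, magnitude-only sketch with a hypothetical cutoff parameter, not the authors' pipeline:

    ```python
    import numpy as np

    def homomorphic_deconvolve(y, cutoff=10, eps=1e-12):
        """Crude blind deconvolution of y = h * s: estimate the smooth
        spectral envelope |H| from low quefrencies and divide it out."""
        Y = np.fft.rfft(y)
        ceps = np.fft.irfft(np.log(np.abs(Y) + eps), n=len(y))  # real cepstrum of |Y|
        lifter = np.zeros_like(ceps)
        lifter[:cutoff] = 1.0
        lifter[-(cutoff - 1):] = 1.0                     # keep symmetric low quefrencies
        H_mag = np.exp(np.fft.rfft(ceps * lifter).real)  # smooth envelope, roughly |H|
        return np.fft.irfft(Y / (H_mag + eps), n=len(y)) # zero-phase assumption
    ```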

  13. Nonparametric Analysis of Bivariate Gap Time with Competing Risks

    PubMed Central

    Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng

    2016-01-01

    Summary This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrent disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall’s tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article. PMID:26990686

  14. Covariance-enhanced discriminant analysis

    PubMed Central

    XU, PEIRONG; ZHU, JI; ZHU, LIXING; LI, YI

    2016-01-01

    Summary Linear discriminant analysis has been widely used to characterize or separate multiple classes via linear combinations of features. However, the high dimensionality of features from modern biological experiments defies traditional discriminant analysis techniques. Possible interfeature correlations present additional challenges and are often underused in modelling. In this paper, by incorporating possible interfeature correlations, we propose a covariance-enhanced discriminant analysis method that simultaneously and consistently selects informative features and identifies the corresponding discriminable classes. Under mild regularity conditions, we show that the method can achieve consistent parameter estimation and model selection, and can attain an asymptotically optimal misclassification rate. Extensive simulations have verified the utility of the method, which we apply to a renal transplantation trial.

  15. Nonparametric methods for microarray data based on exchangeability and borrowed power.

    PubMed

    Lee, Mei-Ling Ting; Whitmore, G A; Björkbacka, Harry; Freeman, Mason W

    2005-01-01

    This article proposes nonparametric inference procedures for analyzing microarray gene expression data that are reliable, robust, and simple to implement. They are conceptually transparent and require no special-purpose software. The analysis begins by normalizing gene expression data in a unique way. The resulting adjusted observations consist of gene-treatment interaction terms (representing differential expression) and error terms. The error terms are considered to be exchangeable, which is the only substantial assumption. Thus, under a family null hypothesis of no differential expression, the adjusted observations are exchangeable and all permutations of the observations are equally probable. The investigator may use the adjusted observations directly in a distribution-free test method or use their ranks in a rank-based method, where the ranking is taken over the whole data set. For the latter, the essential steps are as follows: (1) Calculate a Wilcoxon rank-sum difference or a corresponding Kruskal-Wallis rank statistic for each gene. (2) Randomly permute the observations and repeat the previous step. (3) Independently repeat the random permutation a suitable number of times. Under the exchangeability assumption, the permutation statistics are independent random draws from a null cumulative distribution function (c.d.f.) approximated by the empirical c.d.f. Reference to the empirical c.d.f. tells if the test statistic for a gene is outlying and, hence, shows differential expression. This feature is judged by using an appropriate rejection region or computing a p-value for each test statistic, taking into account multiple testing. The distribution-free analog of the rank-based approach is also available and has parallel steps which are described in the article. The proposed nonparametric analysis tends to give good results with no additional refinement, although a few refinements are presented that may interest some investigators. The implementation is…
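
    The rank-based steps (1)-(3) are easy to implement directly. The sketch below is a simplified per-gene variant with invented helper names; adj is the matrix of adjusted observations (genes by samples) and groups is a boolean label vector.

    ```python
    import numpy as np

    def perm_rank_pvalues(adj, groups, n_perm=2000, seed=0):
        """Per-gene permutation p-values from rank-sum statistics, with
        ranks taken over the whole data set as the article prescribes."""
        rng = np.random.default_rng(seed)
        ranks = adj.ravel().argsort().argsort().reshape(adj.shape) + 1
        obs = ranks[:, groups].sum(axis=1)          # rank-sum statistic per gene
        null = np.empty((n_perm, adj.shape[0]))
        for b in range(n_perm):
            perm = rng.permutation(groups)          # randomly permute sample labels
            null[b] = ranks[:, perm].sum(axis=1)
        center = null.mean(axis=0)
        # two-sided p-value from the empirical null distribution
        return (np.abs(null - center) >= np.abs(obs - center)).mean(axis=0)
    ```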

  16. Assessing Covariates of Adolescent Delinquency Trajectories: A Latent Growth Mixture Modeling Approach.

    ERIC Educational Resources Information Center

    Wiesner, Margit; Windle, Michael

    2004-01-01

    Using data from a community sample of 1218 boys and girls (mean age at the first wave was 15.5 years), this longitudinal study examined several covariates -- adjustment problems, poor academic achievement, negative life events, and unsupportive family environments -- of distinctive trajectories of juvenile delinquency. Latent growth mixture…

  17. Quality Quantification of Evaluated Cross Section Covariances

    SciTech Connect

    Varet, S.; Dossantos-Uzarralde, P.

    2015-01-15

    Presently, several methods are used to estimate the covariance matrix of evaluated nuclear cross sections. Because the resulting covariance matrices can be different according to the method used and according to the assumptions of the method, we propose a general and objective approach to quantify the quality of the covariance estimation for evaluated cross sections. The first step consists in defining an objective criterion. The second step is computation of the criterion. In this paper the Kullback-Leibler distance is proposed for the quality quantification of a covariance matrix estimation and its inverse. It is based on the distance to the true covariance matrix. A method based on the bootstrap is presented for the estimation of this criterion, which can be applied with most methods for covariance matrix estimation and without the knowledge of the true covariance matrix. The full approach is illustrated on the ⁸⁵Rb nucleus evaluations and the results are then used for a discussion on scoring and Monte Carlo approaches for covariance matrix estimation of the cross section evaluations.
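
    For two zero-mean Gaussian models that differ only in their covariances, the Kullback-Leibler distance has a closed form, which makes such a criterion cheap to evaluate once the matrices are in hand. A small sketch in our own notation (whether the evaluation work uses exactly this Gaussian form is an assumption):

    ```python
    import numpy as np

    def kl_gaussian_cov(S1, S2):
        """KL( N(0, S1) || N(0, S2) ) for covariance matrices S1 and S2."""
        p = S1.shape[0]
        ratio = np.linalg.solve(S2, S1)             # S2^{-1} S1
        return 0.5 * (np.trace(ratio) - p
                      + np.linalg.slogdet(S2)[1] - np.linalg.slogdet(S1)[1])
    ```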

  18. REGRESSION METHODS FOR DATA WITH INCOMPLETE COVARIATES

    EPA Science Inventory

    Modern statistical methods in chronic disease epidemiology allow simultaneous regression of disease status on several covariates. These methods permit examination of the effects of one covariate while controlling for those of others that may be causally related to the disease. However…

  19. Particle emission from covariant phase space

    SciTech Connect

    Bambah, B.A.

    1992-12-01

    Using Lorentz-covariant sources, we calculate the multiplicity distribution of n pair-correlated particles emerging from a Lorentz-covariant phase-space volume. We use the Kim-Wigner formalism and identify these sources as the squeezed states of a relativistic harmonic oscillator. The application of this to multiplicity distributions in particle physics is discussed.

  20. Group Theory of Covariant Harmonic Oscillators

    ERIC Educational Resources Information Center

    Kim, Y. S.; Noz, Marilyn E.

    1978-01-01

    A simple and concrete example for illustrating the properties of noncompact groups is presented. The example is based on the covariant harmonic-oscillator formalism in which the relativistic wave functions carry a covariant-probability interpretation. This can be used in a group theory course for graduate students who have some background in…

  1. To adjust or not to adjust for baseline when analyzing repeated binary responses? The case of complete data when treatment comparison at study end is of interest.

    PubMed

    Jiang, Honghua; Kulkarni, Pandurang M; Mallinckrodt, Craig H; Shurzinske, Linda; Molenberghs, Geert; Lipkovich, Ilya

    2015-01-01

    The benefits of adjusting for baseline covariates are not as straightforward with repeated binary responses as with continuous response variables. Therefore, in this study, we compared different methods for analyzing repeated binary data through simulations when the outcome at the study endpoint is of interest. Methods compared included chi-square, Fisher's exact test, covariate adjusted/unadjusted logistic regression (Adj.logit/Unadj.logit), covariate adjusted/unadjusted generalized estimating equations (Adj.GEE/Unadj.GEE), and covariate adjusted/unadjusted generalized linear mixed models (Adj.GLMM/Unadj.GLMM). All these methods preserved the type I error close to the nominal level. Covariate adjusted methods improved power compared with the unadjusted methods because of the increased treatment effect estimates, especially when the correlation between the baseline and outcome was strong, even though there was an apparent increase in standard errors. Results of the chi-square test were identical to those for the unadjusted logistic regression. Fisher's exact test was the most conservative test regarding the type I error rate and also had the lowest power. Without missing data, there was no gain in using a repeated measures approach over a simple logistic regression at the final time point. Analysis of results from five phase III diabetes trials of the same compound was consistent with the simulation findings. Therefore, covariate adjusted analysis is recommended for repeated binary data when the study endpoint is of interest. PMID:25866149
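
    The power gain from baseline adjustment is easy to reproduce in a toy endpoint analysis. The simulation below, with all data-generating parameters invented for illustration, compares unadjusted and covariate-adjusted logistic regression:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 400
    base = rng.normal(size=n)                       # baseline covariate
    trt = rng.binomial(1, 0.5, n)                   # randomized treatment
    logit = -0.3 + 0.5 * trt + 1.2 * base           # strong baseline-outcome correlation
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))  # binary endpoint

    unadj = sm.Logit(y, sm.add_constant(trt)).fit(disp=0)
    adj = sm.Logit(y, sm.add_constant(np.column_stack([trt, base]))).fit(disp=0)
    # Adjustment typically enlarges the treatment effect estimate and its z-statistic.
    print(unadj.params[1], adj.params[1])
    ```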

  2. Adjoints and Low-rank Covariance Representation

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; Cohn, Stephen E.

    2000-01-01

    Quantitative measures of the uncertainty of Earth System estimates can be as important as the estimates themselves. Second moments of estimation errors are described by the covariance matrix, whose direct calculation is impractical when the number of degrees of freedom of the system state is large. Ensemble and reduced-state approaches to prediction and data assimilation replace full estimation error covariance matrices by low-rank approximations. The appropriateness of such approximations depends on the spectrum of the full error covariance matrix, whose calculation is also often impractical. Here we examine the situation where the error covariance is a linear transformation of a forcing error covariance. We use operator norms and adjoints to relate the appropriateness of low-rank representations to the conditioning of this transformation. The analysis is used to investigate low-rank representations of the steady-state response to random forcing of an idealized discrete-time dynamical system.
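
    A minimal numerical illustration of such a steady-state covariance and its spectrum, using an invented stable discrete-time system and SciPy's discrete Lyapunov solver, might look like this:

    ```python
    import numpy as np
    from scipy.linalg import solve_discrete_lyapunov

    rng = np.random.default_rng(3)
    n = 50
    A = 0.95 * np.linalg.qr(rng.normal(size=(n, n)))[0]  # stable dynamics, spectral radius 0.95
    F = rng.normal(size=(n, 2))
    Q = F @ F.T                                          # rank-2 forcing error covariance
    P = solve_discrete_lyapunov(A, Q)                    # steady state: P = A P A^T + Q
    evals = np.linalg.eigvalsh(P)[::-1]
    print(evals[:5] / evals.sum())  # variance fraction captured by the leading modes
    ```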

  3. Combined Use of Integral Experiments and Covariance Data

    NASA Astrophysics Data System (ADS)

    Palmiotti, G.; Salvatores, M.; Aliberti, G.; Herman, M.; Hoblit, S. D.; McKnight, R. D.; Obložinský, P.; Talou, P.; Hale, G. M.; Hiruta, H.; Kawano, T.; Mattoon, C. M.; Nobre, G. P. A.; Palumbo, A.; Pigni, M.; Rising, M. E.; Yang, W.-S.; Kahler, A. C.

    2014-04-01

    In the frame of a US-DOE sponsored project, ANL, BNL, INL and LANL have performed a joint multidisciplinary research activity exploring the combined use of integral experiments and covariance data, with the objective of giving quantitative indications on possible improvements of the ENDF evaluated data files and, at the same time, reducing crucial reactor design parameter uncertainties. Methods that have been developed in the last four decades for the purposes indicated above have been improved by new developments that also benefited from continuous exchanges with international groups working in similar areas. The major new developments that allowed significant progress are to be found in several specific domains: a) new science-based covariance data; b) integral experiment covariance data assessment and improved experiment analysis, e.g., of sample irradiation experiments; c) sensitivity analysis, where several improvements were necessary despite the generally good understanding of these techniques, e.g., to account for fission spectrum sensitivity; d) a critical approach to the analysis of statistical adjustment performance, both a priori and a posteriori; e) generalization of the assimilation method, now applied for the first time not only to multigroup cross section data but also to nuclear model parameters (the "consistent" method). This article describes the major results obtained in each of these areas; a large-scale nuclear data adjustment, based on the use of approximately one hundred high-accuracy integral experiments, is reported along with a significant example of the application of the new "consistent" method of data assimilation.
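
    The statistical adjustment step at the core of such studies is a generalized linear least-squares (GLLS) update of the prior data by integral-experiment discrepancies. A generic sketch, not the project's production codes, with sensitivity matrix S, prior covariance M and experiment covariance V:

    ```python
    import numpy as np

    def glls_adjust(p, M, S, d, V):
        """Generalized linear least-squares data adjustment.
        p: prior parameters; M: prior covariance; S: sensitivities dC/dp;
        d: discrepancies E - C(p); V: integral-experiment covariance."""
        K = M @ S.T @ np.linalg.inv(S @ M @ S.T + V)  # gain matrix
        p_post = p + K @ d                            # adjusted parameters
        M_post = M - K @ S @ M                        # reduced posterior covariance
        return p_post, M_post
    ```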

  4. A Nonparametric Bayesian Approach For Emission Tomography Reconstruction

    NASA Astrophysics Data System (ADS)

    Barat, Éric; Dautremer, Thomas

    2007-11-01

    We introduce a PET reconstruction algorithm following a nonparametric Bayesian (NPB) approach. In contrast with Expectation Maximization (EM), the proposed technique does not rely on any space discretization. Namely, the activity distribution—normalized emission intensity of the spatial Poisson process—is considered as a spatial probability density, and observations are the projections of random emissions whose distribution has to be estimated. This approach is nonparametric in the sense that the quantity of interest belongs to the set of probability measures on R^k (for reconstruction in k dimensions), and it is Bayesian in the sense that we define a prior directly on this spatial measure. In this context, we propose to model the nonparametric probability density as an infinite mixture of multivariate normal distributions. As a prior for this mixture we consider a Dirichlet Process Mixture (DPM) with a Normal-Inverse Wishart (NIW) model as base distribution of the Dirichlet Process. As in EM-family reconstruction, we use a data augmentation scheme where the set of hidden variables are the emission locations for each observed line of response in the continuous object space. Thanks to the data augmentation, we propose a Markov Chain Monte Carlo (MCMC) algorithm (Gibbs sampler) which is able to generate draws from the posterior distribution of the spatial intensity. A difference with EM is that one step of the Gibbs sampler corresponds to the generation of emission locations, while only the expected number of emissions per pixel/voxel is used in EM. Another key difference is that the estimated spatial intensity is a continuous function, such that there is no need to compute a projection matrix. Finally, draws from the intensity posterior distribution allow the estimation of posterior functionals such as the variance or confidence intervals. Results are presented for simulated data based on a 2D brain phantom and compared to Bayesian MAP-EM.
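
    The DPM prior used here can be sampled directly through its stick-breaking representation. The toy sketch below draws from a truncated one-dimensional DP mixture of normals with an invented base measure, rather than the full Normal-Inverse Wishart model of the paper:

    ```python
    import numpy as np

    def sample_dp_mixture(alpha=1.0, truncation=50, n=1000, seed=4):
        """Draw n points from one realization of a truncated stick-breaking
        DP mixture of N(mu_k, 0.1^2), with mu_k ~ N(0, 3^2) as base measure."""
        rng = np.random.default_rng(seed)
        beta = rng.beta(1.0, alpha, truncation)                # stick fractions
        w = beta * np.concatenate([[1.0], np.cumprod(1.0 - beta)[:-1]])
        w /= w.sum()                                           # renormalize after truncation
        mu = rng.normal(0.0, 3.0, truncation)                  # atom locations
        z = rng.choice(truncation, size=n, p=w)                # mixture component labels
        return rng.normal(mu[z], 0.1)
    ```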

  5. Generalized additive models with interval-censored data and time-varying covariates: application to human immunodeficiency virus infection in hemophiliacs.

    PubMed

    Bacchetti, Peter; Quale, Christopher

    2002-06-01

    We describe a method for extending smooth nonparametric modeling methods to time-to-event data where the event may be known only to lie within a window of time. Maximum penalized likelihood is used to fit a discrete proportional hazards model that also models the baseline hazard, and left-truncation and time-varying covariates are accommodated. The implementation follows generalized additive modeling conventions, allowing both parametric and smooth terms and specifying the amount of smoothness in terms of the effective degrees of freedom. We illustrate the method on a well-known interval-censored data set on time of human immunodeficiency virus infection in a multicenter study of hemophiliacs. The ability to examine time-varying covariates, not available with previous methods, allows detection and modeling of nonproportional hazards and use of a time-varying covariate that fits the data better and is more plausible than a fixed alternative. PMID:12071419

  6. Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure

    PubMed Central

    Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas

    2015-01-01

    Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
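
    The divergence can be estimated without any density estimation via the Friedman-Rafsky construction: build a Euclidean minimal spanning tree over the pooled samples and count the edges that join the two classes. A sketch, with the normalization written as commonly stated for this estimator (worth verifying against the paper):

    ```python
    import numpy as np
    from scipy.spatial import distance_matrix
    from scipy.sparse.csgraph import minimum_spanning_tree

    def dp_divergence(X, Y):
        """MST-based (Friedman-Rafsky) divergence estimate between two samples."""
        Z = np.vstack([X, Y])
        labels = np.concatenate([np.zeros(len(X)), np.ones(len(Y))])
        mst = minimum_spanning_tree(distance_matrix(Z, Z))
        i, j = mst.nonzero()                        # endpoints of MST edges
        C = np.sum(labels[i] != labels[j])          # edges crossing the two classes
        m, n = len(X), len(Y)
        return max(0.0, 1.0 - C * (m + n) / (2.0 * m * n))
    ```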

  7. Computation of nonparametric convex hazard estimators via profile methods

    PubMed Central

    Jankowski, Hanna K.; Wellner, Jon A.

    2010-01-01

    This paper proposes a profile likelihood algorithm to compute the nonparametric maximum likelihood estimator of a convex hazard function. The maximisation is performed in two steps: First the support reduction algorithm is used to maximise the likelihood over all hazard functions with a given point of minimum (or antimode). Then it is shown that the profile (or partially maximised) likelihood is quasi-concave as a function of the antimode, so that a bisection algorithm can be applied to find the maximum of the profile likelihood, and hence also the global maximum. The new algorithm is illustrated using both artificial and real data, including lifetime data for Canadian males and females. PMID:20300560

  8. Nonparametric maximum likelihood estimation for the multisample Wicksell corpuscle problem

    PubMed Central

    Chan, Kwun Chuen Gary; Qin, Jing

    2016-01-01

    We study nonparametric maximum likelihood estimation for the distribution of spherical radii using samples containing a mixture of one-dimensional, two-dimensional biased and three-dimensional unbiased observations. Since direct maximization of the likelihood function is intractable, we propose an expectation-maximization algorithm for implementing the estimator, which handles an indirect measurement problem and a sampling bias problem separately in the E- and M-steps, and circumvents the need to solve an Abel-type integral equation, which creates numerical instability in the one-sample problem. Extensions to ellipsoids are studied and connections to multiplicative censoring are discussed. PMID:27279657

  9. Approach to nonparametric cooperative multiband segmentation with adaptive threshold.

    PubMed

    Sebari, Imane; He, Dong-Chen

    2009-07-10

    We present a new nonparametric cooperative approach to multiband image segmentation. It is based on cooperation between region-growing segmentation and edge segmentation. This approach requires no input data other than the images to be processed. It uses a spectral homogeneity criterion whose threshold is determined automatically. The threshold is adaptive and varies depending on the objects to be segmented. Applying this new approach to very high resolution satellite imagery has yielded satisfactory results. The approach demonstrated its performance on images of varied complexity and was able to detect objects of great spatial and spectral heterogeneity. PMID:19593349

  10. Assessment of water quality trends in the Minnesota River using non-parametric and parametric methods

    USGS Publications Warehouse

    Johnson, H.O.; Gupta, S.C.; Vecchia, A.V.; Zvomuya, F.

    2009-01-01

    Excessive loading of sediment and nutrients to rivers is a major problem in many parts of the United States. In this study, we tested the non-parametric Seasonal Kendall (SEAKEN) trend model and the parametric USGS Quality of Water trend program (QWTREND) to quantify trends in water quality of the Minnesota River at Fort Snelling from 1976 to 2003. Both methods indicated decreasing trends in flow-adjusted concentrations of total suspended solids (TSS), total phosphorus (TP), and orthophosphorus (OP) and a generally increasing trend in flow-adjusted nitrate plus nitrite-nitrogen (NO3-N) concentration. The SEAKEN results were strongly influenced by the length of the record as well as extreme years (dry or wet) earlier in the record. The QWTREND results, though influenced somewhat by the same factors, were more stable. The magnitudes of trends between the two methods were somewhat different and appeared to be associated with conceptual differences between the flow-adjustment processes used and with data processing methods. The decreasing trends in TSS, TP, and OP concentrations are likely related to conservation measures implemented in the basin. However, dilution effects from wet climate or additional tile drainage cannot be ruled out. The increasing trend in NO3-N concentrations was likely due to increased drainage in the basin. Since the Minnesota River is the main source of sediments to the Mississippi River, this study also addressed the rapid filling of Lake Pepin on the Mississippi River and found the likely cause to be increased flow due to recent wet climate in the region. Copyright © 2009 by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America. All rights reserved.
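
    The non-parametric half of that comparison is simple to compute: the Seasonal Kendall statistic sums the Mann-Kendall S statistic over seasons and standardizes it by the null variance. A minimal sketch that ignores ties and serial correlation (both of which the SEAKEN model handles):

    ```python
    import numpy as np

    def seasonal_kendall(x, seasons):
        """Seasonal Kendall trend test. x: observations in time order;
        seasons: season index (e.g. month) of each observation."""
        S, var = 0.0, 0.0
        for s in np.unique(seasons):
            v = x[seasons == s]
            n = len(v)
            S += sum(np.sign(v[j] - v[i])
                     for i in range(n) for j in range(i + 1, n))
            var += n * (n - 1) * (2 * n + 5) / 18.0   # null variance, no ties
        z = 0.0 if S == 0 else (S - np.sign(S)) / np.sqrt(var)  # continuity-corrected Z
        return S, z
    ```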

  11. Doubly robust and efficient estimators for heteroscedastic partially linear single-index models allowing high dimensional covariates

    PubMed Central

    Ma, Yanyuan; Zhu, Liping

    2013-01-01

    Summary We study the heteroscedastic partially linear single-index model with an unspecified error variance function, which allows for high dimensional covariates in both the linear and the single-index components of the mean function. We propose a class of consistent estimators of the parameters by using a proper weighting strategy. An interesting finding is that the linearity condition which is widely assumed in the dimension reduction literature is not necessary for methodological or theoretical development: it contributes only to the simplification of non-optimal consistent estimation. We also find that the performance of the usual weighted least square type of estimators deteriorates when the non-parametric component is badly estimated. However, estimators in our family automatically provide protection against such deterioration, in that the consistency can be achieved even if the baseline non-parametric function is completely misspecified. We further show that the most efficient estimator is a member of this family and can be easily obtained by using non-parametric estimation. Properties of the estimators proposed are presented through theoretical illustration and numerical simulations. An example on gender discrimination is used to demonstrate and to compare the practical performance of the estimators. PMID:23970823

  12. Estimation of Data Uncertainty Adjustment Parameters for Multivariate Earth Rotation Series

    NASA Technical Reports Server (NTRS)

    Sung, Li-yu; Steppe, J. Alan

    1994-01-01

    We have developed a maximum likelihood method to estimate a set of data uncertainty adjustment parameters, including scaling factors and additive variances and covariances, for multivariate Earth rotation series.

  13. A fuzzy, nonparametric segmentation framework for DTI and MRI analysis.

    PubMed

    Awate, Suyash P; Gee, James C

    2007-01-01

    This paper presents a novel statistical fuzzy-segmentation method for diffusion tensor (DT) images and magnetic resonance (MR) images. Typical fuzzy-segmentation schemes, e.g. those based on fuzzy-C-means (FCM), incorporate Gaussian class models which are inherently biased towards ellipsoidal clusters. Fiber bundles in DT images, however, comprise tensors that can inherently lie on more-complex manifolds. Unlike FCM-based schemes, the proposed method relies on modeling the manifolds underlying the classes by incorporating nonparametric data-driven statistical models. It produces an optimal fuzzy segmentation by maximizing a novel information-theoretic energy in a Markov-random-field framework. For DT images, the paper describes a consistent statistical technique for nonparametric modeling in Riemannian DT spaces that incorporates two very recent works. In this way, the proposed method provides uncertainties in the segmentation decisions, which stem from imaging artifacts including noise, partial voluming, and inhomogeneity. The paper shows results on synthetic and real, DT as well as MR images. PMID:17633708

  14. Nonparametric meta-analysis for diagnostic accuracy studies.

    PubMed

    Zapf, Antonia; Hoyer, Annika; Kramer, Katharina; Kuss, Oliver

    2015-12-20

    Summarizing the information of many studies using a meta-analysis is becoming more and more important, also in the field of diagnostic studies. The special challenge in meta-analysis of diagnostic accuracy studies is that, in general, sensitivity and specificity are co-primary endpoints. Across the studies both endpoints are correlated, and this correlation has to be considered in the analysis. The standard approach for such a meta-analysis is the bivariate logistic random effects model. An alternative approach is to use marginal beta-binomial distributions for the true positives and the true negatives, linked by copula distributions. In this article, we propose a new, nonparametric approach to analysis, which has greater flexibility with respect to the correlation structure and always converges. In a simulation study, it becomes apparent that the empirical coverage of all three approaches is in general below the nominal level. Regarding bias, empirical coverage, and mean squared error, the nonparametric model is often superior to the standard model, and comparable with the copula model. The three approaches are also applied to two example meta-analyses. PMID:26174020

  15. Covariation bias in panic-prone individuals.

    PubMed

    Pauli, P; Montoya, P; Martz, G E

    1996-11-01

    Covariation estimates between fear-relevant (FR; emergency situations) or fear-irrelevant (FI; mushrooms and nudes) stimuli and an aversive outcome (electrical shock) were examined in 10 high-fear (panic-prone) and 10 low-fear respondents. When the relation between slide category and outcome was random (illusory correlation), only high-fear participants markedly overestimated the contingency between FR slides and shocks. However, when there was a high contingency of shocks following FR stimuli (83%) and a low contingency of shocks following FI stimuli (17%), the group difference vanished. Reversal of contingencies back to random induced a covariation bias for FR slides in high- and low-fear respondents. Results indicate that panic-prone respondents show a covariation bias for FR stimuli and that the experience of a high contingency between FR slides and aversive outcomes may foster such a covariation bias even in low-fear respondents. PMID:8952200

  16. Reconciling Covariances with Reliable Orbital Uncertainty

    NASA Astrophysics Data System (ADS)

    Folcik, Z.; Lue, A.; Vatsky, J.

    2011-09-01

    There is a common suspicion that formal covariances do not represent a realistic measure of orbital uncertainties. By devising metrics for measuring the representations of orbit error, we assess under what circumstances such lore is justified as well as the root cause of the discrepancy between the mathematics of orbital uncertainty and its practical implementation. We offer a scheme by which formal covariances may be adapted to be an accurate measure of orbital uncertainties and show how that adaptation performs against both simulated and real space-object data. We also apply these covariance adaptation methods to the process of observation association using many simulated and real data test cases. We demonstrate that covariance-informed observation association can be reliable, even in the case when only two tracks are available. Satellite breakup and collision event catalog maintenance could benefit from the automation made possible with these association methods.

  17. Phase-covariant quantum cloning of qudits

    SciTech Connect

    Fan Heng; Imai, Hiroshi; Matsumoto, Keiji; Wang, Xiang-Bin

    2003-02-01

    We study the phase-covariant quantum cloning machine for qudits, i.e., the input states in a d-level quantum system have complex coefficients with arbitrary phase but constant modulus. A cloning unitary transformation is proposed. After optimizing the fidelity between the input state and the single-qudit reduced density operator of the output state, we obtain the optimal fidelity for 1-to-2 phase-covariant quantum cloning of qudits and the corresponding cloning transformation.

  18. Noncommutative Gauge Theory with Covariant Star Product

    SciTech Connect

    Zet, G.

    2010-08-04

    We present a noncommutative gauge theory with covariant star product on a space-time with torsion. In order to obtain the covariant star product one imposes some restrictions on the connection of the space-time. Then, a noncommutative gauge theory is developed applying this product to the case of differential forms. Some comments on the advantages of using a space-time with torsion to describe the gravitational field are also given.

  19. Covariant action for type IIB supergravity

    NASA Astrophysics Data System (ADS)

    Sen, Ashoke

    2016-07-01

    Taking clues from the recent construction of the covariant action for type II and heterotic string field theories, we construct a manifestly Lorentz covariant action for type IIB supergravity, and discuss its gauge fixing maintaining manifest Lorentz invariance. The action contains a (non-gravitating) free 4-form field besides the usual fields of type IIB supergravity. This free field, being completely decoupled from the interacting sector, has no physical consequence.

  20. Covariate analysis of bivariate survival data

    SciTech Connect

    Bennett, L.E.

    1992-01-01

    The methods developed are used to analyze the effects of covariates on bivariate survival data when censoring and ties are present. The proposed method provides models for bivariate survival data that include differential covariate effects and censored observations. The proposed models are based on an extension of the univariate Buckley-James estimators which replace censored data points by their expected values, conditional on the censoring time and the covariates. For the bivariate situation, it is necessary to determine the expectation of the failure times for one component conditional on the failure or censoring time of the other component. Two different methods have been developed to estimate these expectations. In the semiparametric approach these expectations are determined from a modification of Burke's estimate of the bivariate empirical survival function. In the parametric approach censored data points are also replaced by their conditional expected values where the expected values are determined from a specified parametric distribution. The model estimation will be based on the revised data set, comprised of uncensored components and expected values for the censored components. The variance-covariance matrix for the estimated covariate parameters has also been derived for both the semiparametric and parametric methods. Data from the Demographic and Health Survey was analyzed by these methods. The two outcome variables are post-partum amenorrhea and breastfeeding; education and parity were used as the covariates. Both the covariate parameter estimates and the variance-covariance estimates for the semiparametric and parametric models will be compared. In addition, a multivariate test statistic was used in the semiparametric model to examine contrasts. The significance of the statistic was determined from a bootstrap distribution of the test statistic.

  1. Low-dimensional Representation of Error Covariance

    NASA Technical Reports Server (NTRS)

    Tippett, Michael K.; Cohn, Stephen E.; Todling, Ricardo; Marchesin, Dan

    2000-01-01

    Ensemble and reduced-rank approaches to prediction and assimilation rely on low-dimensional approximations of the estimation error covariances. Here stability properties of the forecast/analysis cycle for linear, time-independent systems are used to identify factors that cause the steady-state analysis error covariance to admit a low-dimensional representation. A useful measure of forecast/analysis cycle stability is the bound matrix, a function of the dynamics, observation operator and assimilation method. Upper and lower estimates for the steady-state analysis error covariance matrix eigenvalues are derived from the bound matrix. The estimates generalize to time-dependent systems. If much of the steady-state analysis error variance is due to a few dominant modes, the leading eigenvectors of the bound matrix approximate those of the steady-state analysis error covariance matrix. The analytical results are illustrated in two numerical examples where the Kalman filter is carried to steady state. The first example uses the dynamics of a generalized advection equation exhibiting nonmodal transient growth. Failure to observe growing modes leads to increased steady-state analysis error variances. Leading eigenvectors of the steady-state analysis error covariance matrix are well approximated by leading eigenvectors of the bound matrix. The second example uses the dynamics of a damped baroclinic wave model. The leading eigenvectors of a lowest-order approximation of the bound matrix are shown to approximate well the leading eigenvectors of the steady-state analysis error covariance matrix.

  2. On the Use of Nonparametric Item Characteristic Curve Estimation Techniques for Checking Parametric Model Fit

    ERIC Educational Resources Information Center

    Lee, Young-Sun; Wollack, James A.; Douglas, Jeffrey

    2009-01-01

    The purpose of this study was to assess the model fit of a 2PL through comparison with the nonparametric item characteristic curve (ICC) estimation procedures. Results indicate that three nonparametric procedures implemented produced ICCs that are similar to that of the 2PL for items simulated to fit the 2PL. However for misfitting items,…

  3. Nonparametric Estimation of Item and Respondent Locations from Unfolding-Type Items

    ERIC Educational Resources Information Center

    Johnson, Matthew S.

    2006-01-01

    Unlike their monotone counterparts, nonparametric unfolding response models, which assume the item response function is unimodal, have seen little attention in the psychometric literature. This paper studies the nonparametric behavior of unfolding models by building on the work of Post (1992). The paper provides rigorous justification for a class…

  4. The non-parametric Parzen's window in stereo vision matching.

    PubMed

    Pajares, G; de la Cruz, J

    2002-01-01

    This paper presents an approach to the local stereo vision matching problem using edge segments as features with four attributes. From these attributes we compute a matching probability between pairs of features of the stereo images. A correspondence is said to be true when such a probability is maximum. We introduce a nonparametric strategy based on Parzen's window (1962) to estimate a probability density function (PDF) which is used to obtain the matching probability. This is the main finding of the paper. A comparative analysis of other recent matching methods is included to show that this finding can be justified theoretically. A generalization of the proposed method is made in order to give guidelines about its use with the similarity constraint and also in different environments where other features and attributes are more suitable. PMID:18238122

  5. Analyzing multiple spike trains with nonparametric Granger causality.

    PubMed

    Nedungadi, Aatira G; Rangarajan, Govindan; Jain, Neeraj; Ding, Mingzhou

    2009-08-01

    Simultaneous recordings of spike trains from multiple single neurons are becoming commonplace. Understanding the interaction patterns among these spike trains remains a key research area. A question of interest is the evaluation of information flow between neurons through the analysis of whether one spike train exerts causal influence on another. For continuous-valued time series data, Granger causality has proven an effective method for this purpose. However, the basis for Granger causality estimation is autoregressive data modeling, which is not directly applicable to spike trains. Various filtering options distort the properties of spike trains as point processes. Here we propose a new nonparametric approach to estimate Granger causality directly from the Fourier transforms of spike train data. We validate the method on synthetic spike trains generated by model networks of neurons with known connectivity patterns and then apply it to neurons simultaneously recorded from the thalamus and the primary somatosensory cortex of a squirrel monkey undergoing tactile stimulation. PMID:19137420

  6. Fast Nonparametric Clustering of Structured Time-Series.

    PubMed

    Hensman, James; Rattray, Magnus; Lawrence, Neil D

    2015-02-01

    In this publication, we combine two Bayesian nonparametric models: the Gaussian Process (GP) and the Dirichlet Process (DP). Our innovation in the GP model is to introduce a variation on the GP prior which enables us to model structured time-series data, i.e., data containing groups where we wish to model inter- and intra-group variability. Our innovation in the DP model is an implementation of a new fast collapsed variational inference procedure which enables us to optimize our variational approximation significantly faster than standard VB approaches. In a biological time series application we show how our model better captures salient features of the data, leading to better consistency with existing biological classifications, while the associated inference algorithm provides a significant speed-up over EM-based variational inference. PMID:26353249

  7. Gradient-based manipulation of nonparametric entropy estimates.

    PubMed

    Schraudolph, Nicol N

    2004-07-01

    This paper derives a family of differential learning rules that optimize the Shannon entropy at the output of an adaptive system via kernel density estimation. In contrast to parametric formulations of entropy, this nonparametric approach assumes no particular functional form of the output density. We address problems associated with quantized data and finite sample size, and implement efficient maximum likelihood techniques for optimizing the regularizer. We also develop a normalized entropy estimate that is invariant with respect to affine transformations, facilitating optimization of the shape, rather than the scale, of the output density. Kernel density estimates are smooth and differentiable; this makes the derived entropy estimates amenable to manipulation by gradient descent. The resulting weight updates are surprisingly simple and efficient learning rules that operate on pairs of input samples. They can be tuned for data-limited or memory-limited situations, or modified to give a fully online implementation. PMID:15461076
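
    The quantity being optimized can be written in a few lines: a resubstitution estimate of Shannon entropy from a kernel density fit. The sketch below shows only the estimate, not the derived gradient-based learning rules:

    ```python
    import numpy as np
    from scipy.stats import gaussian_kde

    def kde_entropy(x):
        """Resubstitution entropy estimate H ~ -mean(log p_hat(x_i)),
        with p_hat a Gaussian kernel density fit to the samples themselves."""
        p_hat = gaussian_kde(x)(x)
        return -np.mean(np.log(p_hat))

    x = np.random.default_rng(5).normal(size=5000)
    print(kde_entropy(x))  # near 0.5*log(2*pi*e) = 1.4189 for N(0, 1)
    ```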

  8. Nonparametric forecasting of low-dimensional dynamical systems

    NASA Astrophysics Data System (ADS)

    Berry, Tyrus; Giannakis, Dimitrios; Harlim, John

    2015-03-01

    This paper presents a nonparametric modeling approach for forecasting stochastic dynamical systems on low-dimensional manifolds. The key idea is to represent the discrete shift maps on a smooth basis which can be obtained by the diffusion maps algorithm. In the limit of large data, this approach converges to a Galerkin projection of the semigroup solution to the underlying dynamics on a basis adapted to the invariant measure. This approach allows one to quantify uncertainties (in fact, evolve the probability distribution) for nontrivial dynamical systems with equation-free modeling. We verify our approach on various examples, ranging from an inhomogeneous anisotropic stochastic differential equation on a torus to the chaotic three-dimensional Lorenz model and the Niño-3.4 data set, which is used as a proxy of the El Niño Southern Oscillation.

  9. Nonparametric Facial Feature Localization Using Segment-Based Eigenfeatures.

    PubMed

    Choi, Hyun-Chul; Sibbing, Dominik; Kobbelt, Leif

    2016-01-01

    We present a nonparametric facial feature localization method using relative directional information between regularly sampled image segments and facial feature points. Instead of using any iterative parameter optimization technique or search algorithm, our method finds the location of facial feature points by using a weighted concentration of the directional vectors originating from the image segments pointing to the expected facial feature positions. Each directional vector is calculated by linear combination of eigendirectional vectors which are obtained by a principal component analysis of training facial segments in feature space of histogram of oriented gradient (HOG). Our method finds facial feature points very fast and accurately, since it utilizes statistical reasoning from all the training data without need to extract local patterns at the estimated positions of facial features, any iterative parameter optimization algorithm, and any search algorithm. In addition, we can reduce the storage size for the trained model by controlling the energy preserving level of HOG pattern space. PMID:26819588

  11. Bayesian Nonparametric Shrinkage Applied to Cepheid Star Oscillations.

    PubMed

    Berger, James; Jefferys, William; Müller, Peter

    2012-01-01

    Bayesian nonparametric regression with dependent wavelets has dual shrinkage properties: there is shrinkage through a dependent prior put on functional differences, and shrinkage through the setting of most of the wavelet coefficients to zero through Bayesian variable selection methods. The methodology can deal with unequally spaced data and is efficient because of the existence of fast moves in model space for the MCMC computation. The methodology is illustrated on the problem of modeling the oscillations of Cepheid variable stars; these are a class of pulsating variable stars with the useful property that their periods of variability are strongly correlated with their absolute luminosity. Once this relationship has been calibrated, knowledge of the period gives knowledge of the luminosity. This makes these stars useful as "standard candles" for estimating distances in the universe. PMID:24368873

  12. A non-parametric segmentation methodology for oral videocapillaroscopic images.

    PubMed

    Bellavia, Fabio; Cacioppo, Antonino; Lupaşcu, Carmen Alina; Messina, Pietro; Scardina, Giuseppe; Tegolo, Domenico; Valenti, Cesare

    2014-05-01

    We aim to describe a new non-parametric methodology to support the clinician during the diagnostic process of oral videocapillaroscopy to evaluate peripheral microcirculation. Our methodology, mainly based on wavelet analysis and mathematical morphology to preprocess the images, segments them by minimizing the within-class luminosity variance of both capillaries and background. Experiments were carried out on a set of real microphotographs to validate this approach versus handmade segmentations provided by physicians. By using a leave-one-patient-out approach, we pointed out that our methodology is robust, according to precision-recall criteria (average precision and recall are equal to 0.924 and 0.923, respectively) and it acts as a physician in terms of the Jaccard index (mean and standard deviation equal to 0.858 and 0.064, respectively). PMID:24657094

  13. Binary Classifier Calibration Using a Bayesian Non-Parametric Approach

    PubMed Central

    Naeini, Mahdi Pakdaman; Cooper, Gregory F.; Hauskrecht, Milos

    2015-01-01

    Learning probabilistic predictive models that are well calibrated is critical for many prediction and decision-making tasks in data mining. This paper presents two new non-parametric methods for calibrating outputs of binary classification models: a method based on the Bayes optimal selection and a method based on the Bayesian model averaging. The advantage of these methods is that they are independent of the algorithm used to learn a predictive model, and they can be applied in a post-processing step, after the model is learned. This makes them applicable to a wide variety of machine learning models and methods. These calibration methods, as well as other methods, are tested on a variety of datasets in terms of both discrimination and calibration performance. The results show the methods either outperform or are comparable in performance to the state-of-the-art calibration methods. PMID:26613068

  14. Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference

    PubMed Central

    Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.

    2015-01-01

    The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922

  16. Non-parametric transient classification using adaptive wavelets

    NASA Astrophysics Data System (ADS)

    Varughese, Melvin M.; von Sachs, Rainer; Stephanou, Michael; Bassett, Bruce A.

    2015-11-01

    Classifying transients based on multiband light curves is a challenging but crucial problem in the era of Gaia and the Large Synoptic Survey Telescope, since the sheer volume of transients will make spectroscopic classification unfeasible. We present a non-parametric classifier that predicts the transient's class given training data. It implements two novel components: the use of the BAGIDIS wavelet methodology - a characterization of functional data using hierarchical wavelet coefficients - and the introduction of a ranked probability classifier on the wavelet coefficients that handles both the heteroscedasticity of the data and the potential non-representativity of the training set. The classifier is simple to implement, and a major advantage of the BAGIDIS wavelets is that they are translation invariant; hence, BAGIDIS does not need the light curves to be aligned to extract features. Further, BAGIDIS is non-parametric, so it can be used effectively in blind searches for new objects. We demonstrate the effectiveness of our classifier on the Supernova Photometric Classification Challenge, correctly classifying supernova light curves as Type Ia or non-Ia. We train our classifier on the spectroscopically confirmed subsample (which is not representative) and show that it works well for supernovae with observed light-curve time spans greater than 100 d (roughly 55 per cent of the data set). For such data, we obtain a Ia efficiency of 80.5 per cent and a purity of 82.4 per cent, yielding a highly competitive challenge score of 0.49. This indicates that our `model-blind' approach may be particularly suitable for the general classification of astronomical transients in the era of large synoptic sky surveys.
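
    A hedged stand-in for the feature-extraction step: the sketch below uses an ordinary discrete wavelet transform, which, unlike the BAGIDIS coefficients, is not translation invariant, so the light curves would first need alignment and resampling onto a regular grid. The wavelet choice and decomposition level are illustrative.

    ```python
    import numpy as np
    import pywt

    def wavelet_features(flux, wavelet="haar", level=4):
        """Concatenated multilevel wavelet coefficients of a light curve
        sampled on a regular grid (a stand-in for BAGIDIS features)."""
        coeffs = pywt.wavedec(flux, wavelet, level=level)
        return np.concatenate(coeffs)
    ```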

  17. Hyperspectral image segmentation using a cooperative nonparametric approach

    NASA Astrophysics Data System (ADS)

    Taher, Akar; Chehdi, Kacem; Cariou, Claude

    2013-10-01

    In this paper, a new unsupervised nonparametric cooperative and adaptive hyperspectral image segmentation approach is presented. The hyperspectral images are partitioned band by band in parallel, and the intermediate classification results are evaluated and fused to obtain the final segmentation result. Two unsupervised nonparametric segmentation methods are used in parallel cooperation, namely the fuzzy c-means (FCM) method and the Linde-Buzo-Gray (LBG) algorithm, to segment each band of the image. The originality of the approach lies firstly in its local adaptation to the type of regions in an image (textured, non-textured), and secondly in the introduction of several levels of evaluation and validation of intermediate segmentation results before obtaining the final partitioning of the image. To manage similar or conflicting results issued from the two classification methods, we gradually introduce various assessment steps that exploit the information of each spectral band and its adjacent bands, and finally the information of all the spectral bands. In our approach, the detected textured and non-textured regions are treated separately from the feature extraction step through to the final classification results. This approach was first evaluated on a large number of monocomponent images constructed from the Brodatz album, and then on two real applications using, respectively, a multispectral image for cedar tree detection in the region of Baabdat (Lebanon) and a hyperspectral image for identification of invasive and non-invasive vegetation in the region of Cieza (Spain). The correct classification rate (CCR) for the first application is over 97%, and for the second application the average correct classification rate (ACCR) is over 99%.
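
    For concreteness, one of the two cooperating classifiers, fuzzy c-means, fits in a few lines; the adaptive evaluation and fusion machinery described above is not reproduced, and all names here are illustrative.

    ```python
    import numpy as np

    def fuzzy_cmeans(X, c, m=2.0, n_iter=100, seed=0):
        """Plain fuzzy c-means on feature vectors X (n_samples, n_features).

        Returns the membership matrix U (n_samples, c) and the cluster
        centres, alternating the standard centre and membership updates.
        """
        rng = np.random.default_rng(seed)
        U = rng.dirichlet(np.ones(c), size=len(X))
        for _ in range(n_iter):
            W = U ** m
            centres = (W.T @ X) / W.sum(0)[:, None]
            d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=-1)
            d = np.maximum(d, 1e-12)           # guard exact centre hits
            inv = d ** (-2.0 / (m - 1.0))
            U = inv / inv.sum(1, keepdims=True)
        return U, centres
    ```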

  18. A Covariance Generation Methodology for Fission Product Yields

    NASA Astrophysics Data System (ADS)

    Terranova, N.; Serot, O.; Archier, P.; Vallet, V.; De Saint Jean, C.; Sumini, M.

    2016-03-01

    Recent safety and economic concerns for modern nuclear reactor applications have driven strong interest in improving and completing basic nuclear data evaluations. It has become clear that the accuracy of predictive simulation models is strongly affected by our knowledge of the input data. Strong efforts have therefore been made to improve nuclear data and to generate complete and reliable uncertainty information capable of supporting proper uncertainty propagation on integral reactor parameters. Since modern nuclear data banks (such as JEFF-3.1.1 and ENDF/B-VII.1) give no correlations for fission yields, in the present work we propose a covariance generation methodology for fission product yields. The main goal is to reproduce the existing European library and to add covariance information to allow proper uncertainty propagation in depletion and decay heat calculations. To do so, we adopted the Generalized Least Square Method (GLSM) implemented in CONRAD (COde for Nuclear Reaction Analysis and Data assimilation), developed at CEA-Cadarache. Theoretical values employed in the Bayesian parameter adjustment are delivered by a convolution of different models representing several quantities in fission yield calculations: the Brosa fission modes for the pre-neutron mass distribution, a simplified Gaussian model for the prompt neutron emission probability, the Wahl systematics for the charge distribution, and the Madland-England model for the isomeric ratio. Results are presented for the thermal fission of U-235, Pu-239 and Pu-241.
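
    For reference, the generalized least-squares adjustment has the familiar Bayesian update form. In generic notation (not CONRAD's own): prior parameters $x_0$ with covariance $M_0$, measurements $y$ with covariance $V$, model $t(x)$ and sensitivities $G = \partial t / \partial x$:

    ```latex
    \begin{align}
      \hat{x} &= x_0 + M_0 G^{T}\,\bigl(G M_0 G^{T} + V\bigr)^{-1}\,\bigl(y - t(x_0)\bigr), \\
      \hat{M} &= M_0 - M_0 G^{T}\,\bigl(G M_0 G^{T} + V\bigr)^{-1}\,G M_0 .
    \end{align}
    ```

    The posterior covariance $\hat{M}$ is what supplies the fission-yield correlations that the evaluated libraries currently lack.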

  19. SCALE-6 Sensitivity/Uncertainty Methods and Covariance Data

    SciTech Connect

    Williams, Mark L; Rearden, Bradley T

    2008-01-01

    Computational methods and data used for sensitivity and uncertainty analysis within the SCALE nuclear analysis code system are presented. The methodology used to calculate sensitivity coefficients and similarity coefficients and to perform nuclear data adjustment is discussed. A description is provided of the SCALE-6 covariance library based on ENDF/B-VII and other nuclear data evaluations, supplemented by 'low-fidelity' approximate covariances. SCALE (Standardized Computer Analyses for Licensing Evaluation) is a modular code system developed by Oak Ridge National Laboratory (ORNL) to perform calculations for criticality safety, reactor physics, and radiation shielding applications. SCALE calculations typically use sequences that execute a predefined series of executable modules to compute particle fluxes and responses such as the critical multiplication factor. SCALE also includes modules for sensitivity and uncertainty (S/U) analysis of calculated responses. The S/U codes in SCALE are collectively referred to as TSUNAMI (Tools for Sensitivity and UNcertainty Analysis Methodology Implementation). SCALE-6, scheduled for release in 2008, contains significant new capabilities, including important enhancements in S/U methods and data. The main functions of TSUNAMI are to (a) compute nuclear data sensitivity coefficients and response uncertainties, (b) establish similarity between benchmark experiments and design applications, and (c) reduce uncertainty in calculated responses by consolidating integral benchmark experiments. TSUNAMI includes easy-to-use graphical user interfaces for defining problem input and viewing three-dimensional (3D) geometries, as well as an integrated plotting package.
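
    The quantities computed in function (a) follow the usual first-order pattern. In a generic formulation (not SCALE's exact notation), a relative sensitivity coefficient per nuclear datum $\sigma$ feeds a "sandwich rule" that propagates the relative data covariance matrix $C$ to the response uncertainty:

    ```latex
    \begin{equation}
      S_{\sigma} = \frac{\sigma}{k}\,\frac{\partial k}{\partial \sigma},
      \qquad
      \left(\frac{\Delta k}{k}\right)^{2} = S^{T} C\, S ,
    \end{equation}
    ```

    where $S$ collects the sensitivity coefficients for all data. The similarity assessment in function (b) compares such sensitivity profiles between benchmark experiments and design applications.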

  20. Spacetime states and covariant quantum theory

    NASA Astrophysics Data System (ADS)

    Reisenberger, Michael; Rovelli, Carlo

    2002-06-01

    In its usual presentation, classical mechanics appears to give time a very special role. But it is well known that mechanics can be formulated so as to treat the time variable on the same footing as the other variables in the extended configuration space. Such covariant formulations are natural for relativistic gravitational systems, where general covariance conflicts with the notion of a preferred physical-time variable. The standard presentation of quantum mechanics, in turn, again gives time a very special role, raising well known difficulties for quantum gravity. Is there a covariant form of (canonical) quantum mechanics? We observe that the preferred role of time in quantum theory is the consequence of an idealization: that measurements are instantaneous. Canonical quantum theory can be given a covariant form by dropping this idealization. States prepared by noninstantaneous measurements are described by "spacetime smeared states." The theory can be formulated in terms of these states, without making any reference to a special time variable. The quantum dynamics is expressed in terms of the propagator, an object covariantly defined on the extended configuration space.

  1. Isavuconazole Population Pharmacokinetic Analysis Using Nonparametric Estimation in Patients with Invasive Fungal Disease (Results from the VITAL Study)

    PubMed Central

    Kovanda, Laura L.; Desai, Amit V.; Lu, Qiaoyang; Townsend, Robert W.; Akhtar, Shahzad; Bonate, Peter

    2016-01-01

    Isavuconazonium sulfate (Cresemba; Astellas Pharma Inc.), a water-soluble prodrug of the triazole antifungal agent isavuconazole, is available for the treatment of invasive aspergillosis (IA) and invasive mucormycosis. A population pharmacokinetic (PPK) model was constructed using nonparametric estimation to compare the pharmacokinetic (PK) behaviors of isavuconazole in patients treated in the phase 3 VITAL open-label clinical trial, which evaluated the efficacy and safety of the drug for treatment of renally impaired IA patients and patients with invasive fungal disease (IFD) caused by emerging molds, yeasts, and dimorphic fungi. Covariates examined were body mass index (BMI), weight, race, impact of estimated glomerular filtration rate (eGFR) on clearance (CL), and impact of weight on volume. PK parameters were compared based on IFD type and other patient characteristics. Simulations were performed to describe the MICs covered by the clinical dosing regimen. Concentrations (n = 458) from 136 patients were used to construct a 2-compartment model (first-order absorption compartment and central compartment). Weight-related covariates affected clearance, but eGFR did not. PK parameters and intersubject variability of CL were similar across different IFD groups and populations. Target attainment analyses demonstrated that the clinical dosing regimen would be sufficient for total drug area under the concentration-time curve (AUC)/MIC targets ranging from 50.5 for Aspergillus spp. (up to the CLSI MIC of 0.5 mg/liter) to 270 and 5,053 for Candida albicans (up to MICs of 0.125 and 0.004 mg/liter, respectively) and 312 for non-albicans Candida spp. (up to a MIC of 0.125 mg/liter). The estimations for Candida spp. were exploratory considering that no patients with Candida infections were included in the current analyses. (The VITAL trial is registered at ClinicalTrials.gov under number NCT00634049.) PMID:27185799

  2. Isavuconazole Population Pharmacokinetic Analysis Using Nonparametric Estimation in Patients with Invasive Fungal Disease (Results from the VITAL Study).

    PubMed

    Kovanda, Laura L; Desai, Amit V; Lu, Qiaoyang; Townsend, Robert W; Akhtar, Shahzad; Bonate, Peter; Hope, William W

    2016-08-01

    Isavuconazonium sulfate (Cresemba; Astellas Pharma Inc.), a water-soluble prodrug of the triazole antifungal agent isavuconazole, is available for the treatment of invasive aspergillosis (IA) and invasive mucormycosis. A population pharmacokinetic (PPK) model was constructed using nonparametric estimation to compare the pharmacokinetic (PK) behaviors of isavuconazole in patients treated in the phase 3 VITAL open-label clinical trial, which evaluated the efficacy and safety of the drug for treatment of renally impaired IA patients and patients with invasive fungal disease (IFD) caused by emerging molds, yeasts, and dimorphic fungi. Covariates examined were body mass index (BMI), weight, race, impact of estimated glomerular filtration rate (eGFR) on clearance (CL), and impact of weight on volume. PK parameters were compared based on IFD type and other patient characteristics. Simulations were performed to describe the MICs covered by the clinical dosing regimen. Concentrations (n = 458) from 136 patients were used to construct a 2-compartment model (first-order absorption compartment and central compartment). Weight-related covariates affected clearance, but eGFR did not. PK parameters and intersubject variability of CL were similar across different IFD groups and populations. Target attainment analyses demonstrated that the clinical dosing regimen would be sufficient for total drug area under the concentration-time curve (AUC)/MIC targets ranging from 50.5 for Aspergillus spp. (up to the CLSI MIC of 0.5 mg/liter) to 270 and 5,053 for Candida albicans (up to MICs of 0.125 and 0.004 mg/liter, respectively) and 312 for non-albicans Candida spp. (up to a MIC of 0.125 mg/liter). The estimations for Candida spp. were exploratory considering that no patients with Candida infections were included in the current analyses. (The VITAL trial is registered at ClinicalTrials.gov under number NCT00634049.) PMID:27185799
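
    A sketch of the structural model named above, a two-compartment disposition with first-order absorption, followed by the AUC/MIC comparison used in target attainment analysis. All parameter values and the dose are made up for illustration; they are not the published isavuconazole estimates.

    ```python
    import numpy as np
    from scipy.integrate import solve_ivp

    ka, CL, Vc, Q, Vp = 1.0, 2.5, 200.0, 10.0, 250.0  # 1/h, L/h, L, L/h, L (illustrative)
    dose = 372.0                                       # mg, illustrative

    def rhs(t, a):
        gut, cen, per = a                              # amounts in each compartment
        return [
            -ka * gut,
            ka * gut - (CL / Vc) * cen - (Q / Vc) * cen + (Q / Vp) * per,
            (Q / Vc) * cen - (Q / Vp) * per,
        ]

    t = np.linspace(0.0, 24.0, 481)
    sol = solve_ivp(rhs, (0.0, 24.0), [dose, 0.0, 0.0], t_eval=t)
    conc = sol.y[1] / Vc                               # plasma concentration, mg/L
    auc = np.trapz(conc, t)                            # mg*h/L over 24 h
    mic = 0.5                                          # mg/L, e.g. the CLSI Aspergillus MIC
    print(f"AUC/MIC = {auc / mic:.1f} (Aspergillus target cited above: 50.5)")
    ```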

  3. FAST NEUTRON COVARIANCES FOR EVALUATED DATA FILES.

    SciTech Connect

    HERMAN, M.; OBLOZINSKY, P.; ROCHMAN, D.; KAWANO, T.; LEAL, L.

    2006-06-05

    We describe the implementation of the KALMAN code in the EMPIRE system and present the first covariance data generated for Gd and Ir isotopes. A complete set of covariances, over the full energy range, was produced for the chain of eight gadolinium isotopes for the total, elastic, capture, total inelastic (MT=4), (n,2n), (n,p) and (n,alpha) reactions. Our correlation matrices, based on a combination of model calculations and experimental data, are characterized by positive mid-range and negative long-range correlations. They differ from model-generated covariances, which tend to show strong positive long-range correlations, and from those determined solely from experimental data, which result in nearly diagonal matrices. We have studied the shapes of the correlation matrices obtained in the calculations and interpreted them in terms of the underlying reaction models. An important result of this study is the prediction of narrow energy ranges with extremely small uncertainties for certain reactions (e.g., total and elastic).
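
    The contrast drawn above between model-generated and experiment-only covariances has a simple schematic origin: model-based cross sections at all energies inherit correlations from a handful of shared parameters. In generic notation (not EMPIRE's), with $S_{ij} = \partial \sigma(E_i) / \partial p_j$ the sensitivities to model parameters and $M$ the parameter covariance:

    ```latex
    \begin{equation}
      C_{ik} = \sum_{j,l} S_{ij}\, M_{jl}\, S_{kl},
      \qquad
      \operatorname{Corr}_{ik} = \frac{C_{ik}}{\sqrt{C_{ii}\, C_{kk}}} ,
    \end{equation}
    ```

    which is why purely model-driven evaluations tend toward strong positive long-range correlations, while purely experimental ones, with independent point-wise errors, stay nearly diagonal.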

  4. Gram-Schmidt algorithms for covariance propagation

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.; Bierman, G. J.

    1977-01-01

    This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root-free factorization $P = UDU^{T}$, where $U$ is unit upper triangular and $D$ is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and coloured process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.

  5. Gram-Schmidt algorithms for covariance propagation

    NASA Technical Reports Server (NTRS)

    Thornton, C. L.; Bierman, G. J.

    1975-01-01

    This paper addresses the time propagation of triangular covariance factors. Attention is focused on the square-root-free factorization $P = UDU^{T}$, where $U$ is unit upper triangular and $D$ is diagonal. An efficient and reliable algorithm for U-D propagation is derived which employs Gram-Schmidt orthogonalization. Partitioning the state vector to distinguish bias and colored process noise parameters increases mapping efficiency. Cost comparisons of the U-D, Schmidt square-root covariance and conventional covariance propagation methods are made using weighted arithmetic operation counts. The U-D time update is shown to be less costly than the Schmidt method; and, except in unusual circumstances, it is within 20% of the cost of conventional propagation.
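
    A minimal illustration of the factors involved, making no claim about the paper's actual recursion: the sketch below factors $P$ directly and performs a brute-force time update by refactoring, whereas the method described above propagates $U$ and $D$ with weighted Gram-Schmidt orthogonalization without ever forming $P$.

    ```python
    import numpy as np

    def udu_factorize(P):
        """Factor symmetric positive-definite P as P = U D U^T,
        with U unit upper triangular and D = diag(d)."""
        n = P.shape[0]
        U, d, P = np.eye(n), np.zeros(n), P.copy()
        for j in range(n - 1, -1, -1):
            d[j] = P[j, j]
            U[:j, j] = P[:j, j] / d[j]
            P[:j, :j] -= d[j] * np.outer(U[:j, j], U[:j, j])
        return U, d

    def ud_time_update(U, d, Phi, G, Q):
        """Brute-force time update P+ = Phi P Phi^T + G Q G^T in factored form."""
        P = Phi @ (U * d) @ U.T @ Phi.T + G @ Q @ G.T
        return udu_factorize(P)
    ```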

  6. A violation of the covariant entropy bound?

    NASA Astrophysics Data System (ADS)

    Masoumi, Ali; Mathur, Samir D.

    2015-04-01

    Several arguments suggest that the entropy density at high energy density $\rho$ should be given by the expression $s = K\sqrt{\rho/G}$, where $K$ is a constant of order unity. On the other hand, the covariant entropy bound requires that the entropy on a light sheet be bounded by $A/4G$, where $A$ is the area of the boundary of the sheet. We find that in a suitably chosen cosmological geometry, the above expression for $s$ violates the covariant entropy bound. We consider different possible explanations for this fact, in particular, the possibility that entropy bounds should be defined in terms of volumes of regions rather than areas of surfaces.

  7. Covariance Analysis of Gamma Ray Spectra

    SciTech Connect

    Trainham, R.; Tinsley, J.

    2013-01-01

    The covariance method exploits fluctuations in signals to recover information encoded in correlations which are usually lost when signal averaging occurs. In nuclear spectroscopy it can be regarded as a generalization of the coincidence technique. The method can be used to extract signal from uncorrelated noise, to separate overlapping spectral peaks, to identify escape peaks, to reconstruct spectra from Compton continua, and to generate secondary spectral fingerprints. We discuss a few statistical considerations of the covariance method and present experimental examples of its use in gamma spectroscopy.

  8. Covariance analysis of gamma ray spectra

    SciTech Connect

    Trainham, R.; Tinsley, J.

    2013-01-15

    The covariance method exploits fluctuations in signals to recover information encoded in correlations which are usually lost when signal averaging occurs. In nuclear spectroscopy it can be regarded as a generalization of the coincidence technique. The method can be used to extract signal from uncorrelated noise, to separate overlapping spectral peaks, to identify escape peaks, to reconstruct spectra from Compton continua, and to generate secondary spectral fingerprints. We discuss a few statistical considerations of the covariance method and present experimental examples of its use in gamma spectroscopy.
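
    The computational core is small: collect many short acquisitions instead of one long one, and inspect the channel-by-channel covariance. A minimal sketch (the array shape and acquisition scheme are assumptions of this illustration):

    ```python
    import numpy as np

    def covariance_map(spectra):
        """Channel-by-channel covariance of repeated acquisitions.

        `spectra` is (n_acquisitions, n_channels) counts. Off-diagonal
        structure flags channels that fluctuate together (coincident gamma
        rays, escape peaks, Compton continua), information that summing
        everything into one averaged spectrum would discard.
        """
        return np.cov(spectra, rowvar=False)   # (n_channels, n_channels)
    ```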

  9. Sparse Multivariate Regression With Covariance Estimation

    PubMed Central

    Rothman, Adam J.; Levina, Elizaveta; Zhu, Ji

    2014-01-01

    We propose a procedure for constructing a sparse estimator of a multivariate regression coefficient matrix that accounts for correlation of the response variables. This method, which we call multivariate regression with covariance estimation (MRCE), involves penalized likelihood with simultaneous estimation of the regression coefficients and the covariance structure. An efficient optimization algorithm and a fast approximation are developed for computing MRCE. Using simulation studies, we show that the proposed method outperforms relevant competitors when the responses are highly correlated. We also apply the new method to a finance example on predicting asset returns. An R package containing this dataset and code for computing MRCE and its approximation is available online. PMID:24963268
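
    Up to notation, the penalized negative log-likelihood being minimized couples a lasso penalty on the coefficients with a graphical-lasso-style penalty on the inverse error covariance $\Omega = \Sigma^{-1}$:

    ```latex
    \begin{equation}
      (\hat{B}, \hat{\Omega}) = \arg\min_{B,\; \Omega \succ 0}\;
      \frac{1}{n}\,\operatorname{tr}\!\left[(Y - XB)\,\Omega\,(Y - XB)^{T}\right]
      - \log\lvert\Omega\rvert
      + \lambda_{1} \sum_{j \ne k} \lvert\omega_{jk}\rvert
      + \lambda_{2} \sum_{j,k} \lvert b_{jk}\rvert ,
    \end{equation}
    ```

    with $\lambda_1$ controlling sparsity in the error precision matrix and $\lambda_2$ sparsity in the coefficient matrix $B$.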

  10. Parametric number covariance in quantum chaotic spectra

    NASA Astrophysics Data System (ADS)

    Vinayak; Kumar, Sandeep; Pandey, Akhilesh

    2016-03-01

    We study spectral parametric correlations in quantum chaotic systems and introduce the number covariance as a measure of such correlations. We derive analytic results for the classical random matrix ensembles using the binary correlation method and obtain compact expressions for the covariance. We illustrate the universality of this measure by presenting the spectral analysis of the quantum kicked rotors for the time-reversal invariant and time-reversal noninvariant cases. A local version of the parametric number variance introduced earlier is also investigated.
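
    The abstract does not spell out the definition, and the paper's normalization may differ, but a natural reading of a "number covariance" is the covariance of level counts at two values of the external parameter: with $n(x)$ the number of levels in a fixed energy window when the parameter equals $x$,

    ```latex
    \begin{equation}
      \Sigma^{2}(x_1, x_2)
      = \overline{n(x_1)\, n(x_2)} - \overline{n(x_1)}\;\overline{n(x_2)} ,
    \end{equation}
    ```

    which reduces to the ordinary number variance when $x_1 = x_2$.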