An Empirical Study of Eight Nonparametric Tests in Hierarchical Regression.
ERIC Educational Resources Information Center
Harwell, Michael; Serlin, Ronald C.
When normality does not hold, nonparametric tests represent an important data-analytic alternative to parametric tests. However, the use of nonparametric tests in educational research has been limited by the absence of easily performed tests for complex experimental designs and analyses, such as factorial designs and multiple regression analyses,…
Theodorsson-Norheim, E
1986-08-01
Multiple t tests at a fixed p level are frequently used to analyse biomedical data where analysis of variance followed by multiple comparisons, or adjustment of the p values according to Bonferroni, would be more appropriate. The Kruskal-Wallis test is a nonparametric 'analysis of variance' which may be used to compare several independent samples. The present program is written in an elementary subset of BASIC and will perform the Kruskal-Wallis test, followed by multiple comparisons between the groups, on practically any computer programmable in BASIC.
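The workflow this program implements — an omnibus Kruskal-Wallis test followed by pairwise comparisons — can be sketched in modern terms with scipy. This is not the paper's BASIC code: the data are made up, and a Bonferroni adjustment of pairwise Mann-Whitney tests stands in for the program's own comparison procedure.

```python
# Kruskal-Wallis omnibus test followed by Bonferroni-adjusted pairwise
# Mann-Whitney comparisons (illustrative data, not from the paper).
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

groups = {
    "A": [1.1, 2.3, 1.9, 2.8, 2.0],
    "B": [2.9, 3.8, 3.1, 4.0, 3.5],
    "C": [5.0, 4.6, 5.8, 5.2, 4.9],
}

# Omnibus test across all groups
H, p_omnibus = kruskal(*groups.values())

# Pairwise comparisons only if the omnibus test rejects
pairs = list(combinations(groups, 2))
adjusted = {}
if p_omnibus < 0.05:
    for a, b in pairs:
        _, p = mannwhitneyu(groups[a], groups[b], alternative="two-sided")
        adjusted[(a, b)] = min(1.0, p * len(pairs))  # Bonferroni adjustment
```

Running the pairwise step only after a significant omnibus result mirrors the two-stage structure described in the abstract.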
A SAS(®) macro implementation of a multiple comparison post hoc test for a Kruskal-Wallis analysis.
Elliott, Alan C; Hynan, Linda S
2011-04-01
The Kruskal-Wallis (KW) nonparametric analysis of variance is often used instead of a standard one-way ANOVA when data are from a suspected non-normal population. The KW omnibus procedure tests for some differences between groups, but provides no specific post hoc pairwise comparisons. This paper provides a SAS(®) macro implementation of a multiple comparison test based on significant Kruskal-Wallis results from the SAS NPAR1WAY procedure. The implementation is designed for up to 20 groups at a user-specified alpha significance level. A Monte Carlo simulation compared this nonparametric procedure to commonly used parametric multiple comparison tests. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
An entropy-based nonparametric test for the validation of surrogate endpoints.
Miao, Xiaopeng; Wang, Yong-Cheng; Gangopadhyay, Ashis
2012-06-30
We present a nonparametric test to validate surrogate endpoints based on a measure of divergence and random permutation. This test is a proposal to directly verify the Prentice statistical definition of surrogacy. The test does not impose distributional assumptions on the endpoints, and it is robust to model misspecification. Our simulation study shows that the proposed nonparametric test outperforms the practical test of the Prentice criterion in terms of both robustness of size and power. We also evaluate the performance of three leading methods that attempt to quantify the effect of surrogate endpoints. The proposed method is applied to validate magnetic resonance imaging lesions as the surrogate endpoint for clinical relapses in a multiple sclerosis trial. Copyright © 2012 John Wiley & Sons, Ltd.
A Nonparametric Approach for Assessing Goodness-of-Fit of IRT Models in a Mixed Format Test
ERIC Educational Resources Information Center
Liang, Tie; Wells, Craig S.
2015-01-01
Investigating the fit of a parametric model plays a vital role in validating an item response theory (IRT) model. An area that has received little attention is the assessment of multiple IRT models used in a mixed-format test. The present study extends the nonparametric approach, proposed by Douglas and Cohen (2001), to assess model fit of three…
A Powerful Test for Comparing Multiple Regression Functions.
Maity, Arnab
2012-09-01
In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test to other nonparametric regression setups, e.g., nonparametric logistic regression, where the loglikelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
Lee, Minjung; Dignam, James J.; Han, Junhee
2014-01-01
We propose a nonparametric approach for cumulative incidence estimation when causes of failure are unknown or missing for some subjects. Under the missing at random assumption, we estimate the cumulative incidence function using multiple imputation methods. We develop asymptotic theory for the cumulative incidence estimators obtained from multiple imputation methods. We also discuss how to construct confidence intervals for the cumulative incidence function and perform a test for comparing the cumulative incidence functions in two samples with missing cause of failure. Through simulation studies, we show that the proposed methods perform well. The methods are illustrated with data from a randomized clinical trial in early stage breast cancer. PMID:25043107
Power calculation for comparing diagnostic accuracies in a multi-reader, multi-test design.
Kim, Eunhee; Zhang, Zheng; Wang, Youdan; Zeng, Donglin
2014-12-01
Receiver operating characteristic (ROC) analysis is widely used to evaluate the performance of diagnostic tests with continuous or ordinal responses. A popular study design for assessing the accuracy of diagnostic tests involves multiple readers interpreting multiple diagnostic test results, called the multi-reader, multi-test design. Although several different approaches to analyzing data from this design exist, few address sample size and power issues. In this article, we develop a power formula to compare the correlated areas under the ROC curves (AUCs) in a multi-reader, multi-test design. We present a nonparametric approach to estimate and compare the correlated AUCs by extending DeLong et al.'s (1988, Biometrics 44, 837-845) approach. A power formula is derived based on the asymptotic distribution of the nonparametric AUCs. Simulation studies are conducted to demonstrate the performance of the proposed power formula, and an example is provided to illustrate the proposed procedure. © 2014, The International Biometric Society.
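The nonparametric AUC at the core of DeLong-style comparisons is the Mann-Whitney probability that a randomly chosen positive case scores higher than a randomly chosen negative case (AUC = U / (n1·n2), with ties counted as 1/2). A minimal sketch with illustrative scores, not the paper's multi-reader estimator:

```python
import numpy as np

def nonparametric_auc(pos, neg):
    """Empirical AUC: the fraction of (positive, negative) pairs the test
    ranks correctly, counting ties as 1/2 -- the Mann-Whitney U / (n1*n2)
    identity used by DeLong-type nonparametric AUC estimators."""
    pos, neg = np.asarray(pos, float), np.asarray(neg, float)
    diff = pos[:, None] - neg[None, :]          # all pairwise score differences
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

# Illustrative diagnostic scores for diseased and healthy subjects
auc_example = nonparametric_auc([3.2, 4.1, 5.0, 4.4], [2.0, 3.0, 2.6])
```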
Supratentorial lesions contribute to trigeminal neuralgia in multiple sclerosis.
Fröhlich, Kilian; Winder, Klemens; Linker, Ralf A; Engelhorn, Tobias; Dörfler, Arnd; Lee, De-Hyung; Hilz, Max J; Schwab, Stefan; Seifert, Frank
2018-06-01
Background: It has been proposed that multiple sclerosis lesions afflicting the pontine trigeminal afferents contribute to trigeminal neuralgia in multiple sclerosis. So far, no imaging studies have evaluated interactions between supratentorial lesions and trigeminal neuralgia in multiple sclerosis patients. Methods: We conducted a retrospective study and searched a local database for multiple sclerosis patients with trigeminal neuralgia and controls. Multiple sclerosis lesions were manually outlined and transformed into stereotaxic space. We determined the lesion overlap and performed a voxel-wise subtraction analysis. Secondly, we conducted a voxel-wise non-parametric analysis using the Liebermeister test. Results: From 12,210 multiple sclerosis patient records screened, we identified 41 patients with trigeminal neuralgia. The voxel-wise subtraction analysis yielded associations between trigeminal neuralgia and multiple sclerosis lesions in the pontine trigeminal afferents, as well as larger supratentorial lesion clusters in the contralateral insula and hippocampus. The non-parametric statistical analysis using the Liebermeister test yielded similar areas associated with multiple sclerosis-related trigeminal neuralgia. Conclusions: Our study confirms previous data on associations between multiple sclerosis-related trigeminal neuralgia and pontine lesions, and shows for the first time an association with lesions in the insular region, a region involved in pain processing and endogenous pain modulation.
Multiple Hypothesis Testing for Experimental Gingivitis Based on Wilcoxon Signed Rank Statistics
Preisser, John S.; Sen, Pranab K.; Offenbacher, Steven
2011-01-01
Dental research often involves repeated multivariate outcomes on a small number of subjects, where the interest lies in identifying outcomes that exhibit change in their levels over time and in characterizing the nature of that change. In particular, periodontal research often involves the analysis of molecular mediators of inflammation, for which multivariate parametric methods are highly sensitive to outliers and deviations from Gaussian assumptions. In such settings, nonparametric methods may be favored over parametric ones. Additionally, there is a need for statistical methods that control an overall error rate for multiple hypothesis testing. We review univariate and multivariate nonparametric hypothesis tests and apply them to longitudinal data to assess changes over time in 31 biomarkers measured from the gingival crevicular fluid in 22 subjects in whom gingivitis was induced by temporarily withholding tooth brushing. To identify biomarkers that can be induced to change, multivariate Wilcoxon signed rank tests for a set of four summary measures based upon area under the curve are applied for each biomarker and compared to their univariate counterparts. Multiple hypothesis testing methods with a choice of control of the false discovery rate or strong control of the family-wise error rate are examined. PMID:21984957
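The core recipe — one signed-rank test per biomarker, then family-wise error control across the family of tests — can be sketched as follows. The data are simulated (not the gingivitis study's), the dimensions mirror the abstract (22 subjects, 31 biomarkers), and Bonferroni stands in for the strong FWE-control methods the paper examines.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_subjects, n_biomarkers = 22, 31

# Simulated paired data: baseline vs. induced-gingivitis biomarker levels.
# The first ten biomarkers are given a true shift; the rest are pure noise.
baseline = rng.normal(0.0, 1.0, (n_subjects, n_biomarkers))
followup = baseline + rng.normal(0.0, 1.0, (n_subjects, n_biomarkers))
followup[:, :10] += 1.5

# One Wilcoxon signed-rank test per biomarker ...
pvals = np.array([wilcoxon(followup[:, j], baseline[:, j]).pvalue
                  for j in range(n_biomarkers)])

# ... with Bonferroni control of the family-wise error rate
significant = pvals < 0.05 / n_biomarkers
```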
Non-parametric combination and related permutation tests for neuroimaging.
Winkler, Anderson M; Webster, Matthew A; Brooks, Jonathan C; Tracey, Irene; Smith, Stephen M; Nichols, Thomas E
2016-04-01
In this work, we show how permutation methods can be applied to combination analyses such as those that include multiple imaging modalities, multiple data acquisitions of the same modality, or simply multiple hypotheses on the same data. Using the well-known definition of union-intersection tests and closed testing procedures, we use synchronized permutations to correct for such multiplicity of tests, allowing flexibility to integrate imaging data with different spatial resolutions, surface and/or volume-based representations of the brain, as well as non-imaging data. For the problem of joint inference, we propose and evaluate a modification of the recently introduced non-parametric combination (NPC) methodology, such that instead of a two-phase algorithm and large data storage requirements, the inference can be performed in a single phase, with reasonable computational demands. The method compares favorably to classical multivariate tests (such as MANCOVA), even when the latter is assessed using permutations. We also evaluate, in the context of permutation tests, various combining methods that have been proposed in the past decades, and identify those that provide the best control over error rate and power across a range of situations. We show that one of these, the method of Tippett, provides a link between correction for the multiplicity of tests and their combination. Finally, we discuss how the correction can solve certain problems of multiple comparisons in one-way ANOVA designs, and how the combination is distinguished from conjunctions, even though both can be assessed using permutation tests. We also provide a common algorithm that accommodates combination and correction. © 2016 The Authors. Human Brain Mapping published by Wiley Periodicals, Inc.
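The essence of non-parametric combination — shuffle group labels *synchronously* across outcomes so their dependence is preserved, compute per-outcome permutation p-values, then combine and calibrate the combined statistic against the same permutations — can be sketched in miniature. This toy uses two simulated outcomes and Fisher's combining function; it is a sketch of the idea, not the authors' single-phase NPC algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two outcomes ("modalities") on the same 40 subjects, 20 per group.
n = 40
labels = np.repeat([0, 1], 20)
y1 = rng.normal(0.0, 1.0, n) + 0.9 * labels   # outcome with a group effect
y2 = 0.5 * y1 + rng.normal(0.0, 1.0, n)       # correlated second outcome

def stat(y, lab):
    """Absolute difference of group means -- the partial test statistic."""
    return abs(y[lab == 1].mean() - y[lab == 0].mean())

n_perm = 2000
t_obs = np.array([stat(y1, labels), stat(y2, labels)])
t_perm = np.empty((n_perm, 2))
for b in range(n_perm):
    perm = rng.permutation(labels)            # ONE shuffle used for both outcomes
    t_perm[b] = [stat(y1, perm), stat(y2, perm)]

# Partial (per-outcome) permutation p-values, combined with Fisher's method
p_partial = (1 + (t_perm >= t_obs).sum(0)) / (1 + n_perm)
fisher = lambda p: -2 * np.log(p).sum(-1)
t_comb = fisher(p_partial)

# Calibrate the combined statistic with the same synchronized permutations:
# each permuted dataset gets its own partial p-values, then its Fisher score.
p_perm = (1 + (t_perm[None] >= t_perm[:, None]).sum(1)) / (1 + n_perm)
p_npc = (1 + (fisher(p_perm) >= t_comb).sum()) / (1 + n_perm)
```

Because both outcomes are re-labelled with the same shuffle, the joint null distribution respects their correlation, which a naive per-outcome permutation would not.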
Biostatistics Series Module 3: Comparing Groups: Numerical Variables.
Hazra, Avijit; Gogtay, Nithya
2016-01-01
Numerical data that are normally distributed can be analyzed with parametric tests, that is, tests which are based on the parameters that define a normal distribution curve. If the distribution is uncertain, the data can be plotted as a normal probability plot and visually inspected, or tested for normality using one of a number of goodness of fit tests, such as the Kolmogorov-Smirnov test. The widely used Student's t-test has three variants. The one-sample t-test is used to assess if a sample mean (as an estimate of the population mean) differs significantly from a given population mean. The means of two independent samples may be compared for a statistically significant difference by the unpaired or independent samples t-test. If the data sets are related in some way, their means may be compared by the paired or dependent samples t-test. The t-test should not be used to compare the means of more than two groups. Although it is possible to compare groups in pairs, when there are more than two groups, this will increase the probability of a Type I error. The one-way analysis of variance (ANOVA) is employed to compare the means of three or more independent data sets that are normally distributed. Multiple measurements from the same set of subjects cannot be treated as separate, unrelated data sets. Comparison of means in such a situation requires repeated measures ANOVA. It is to be noted that while a multiple group comparison test such as ANOVA can point to a significant difference, it does not identify exactly between which two groups the difference lies. To do this, multiple group comparison needs to be followed up by an appropriate post hoc test. An example is Tukey's honestly significant difference test following ANOVA. If the assumptions for parametric tests are not met, there are nonparametric alternatives for comparing data sets. These include the Mann-Whitney U-test as the nonparametric counterpart of the unpaired Student's t-test, the Wilcoxon signed-rank test as the counterpart of the paired Student's t-test, the Kruskal-Wallis test as the nonparametric equivalent of ANOVA, and Friedman's test as the counterpart of repeated measures ANOVA.
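The parametric tests named above and their nonparametric counterparts map directly onto scipy.stats functions. The data here are illustrative; the point is the pairing of each parametric call with its nonparametric equivalent.

```python
# Parametric tests and their nonparametric counterparts in scipy.stats.
from scipy import stats

a = [12.1, 14.3, 13.8, 12.9, 15.0, 13.2]
b = [11.0, 12.5, 11.9, 12.2, 10.8, 11.7]
c = [14.8, 15.9, 15.1, 16.2, 14.9, 15.5]

# Two independent groups: unpaired t-test vs. Mann-Whitney U-test
t_p = stats.ttest_ind(a, b).pvalue
u_p = stats.mannwhitneyu(a, b, alternative="two-sided").pvalue

# Two related samples: paired t-test vs. Wilcoxon signed-rank test
tp_p = stats.ttest_rel(a, b).pvalue
w_p = stats.wilcoxon(a, b).pvalue

# Three or more independent groups: one-way ANOVA vs. Kruskal-Wallis test
f_p = stats.f_oneway(a, b, c).pvalue
kw_p = stats.kruskal(a, b, c).pvalue

# Repeated measures on the same subjects: Friedman test as the
# nonparametric counterpart of repeated measures ANOVA
fr_p = stats.friedmanchisquare(a, b, c).pvalue
```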
Pataky, Todd C; Vanrenterghem, Jos; Robinson, Mark A
2015-05-01
Biomechanical processes are often manifested as one-dimensional (1D) trajectories. It has been shown that 1D confidence intervals (CIs) are biased when based on 0D statistical procedures, and the non-parametric 1D bootstrap CI has emerged in the Biomechanics literature as a viable solution. The primary purpose of this paper was to clarify that, for 1D biomechanics datasets, the distinction between 0D and 1D methods is much more important than the distinction between parametric and non-parametric procedures. A secondary purpose was to demonstrate that a parametric equivalent to the 1D bootstrap exists in the form of a random field theory (RFT) correction for multiple comparisons. To emphasize these points we analyzed six datasets consisting of force and kinematic trajectories in one-sample, paired, two-sample and regression designs. Results showed, first, that the 1D bootstrap and other 1D non-parametric CIs were qualitatively identical to RFT CIs, and all were very different from 0D CIs. Second, 1D parametric and 1D non-parametric hypothesis testing results were qualitatively identical for all six datasets. Last, we highlight the limitations of 1D CIs by demonstrating that they are complex, design-dependent, and thus non-generalizable. These results suggest that (i) analyses of 1D data based on 0D models of randomness are generally biased unless one explicitly identifies 0D variables before the experiment, and (ii) parametric and non-parametric 1D hypothesis testing provide an unambiguous framework for analysis when one's hypothesis explicitly or implicitly pertains to whole 1D trajectories. Copyright © 2015 Elsevier Ltd. All rights reserved.
Liu, Yuewei; Chen, Weihong
2012-02-01
As a nonparametric method, the Kruskal-Wallis test is widely used to compare three or more independent groups when an ordinal or interval level of data is available, especially when the assumptions of analysis of variance (ANOVA) are not met. If the Kruskal-Wallis statistic is statistically significant, the Nemenyi test is an alternative method for further pairwise multiple comparisons to locate the source of significance. Unfortunately, most popular statistical packages do not integrate the Nemenyi test, which is not easy to calculate by hand. We described the theory and applications of the Kruskal-Wallis and Nemenyi tests, and presented a flexible SAS macro to implement the two tests. The SAS macro was demonstrated by two examples from our cohort study in occupational epidemiology. It provides a useful tool for SAS users to test the differences among three or more independent groups using a nonparametric method.
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists have questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations with small sample sizes. We used a pooled method in the nonparametric bootstrap test that may overcome the problems related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling to the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test, while maintaining the type I error probability for all conditions except Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than the alternatives, and the nonparametric bootstrap test provided a benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling for comparing paired or unpaired means, and for validating one-way analysis of variance results, with non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
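A pooled-resampling bootstrap test of the kind discussed above can be sketched as follows: under the null the two samples come from one distribution, so both bootstrap samples are drawn from the pooled data and the t statistic is recomputed each time. This is an illustrative sketch of the general idea, not the authors' exact procedure.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(2)

def pooled_bootstrap_t_test(x, y, n_boot=4000, rng=rng):
    """Two-sample bootstrap test with pooled resampling: both bootstrap
    samples are drawn from the pooled data (the null of a common
    distribution), and the Welch t statistic is recomputed each time."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    t_obs = ttest_ind(x, y, equal_var=False).statistic
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_boot):
        xb = rng.choice(pooled, size=len(x), replace=True)
        yb = rng.choice(pooled, size=len(y), replace=True)
        count += abs(ttest_ind(xb, yb, equal_var=False).statistic) >= abs(t_obs)
    return (count + 1) / (n_boot + 1)

# Small-sample examples: no true difference vs. a large shift
p_null = pooled_bootstrap_t_test(rng.normal(0, 1, 8), rng.normal(0, 1, 8))
p_shift = pooled_bootstrap_t_test(rng.normal(0, 1, 8), rng.normal(3, 1, 8))
```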
kruX: matrix-based non-parametric eQTL discovery.
Qi, Jianlong; Asl, Hassan Foroughi; Björkegren, Johan; Michoel, Tom
2014-01-14
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.
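The kruX trick of computing many Kruskal-Wallis statistics through matrix multiplication can be illustrated in numpy for a single marker and several traits: rank each trait, form a genotype-group indicator matrix, and obtain all group rank sums in one product (this sketch omits the tie correction; it is not the kruX code itself).

```python
import numpy as np
from scipy.stats import kruskal, rankdata

rng = np.random.default_rng(3)

# One marker with genotype groups 0/1/2 and several expression traits;
# the KW H statistic for every trait comes from one matrix multiplication.
n_samples, n_traits = 60, 5
genotype = rng.integers(0, 3, n_samples)
traits = rng.normal(0.0, 1.0, (n_traits, n_samples))

R = np.apply_along_axis(rankdata, 1, traits)   # within-trait ranks
G = np.eye(3)[genotype]                        # samples x groups indicator
counts = G.sum(axis=0)                         # group sizes n_g
rank_sums = R @ G                              # traits x groups rank sums

# H = 12 / (N(N+1)) * sum_g R_g^2 / n_g - 3(N+1), tie correction omitted
N = n_samples
H = 12.0 / (N * (N + 1)) * (rank_sums**2 / counts).sum(axis=1) - 3 * (N + 1)
```

With continuous (tie-free) data this reproduces scipy's per-trait `kruskal` statistic, while scaling to many traits at once.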
The chi-square test of independence.
McHugh, Mary L
2013-01-01
The Chi-square statistic is a non-parametric (distribution free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation of both dichotomous independent variables and of multiple group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others. The Chi-square is a significance statistic, and should be followed with a strength statistic. Cramer's V is the most common strength test used when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two group and multiple group studies. Limitations include its sample size requirements, difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and the tendency of Cramer's V to produce relatively low correlation measures, even for highly significant results.
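The test-then-strength sequence described above — a Chi-square test of independence followed by Cramer's V — can be sketched with scipy on an illustrative 2x3 contingency table (Cramer's V = sqrt(chi2 / (n * (min(rows, cols) - 1)))).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative 2 x 3 contingency table: group (rows) by nominal outcome (columns)
table = np.array([[30, 15, 5],
                  [10, 25, 15]])

# Chi-square test of independence (significance statistic)
chi2, p, dof, expected = chi2_contingency(table)

# Cramer's V as the follow-up strength statistic
n = table.sum()
k = min(table.shape) - 1
cramers_v = np.sqrt(chi2 / (n * k))
```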
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis", where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
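The product-limit (Kaplan-Meier) estimator at the heart of this framework is short enough to write out. This toy implementation handles right-censored data; as the abstract notes, left-censored concentrations can be accommodated by the standard flipping trick (negate the values so "below detection limit" becomes right-censoring). The data are illustrative.

```python
import numpy as np

def kaplan_meier(times, event):
    """Product-limit estimate of the survival function for right-censored
    data: at each distinct event time t, S is multiplied by
    (1 - deaths(t) / at_risk(t)). Censored observations (event=False)
    leave the risk set without contributing an event."""
    times = np.asarray(times, float)
    event = np.asarray(event, bool)
    order = np.argsort(times)
    times, event = times[order], event[order]
    surv, s = [], 1.0
    for t in np.unique(times[event]):
        at_risk = (times >= t).sum()
        deaths = ((times == t) & event).sum()
        s *= 1.0 - deaths / at_risk
        surv.append((t, s))
    return surv

# Six observations, two of them censored (event=False)
km = kaplan_meier([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1])
```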
Wang, Yuanjia; Chen, Huaihou
2012-01-01
Summary We examine a generalized F-test of a nonparametric function through penalized splines and a linear mixed effects model representation. With a mixed effects model representation of penalized splines, we imbed the test of an unspecified function into a test of some fixed effects and a variance component in a linear mixed effects model with nuisance variance components under the null. The procedure can be used to test a nonparametric function or varying-coefficient with clustered data, compare two spline functions, test the significance of an unspecified function in an additive model with multiple components, and test a row or a column effect in a two-way analysis of variance model. Through a spectral decomposition of the residual sum of squares, we provide a fast algorithm for computing the null distribution of the test, which significantly improves the computational efficiency over bootstrap. The spectral representation reveals a connection between the likelihood ratio test (LRT) in a multiple variance components model and a single component model. We examine our methods through simulations, where we show that the power of the generalized F-test may be higher than the LRT, depending on the hypothesis of interest and the true model under the alternative. We apply these methods to compute the genome-wide critical value and p-value of a genetic association test in a genome-wide association study (GWAS), where the usual bootstrap is computationally intensive (up to 10^8 simulations) and asymptotic approximation may be unreliable and conservative. PMID:23020801
On sample size of the kruskal-wallis test with application to a mouse peritoneal cavity study.
Fan, Chunpeng; Zhang, Donghui; Zhang, Cun-Hui
2011-03-01
As the nonparametric generalization of the one-way analysis of variance model, the Kruskal-Wallis test applies when the goal is to test the difference between multiple samples and the underlying population distributions are nonnormal or unknown. Although the Kruskal-Wallis test has been widely used for data analysis, power and sample size methods for this test have been investigated to a much lesser extent. This article proposes new power and sample size calculation methods for the Kruskal-Wallis test based on the pilot study in either a completely nonparametric model or a semiparametric location model. No assumption is made on the shape of the underlying population distributions. Simulation results show that, in terms of sample size calculation for the Kruskal-Wallis test, the proposed methods are more reliable and preferable to some more traditional methods. A mouse peritoneal cavity study is used to demonstrate the application of the methods. © 2010, The International Biometric Society.
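When closed-form power methods are unavailable, a simulation analogue of the power calculation discussed here is straightforward: repeatedly draw samples from pilot-study-like distributions and record how often the Kruskal-Wallis test rejects. This sketch uses shifted normals as a stand-in for the pilot distributions; it is not the article's proposed method.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(4)

def kw_power(n_per_group, shifts, n_sim=500, alpha=0.05, rng=rng):
    """Monte Carlo power of the Kruskal-Wallis test when each group follows
    the same base distribution (here standard normal) shifted by the given
    amounts; sample sizes and shifts would come from a pilot study."""
    rejections = 0
    for _ in range(n_sim):
        groups = [rng.normal(s, 1.0, n_per_group) for s in shifts]
        rejections += kruskal(*groups).pvalue < alpha
    return rejections / n_sim

power_null = kw_power(10, [0.0, 0.0, 0.0])   # should hover near alpha
power_alt = kw_power(10, [0.0, 0.8, 1.6])    # should be much larger
```

In practice one would loop this over candidate sample sizes and pick the smallest n achieving the target power.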
Nonparametric estimation and testing of fixed effects panel data models
Henderson, Daniel J.; Carroll, Raymond J.; Li, Qi
2009-01-01
In this paper we consider the problem of estimating nonparametric panel data models with fixed effects. We introduce an iterative nonparametric kernel estimator. We also extend the estimation method to the case of a semiparametric partially linear fixed effects model. To determine whether a parametric, semiparametric or nonparametric model is appropriate, we propose test statistics to test between the three alternatives in practice. We further propose a test statistic for testing the null hypothesis of random effects against fixed effects in a nonparametric panel data regression model. Simulations are used to examine the finite sample performance of the proposed estimators and the test statistics. PMID:19444335
Parametric, nonparametric and parametric modelling of a chaotic circuit time series
NASA Astrophysics Data System (ADS)
Timmer, J.; Rust, H.; Horbelt, W.; Voss, H. U.
2000-09-01
The determination of a differential equation underlying a measured time series is a frequently arising task in nonlinear time series analysis. In the validation of a proposed model one often faces the dilemma that it is hard to decide whether possible discrepancies between the time series and model output are caused by an inappropriate model, by bad estimates of parameters in a correct type of model, or both. We propose a combination of parametric modelling based on Bock's multiple shooting algorithm and nonparametric modelling based on optimal transformations as a strategy to test proposed models and, if rejected, to suggest and test new ones. We exemplify this strategy on an experimental time series from a chaotic circuit, where we obtain an extremely accurate reconstruction of the observed attractor.
Testing jumps via false discovery rate control.
Yen, Yu-Min
2013-01-01
Many recently developed nonparametric jump tests can be viewed as multiple hypothesis testing problems. For such multiple hypothesis tests, it is well known that controlling the type I error often produces a large proportion of erroneous rejections, and the situation becomes even worse when jump occurrence is a rare event. To obtain more reliable results, we aim to control the false discovery rate (FDR), an efficient compound error measure for erroneous rejections in multiple testing problems. We perform the test via the Barndorff-Nielsen and Shephard (BNS) test statistic, and control the FDR with the Benjamini and Hochberg (BH) procedure. We provide asymptotic results for the FDR control. Through simulations, we examine the relevant theoretical results and demonstrate the advantages of controlling the FDR. The hybrid approach is then applied to an empirical analysis of two benchmark stock indices with high frequency data.
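The Benjamini-Hochberg step-up procedure used here is only a few lines: sort the p-values, find the largest i with p_(i) <= i*q/m, and reject the i smallest. A minimal implementation on illustrative p-values (not the BNS jump-test output):

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure: reject the hypotheses with the
    k smallest p-values, where k is the largest i such that
    p_(i) <= i * q / m."""
    p = np.asarray(pvals, float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

# Two strong signals among eight tests
rej = benjamini_hochberg([0.001, 0.004, 0.20, 0.30, 0.50, 0.60, 0.70, 0.90])
```

Unlike a fixed per-test threshold, the step-up rule adapts: a p-value may be rejected even above q/m, provided enough smaller p-values precede it.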
Linkage mapping of beta 2 EEG waves via non-parametric regression.
Ghosh, Saurabh; Begleiter, Henri; Porjesz, Bernice; Chorlian, David B; Edenberg, Howard J; Foroud, Tatiana; Goate, Alison; Reich, Theodore
2003-04-01
Parametric linkage methods for analyzing quantitative trait loci are sensitive to violations in trait distributional assumptions. Non-parametric methods are relatively more robust. In this article, we modify the non-parametric regression procedure proposed by Ghosh and Majumder [2000: Am J Hum Genet 66:1046-1061] to map Beta 2 EEG waves using genome-wide data generated in the COGA project. Significant linkage findings are obtained on chromosomes 1, 4, 5, and 15 with findings at multiple regions on chromosomes 4 and 15. We analyze the data both with and without incorporating alcoholism as a covariate. We also test for epistatic interactions between regions of the genome exhibiting significant linkage with the EEG phenotypes and find evidence of epistatic interactions between a region each on chromosome 1 and chromosome 4 with one region on chromosome 15. While regressing out the effect of alcoholism does not affect the linkage findings, the epistatic interactions become statistically insignificant. Copyright 2003 Wiley-Liss, Inc.
Estimating scaled treatment effects with multiple outcomes.
Kennedy, Edward H; Kangovi, Shreya; Mitra, Nandita
2017-01-01
In classical study designs, the aim is often to learn about the effects of a treatment or intervention on a single outcome; in many modern studies, however, data on multiple outcomes are collected and it is of interest to explore effects on multiple outcomes simultaneously. Such designs can be particularly useful in patient-centered research, where different outcomes might be more or less important to different patients. In this paper, we propose scaled effect measures (via potential outcomes) that translate effects on multiple outcomes to a common scale, using mean-variance and median-interquartile range based standardizations. We present efficient, nonparametric, doubly robust methods for estimating these scaled effects (and weighted average summary measures), and for testing the null hypothesis that treatment affects all outcomes equally. We also discuss methods for exploring how treatment effects depend on covariates (i.e., effect modification). In addition to describing efficiency theory for our estimands and the asymptotic behavior of our estimators, we illustrate the methods in a simulation study and a data analysis. Importantly, and in contrast to much of the literature concerning effects on multiple outcomes, our methods are nonparametric and can be used not only in randomized trials to yield increased efficiency, but also in observational studies with high-dimensional covariates to reduce confounding bias.
Le Bihan, Nicolas; Margerin, Ludovic
2009-07-01
In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.
Further evidence for the increased power of LOD scores compared with nonparametric methods.
Durner, M; Vieland, V J; Greenberg, D A
1999-01-01
In genetic analysis of diseases in which the underlying model is unknown, "model-free" methods, such as affected sib pair (ASP) tests, are often preferred over LOD-score methods, although LOD-score methods under the correct, or even approximately correct, model are more powerful than ASP tests. However, there might be circumstances in which nonparametric methods will outperform LOD-score methods. Recently, Dizier et al. reported that, in some complex two-locus (2L) models, LOD-score methods with segregation-analysis-derived parameters had less power to detect linkage than ASP tests. We investigated whether these particular models in fact represent a situation in which ASP tests are more powerful than LOD scores. We simulated data according to the parameters specified by Dizier et al. and analyzed the data using (a) single-locus (SL) LOD-score analysis performed twice, under a simple dominant and a recessive mode of inheritance (MOI); (b) ASP methods; and (c) nonparametric linkage (NPL) analysis. We show that SL analysis performed twice, corrected for the type I error increase due to multiple testing, yields almost as much linkage information as an analysis under the correct 2L model and is more powerful than either the ASP method or the NPL method. We demonstrate that, even for complex genetic models, the most important condition for linkage analysis is that the assumed MOI at the disease locus being tested is approximately correct, not that the inheritance of the disease per se is correctly specified. In the analysis by Dizier et al., segregation analysis led to estimates of dominance parameters that were grossly misspecified for the locus tested in those models in which ASP tests appeared to be more powerful than LOD-score analyses.
An appraisal of statistical procedures used in derivation of reference intervals.
Ichihara, Kiyoshi; Boyd, James C
2010-11-01
When conducting studies to derive reference intervals (RIs), various statistical procedures are commonly applied at each step, from the planning stages to final computation of RIs. Determination of the necessary sample size is an important consideration, and evaluation of at least 400 individuals in each subgroup has been recommended to establish reliable common RIs in multicenter studies. Multiple regression analysis allows identification of the most important factors contributing to variation in test results, while accounting for possible confounding relationships among these factors. Of the various approaches proposed for judging the necessity of partitioning reference values, nested analysis of variance (ANOVA) is the likely method of choice owing to its ability to handle multiple groups and to adjust for multiple factors. Box-Cox power transformation often has been used to transform data to a Gaussian distribution for parametric computation of RIs. However, this transformation occasionally fails. Therefore, the non-parametric method, based on determination of the 2.5th and 97.5th percentiles after sorting the data, has been recommended for general use. The performance of the Box-Cox transformation can be improved by introducing an additional parameter representing the origin of transformation. In simulations, the confidence intervals (CIs) of reference limits (RLs) calculated by the parametric method were narrower than those calculated by the non-parametric approach. However, the margin of difference was rather small owing to additional variability in parametrically-determined RLs introduced by estimation of parameters for the Box-Cox transformation. The parametric calculation method may have an advantage over the non-parametric method in allowing identification and exclusion of extreme values during RI computation.
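The recommended nonparametric computation (sort the data, read off the 2.5th and 97.5th percentiles) can be sketched as below; the rank convention p*(n+1) with linear interpolation is one common choice, and the helper names are our own:

```python
def nonparametric_ri(values, lower=0.025, upper=0.975):
    """Nonparametric reference interval: interpolated percentiles of sorted data."""
    x = sorted(values)
    n = len(x)

    def percentile(p):
        r = p * (n + 1)          # target rank position, 1-indexed
        lo = int(r)
        if lo < 1:               # rank falls before the smallest observation
            return x[0]
        if lo >= n:              # rank falls beyond the largest observation
            return x[-1]
        frac = r - lo
        # Interpolate between the lo-th and (lo+1)-th order statistics.
        return x[lo - 1] + frac * (x[lo] - x[lo - 1])

    return percentile(lower), percentile(upper)
```

With the recommended minimum of 400 reference individuals, the 2.5th-percentile rank lands near the 10th order statistic, which is why smaller samples make nonparametric limits unstable.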
Bansal, Ravi; Peterson, Bradley S
2018-06-01
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control of false positive findings across these multiple hypothesis tests. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis.
Nonparametric methods, in contrast, estimated distributions from those large clusters, and therefore, by construction, rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but cannot be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates the performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine the performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
A Nonparametric Framework for Comparing Trends and Gaps across Tests
ERIC Educational Resources Information Center
Ho, Andrew Dean
2009-01-01
Problems of scale typically arise when comparing test score trends, gaps, and gap trends across different tests. To overcome some of these difficulties, test score distributions on the same score scale can be represented by nonparametric graphs or statistics that are invariant under monotone scale transformations. This article motivates and then…
A Note on the Assumption of Identical Distributions for Nonparametric Tests of Location
ERIC Educational Resources Information Center
Nordstokke, David W.; Colp, S. Mitchell
2018-01-01
Often, when testing for shift in location, researchers will utilize nonparametric statistical tests in place of their parametric counterparts when there is evidence or belief that the assumptions of the parametric test are not met (i.e., normally distributed dependent variables). An underlying and often unattended to assumption of nonparametric…
Nikita, Efthymia
2014-03-01
The current article explores whether the application of generalized linear models (GLM) and generalized estimating equations (GEE) can be used in place of conventional statistical analyses in the study of ordinal data that code an underlying continuous variable, like entheseal changes. The analysis of artificial data and of ordinal data expressing entheseal changes in archaeological North African populations gave the following results. Parametric and nonparametric tests give convergent results, particularly for P values <0.1, irrespective of whether the underlying variable is normally distributed or not, under the condition that the samples involved in the tests exhibit approximately equal sizes. If this prerequisite is valid and provided that the samples are of equal variances, analysis of covariance may be adopted. GLM are not subject to these constraints and give results that converge to those obtained from all nonparametric tests. Therefore, they can be used instead of traditional tests, as they provide the same information but with the added advantage of allowing the study of the simultaneous impact of multiple predictors and their interactions, as well as the modeling of the experimental data. However, GLM should be replaced by GEE for the study of bilateral asymmetry and in general when paired samples are tested, because GEE are appropriate for correlated data. Copyright © 2013 Wiley Periodicals, Inc.
Konietschke, Frank; Libiger, Ondrej; Hothorn, Ludwig A
2012-01-01
Statistical association between a single nucleotide polymorphism (SNP) genotype and a quantitative trait in genome-wide association studies is usually assessed using a linear regression model, or, in the case of non-normally distributed trait values, using the Kruskal-Wallis test. While linear regression models assume an additive mode of inheritance via equidistant genotype scores, the Kruskal-Wallis test merely tests for global differences in trait values associated with the three genotype groups. Both approaches thus exhibit suboptimal power when the underlying inheritance mode is dominant or recessive. Furthermore, these tests do not perform well in the common situations when only a few trait values are available in a rare genotype category (imbalance), or when the values associated with the three genotype categories exhibit unequal variance (variance heterogeneity). We propose a maximum test based on a Marcus-type multiple contrast test for relative effect sizes. This test allows model-specific testing of either a dominant, additive or recessive mode of inheritance, and it is robust against variance heterogeneity. We show how to obtain mode-specific simultaneous confidence intervals for the relative effect sizes to aid in interpreting the biological relevance of the results. Further, we discuss the use of a related all-pairwise comparisons contrast test with range-preserving confidence intervals as an alternative to the Kruskal-Wallis heterogeneity test. We applied the proposed maximum test to the Bogalusa Heart Study dataset, and gained a remarkable increase in the power to detect association, particularly for rare genotypes. Our simulation study also demonstrated that the proposed non-parametric tests control the family-wise error rate in the presence of non-normality and variance heterogeneity, contrary to the standard parametric approaches.
We provide a publicly available R library nparcomp that can be used to estimate simultaneous confidence intervals or compatible multiplicity-adjusted p-values associated with the proposed maximum test.
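The relative effect size such rank-based contrast tests are built on, p = P(X < Y) + 0.5 P(X = Y), can be estimated directly from two samples; the sketch below is a plain-Python illustration of that estimand, not the nparcomp implementation:

```python
def relative_effect(x, y):
    """Estimate the relative effect p = P(X < Y) + 0.5 * P(X = Y).

    p > 0.5 means values in y tend to be larger than values in x;
    p = 0.5 means no tendency either way.
    """
    wins = sum(1.0 for a in x for b in y if a < b)   # pairs where y exceeds x
    ties = sum(1.0 for a in x for b in y if a == b)  # tied pairs count half
    return (wins + 0.5 * ties) / (len(x) * len(y))
```

In the GWAS setting, x and y would be trait values from two genotype groups; a dominant, additive, or recessive contrast combines several such pairwise relative effects.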
Chaibub Neto, Elias
2015-01-01
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson’s sample correlation coefficient, and compare its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and number of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably faster for small sample sizes and considerably faster for moderate ones. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
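The multinomial-sampling formulation can be sketched in plain Python; the paper's implementation is vectorized R, so this scalar version only illustrates the weighting idea (each replication weights the observed data by multinomial counts instead of materializing a resampled copy), and the function name is our own:

```python
import random
from collections import Counter

def bootstrap_means(x, n_boot, seed=0):
    """Bootstrap replications of the sample mean computed from multinomial counts."""
    rng = random.Random(seed)
    n = len(x)
    reps = []
    for _ in range(n_boot):
        # counts ~ Multinomial(n; 1/n, ..., 1/n): how often each index is drawn.
        counts = Counter(rng.choices(range(n), k=n))
        # Weighted first moment: sum_i c_i * x_i / n equals the resample mean.
        reps.append(sum(c * x[i] for i, c in counts.items()) / n)
    return reps
```

The paper's speed-up comes from stacking all n_boot count vectors into one weight matrix and obtaining every replication with a single matrix product rather than this loop.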
Methods of analysis speech rate: a pilot study.
Costa, Luanna Maria Oliveira; Martins-Reis, Vanessa de Oliveira; Celeste, Letícia Côrrea
2016-01-01
To describe the performance of fluent adults in different measures of speech rate. The study included 24 fluent adults, of both genders, speakers of Brazilian Portuguese, who were born and still living in the metropolitan region of Belo Horizonte, state of Minas Gerais, aged between 18 and 59 years. Participants were grouped by age: G1 (18-29 years), G2 (30-39 years), G3 (40-49 years), and G4 (50-59 years). The speech samples were obtained following the methodology of the Speech Fluency Assessment Protocol. In addition to the measures of speech rate proposed by the protocol (speech rate in words and syllables per minute), the speech rate in phonemes per second and the articulation rate with and without disfluencies were calculated. We used the nonparametric Friedman test and the Wilcoxon test for multiple comparisons. Groups were compared using the nonparametric Kruskal-Wallis test. The significance level was 5%. There were significant differences between measures of speech rate involving syllables. The multiple comparisons showed that all three measures were different. There was no effect of age on the studied measures. These findings corroborate previous studies. The inclusion of temporal acoustic measures such as speech rate in phonemes per second and articulation rates with and without disfluencies can be a complementary approach in the evaluation of speech rate.
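A minimal sketch of the Friedman statistic used for such within-participant comparisons (plain Python, no tie correction; assumes each participant contributes one value per measure, and the function name is our own):

```python
def friedman_q(blocks):
    """Friedman chi-square statistic: blocks is a list of per-subject tuples,
    one value per condition (e.g. one speech-rate measure per column)."""
    n = len(blocks)          # subjects (blocks)
    k = len(blocks[0])       # conditions compared within each subject
    col_rank_sums = [0.0] * k
    for row in blocks:
        # Rank this subject's values across the k conditions (1 = smallest).
        order = sorted(range(k), key=lambda j: row[j])
        for r, j in enumerate(order, start=1):
            col_rank_sums[j] += r
    # Q = 12/(n k (k+1)) * sum_j R_j^2 - 3 n (k+1)
    s = sum(r * r for r in col_rank_sums)
    return 12.0 / (n * k * (k + 1)) * s - 3.0 * n * (k + 1)
```

A significant Q is then followed by pairwise Wilcoxon tests, as in the study, to locate which measures differ.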
A Nonparametric K-Sample Test for Equality of Slopes.
ERIC Educational Resources Information Center
Penfield, Douglas A.; Koffler, Stephen L.
1986-01-01
The development of a nonparametric K-sample test for equality of slopes using Puri's generalized L statistic is presented. The test is recommended when the assumptions underlying the parametric model are violated. This procedure replaces original data with either ranks (for data with heavy tails) or normal scores (for data with light tails).…
A New Nonparametric Levene Test for Equal Variances
ERIC Educational Resources Information Center
Nordstokke, David W.; Zumbo, Bruno D.
2010-01-01
Tests of the equality of variances are sometimes used on their own to compare variability across groups of experimental or non-experimental conditions but they are most often used alongside other methods to support assumptions made about variances. A new nonparametric test of equality of variances is described and compared to current "gold…
Conditional Covariance-Based Nonparametric Multidimensionality Assessment.
ERIC Educational Resources Information Center
Stout, William; And Others
1996-01-01
Three nonparametric procedures that use estimates of covariances of item-pair responses conditioned on examinee trait level for assessing dimensionality of a test are described. The HCA/CCPROX, DIMTEST, and DETECT are applied to a dimensionality study of the Law School Admission Test. (SLD)
ERIC Educational Resources Information Center
Bakir, Saad T.
2010-01-01
We propose a nonparametric (or distribution-free) procedure for testing the equality of several population variances (or scale parameters). The proposed test is a modification of Bakir's (1989, Commun. Statist., Simul-Comp., 18, 757-775) analysis of means by ranks (ANOMR) procedure for testing the equality of several population means. A proof is…
Sensor Compromise Detection in Multiple-Target Tracking Systems
Doucette, Emily A.; Curtis, Jess W.
2018-01-01
Tracking multiple targets using a single estimator is a problem that is commonly approached within a trusted framework. There are many weaknesses that an adversary can exploit if it gains control over the sensors. Because the number of targets that the estimator has to track is not known in advance, an adversary could cause a loss of information or a degradation in tracking precision. Other concerns include the introduction of false targets, which would result in a waste of computational and material resources, depending on the application. In this work, we study the problem of detecting compromised or faulty sensors in a multiple-target tracker, starting with the single-sensor case and then considering the multiple-sensor scenario. We propose an algorithm to detect a variety of attacks in the multiple-sensor case, via the application of finite set statistics (FISST), one-class classifiers and hypothesis testing using nonparametric techniques. PMID:29466314
Packham, B; Barnes, G; Dos Santos, G Sato; Aristovich, K; Gilad, O; Ghosh, A; Oh, T; Holder, D
2016-06-01
Electrical impedance tomography (EIT) allows for the reconstruction of internal conductivity from surface measurements. A change in conductivity occurs as ion channels open during neural activity, making EIT a potential tool for functional brain imaging. EIT images can have >10 000 voxels, which means statistical analysis of such images presents a substantial multiple testing problem. One way to optimally correct for these issues and still maintain the flexibility of complicated experimental designs is to use random field theory. This parametric method estimates the distribution of peaks one would expect by chance in a smooth random field of a given size. Random field theory has been used in several other neuroimaging techniques but never validated for EIT images of fast neural activity; such validation can be achieved using non-parametric techniques. Both parametric and non-parametric techniques were used to analyze a set of 22 images collected from 8 rats. Significant group activations were detected using both techniques (corrected p < 0.05). Both parametric and non-parametric analyses yielded similar results, although the latter was less conservative. These results demonstrate the first statistical analysis of such an image set and indicate that such an analysis is an approach for EIT images of neural activity.
A Bayesian Nonparametric Approach to Test Equating
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2009-01-01
A Bayesian nonparametric model is introduced for score equating. It is applicable to all major equating designs, and has advantages over previous equating models. Unlike the previous models, the Bayesian model accounts for positive dependence between distributions of scores from two tests. The Bayesian model and the previous equating models are…
A Simulation Comparison of Parametric and Nonparametric Dimensionality Detection Procedures
ERIC Educational Resources Information Center
Mroch, Andrew A.; Bolt, Daniel M.
2006-01-01
Recently, nonparametric methods have been proposed that provide a dimensionally based description of test structure for tests with dichotomous items. Because such methods are based on different notions of dimensionality than are assumed when using a psychometric model, it remains unclear whether these procedures might lead to a different…
TRAN-STAT: statistics for environmental transuranic studies, July 1978, Number 5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
This issue is concerned with nonparametric procedures for (1) estimating the central tendency of a population, (2) describing data sets through estimating percentiles, (3) estimating confidence limits for the median and other percentiles, (4) estimating tolerance limits and associated numbers of samples, and (5) tests of significance and associated procedures for a variety of testing situations (counterparts to t-tests and analysis of variance). Some characteristics of several nonparametric tests are illustrated using the NAEG ²⁴¹Am aliquot data presented and discussed in the April issue of TRAN-STAT. Some of the statistical terms used here are defined in a glossary. The reference list also includes short descriptions of nonparametric books. 31 references, 3 figures, 1 table.
ERIC Educational Resources Information Center
Kantabutra, Sangchan
2009-01-01
This paper examines urban-rural effects on public upper-secondary school efficiency in northern Thailand. In the study, efficiency was measured by a nonparametric technique, data envelopment analysis (DEA). Urban-rural effects were examined through a Mann-Whitney nonparametric statistical test. Results indicate that urban schools appear to have…
Hoover, D R; Peng, Y; Saah, A J; Detels, R R; Day, R S; Phair, J P
A simple non-parametric approach is developed to simultaneously estimate net incidence and morbidity time from specific AIDS illnesses in populations at high risk for death from these illnesses and other causes. The disease-death process has four stages that can be recast as two sandwiching three-state multiple decrement processes. Non-parametric estimation of net incidence and morbidity time with error bounds is achieved from these sandwiching models through modification of methods from Aalen and Greenwood, and through bootstrapping. An application to immunosuppressed HIV-1 infected homosexual men reveals that cytomegalovirus disease, Kaposi's sarcoma and Pneumocystis pneumonia are likely to occur and cause significant morbidity time.
cit: hypothesis testing software for mediation analysis in genomic applications.
Millstein, Joshua; Chen, Gary K; Breton, Carrie V
2016-08-01
The challenges of successfully applying causal inference methods include: (i) satisfying the underlying assumptions, (ii) limitations in the data and models accommodated by the software and (iii) the low power of common multiple testing approaches. The causal inference test (CIT) is based on hypothesis testing rather than estimation, allowing the testable assumptions to be evaluated in the determination of statistical significance. A user-friendly software package provides P-values and optionally permutation-based FDR estimates (q-values) for potential mediators. It can handle single and multiple binary and continuous instrumental variables, binary or continuous outcome variables and adjustment covariates. Also, the permutation-based FDR option provides a non-parametric implementation. Simulation studies demonstrate the validity of the cit package and show a substantial advantage of permutation-based FDR over other common multiple testing strategies. The cit open-source R package is freely available from the CRAN website (https://cran.r-project.org/web/packages/cit/index.html) with embedded C++ code that utilizes the GNU Scientific Library, also freely available (http://www.gnu.org/software/gsl/). joshua.millstein@usc.edu Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
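A permutation p-value of the general kind underlying such permutation-based FDR estimates can be computed exactly for tiny samples; the sketch below is our own illustration of an exhaustive two-group permutation test, not the cit API:

```python
from itertools import combinations

def permutation_pvalue(x, y):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = x + y
    n = len(pooled)
    observed = abs(sum(y) / len(y) - sum(x) / len(x))
    extreme = 0
    total = 0
    # Enumerate every way to relabel the pooled values into the two groups.
    for idx in combinations(range(n), len(x)):
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(n) if i not in idx]
        diff = abs(sum(gy) / len(gy) - sum(gx) / len(gx))
        if diff >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total
```

For realistic sample sizes the enumeration is replaced by random permutations, and repeating the test across many mediators yields the null statistics from which permutation-based q-values are estimated.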
ERIC Educational Resources Information Center
Sengul Avsar, Asiye; Tavsancil, Ezel
2017-01-01
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three samples sizes (100, 250 and 500)--were generated by conducting 20…
How to Compare Parametric and Nonparametric Person-Fit Statistics Using Real Data
ERIC Educational Resources Information Center
Sinharay, Sandip
2017-01-01
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Common Scientific and Statistical Errors in Obesity Research
George, Brandon J.; Beasley, T. Mark; Brown, Andrew W.; Dawson, John; Dimova, Rositsa; Divers, Jasmin; Goldsby, TaShauna U.; Heo, Moonseong; Kaiser, Kathryn A.; Keith, Scott; Kim, Mimi Y.; Li, Peng; Mehta, Tapan; Oakes, J. Michael; Skinner, Asheley; Stuart, Elizabeth; Allison, David B.
2015-01-01
We identify 10 common errors and problems in the statistical analysis, design, interpretation, and reporting of obesity research and discuss how they can be avoided. The 10 topics are: 1) misinterpretation of statistical significance, 2) inappropriate testing against baseline values, 3) excessive and undisclosed multiple testing and “p-value hacking,” 4) mishandling of clustering in cluster randomized trials, 5) misconceptions about nonparametric tests, 6) mishandling of missing data, 7) miscalculation of effect sizes, 8) ignoring regression to the mean, 9) ignoring confirmation bias, and 10) insufficient statistical reporting. We hope that discussion of these errors can improve the quality of obesity research by helping researchers to implement proper statistical practice and to know when to seek the help of a statistician. PMID:27028280
Comparison of parametric and bootstrap method in bioequivalence test.
Ahn, Byung-Jin; Yim, Dong-Seok
2009-10-01
The estimation of 90% parametric confidence intervals (CIs) of mean AUC and Cmax ratios in bioequivalence (BE) tests is based upon the assumption that formulation effects in log-transformed data are normally distributed. To compare the parametric CIs with those obtained from nonparametric methods, we performed repeated estimation on bootstrap-resampled datasets. The AUC and Cmax values from 3 archived datasets were used. BE tests on 1,000 resampled datasets from each archived dataset were performed using SAS (Enterprise Guide Ver. 3). Bootstrap nonparametric 90% CIs of formulation effects were then compared with the parametric 90% CIs of the original datasets. The 90% CIs of formulation effects estimated from the 3 archived datasets were slightly different from the nonparametric 90% CIs obtained from BE tests on resampled datasets. Histograms and density curves of formulation effects obtained from resampled datasets were similar to those of a normal distribution. However, in 2 of 3 resampled log(AUC) datasets, the estimates of formulation effects did not follow the Gaussian distribution. Bias-corrected and accelerated (BCa) CIs, one type of nonparametric CI of formulation effects, shifted outside the parametric 90% CIs of the archived datasets in these 2 non-normally distributed resampled log(AUC) datasets. Currently, the 80-125% rule based upon the parametric 90% CIs is widely accepted under the assumption of normally distributed formulation effects in log-transformed data. However, nonparametric CIs may be a better choice when data do not follow this assumption.
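The resampling idea behind the study can be illustrated with a percentile-bootstrap 90% CI for the formulation effect on the log scale, back-transformed to the ratio scale. This sketch assumes a simple parallel-group design for clarity; the actual BE analysis in the abstract models a crossover design, and the function name and interface are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)

def bootstrap_ci_ratio(log_test, log_ref, n_boot=1000, level=0.90):
    """Percentile-bootstrap CI for the geometric mean ratio (test/ref)
    from log-transformed AUC (or Cmax) values, parallel-group sketch.
    """
    log_test = np.asarray(log_test, float)
    log_ref = np.asarray(log_ref, float)
    diffs = np.empty(n_boot)
    for b in range(n_boot):
        t = rng.choice(log_test, size=log_test.size, replace=True)
        r = rng.choice(log_ref, size=log_ref.size, replace=True)
        diffs[b] = t.mean() - r.mean()      # formulation effect, log scale
    lo, hi = np.percentile(diffs, [100 * (1 - level) / 2,
                                   100 * (1 + level) / 2])
    return np.exp(lo), np.exp(hi)           # back-transform to ratio scale
```

Under the 80-125% rule, bioequivalence would require the resulting interval to lie within (0.80, 1.25).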
ERIC Educational Resources Information Center
Cui, Zhongmin; Kolen, Michael J.
2008-01-01
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams…
Cox regression analysis with missing covariates via nonparametric multiple imputation.
Hsu, Chiu-Hsieh; Yu, Mandi
2018-01-01
We consider the situation of estimating Cox regression in which some covariates are subject to missingness, and there exists additional information (including observed event time, censoring indicator and fully observed covariates) that may be predictive of the missing covariates. We propose to use two working regression models: one for predicting the missing covariates and the other for predicting the missingness probabilities. For each missing covariate observation, these two working models are used to define a nearest-neighbor imputing set. This set is then used to nonparametrically impute covariate values for the missing observation. Upon completion of imputation, Cox regression is performed on the multiply imputed datasets to estimate the regression coefficients. In a simulation study, we compare the nonparametric multiple imputation approach with the augmented inverse probability weighted (AIPW) method, which directly incorporates the two working models into estimation of Cox regression, and the predictive mean matching (PMM) imputation method. We show that all approaches can reduce bias due to a non-ignorable missingness mechanism. The proposed nonparametric imputation method is robust to mis-specification of either one of the two working models and to mis-specification of the link function of the two working models. In contrast, the PMM method is sensitive to misspecification of the covariates included in imputation. The AIPW method is sensitive to the selection probability. We apply the approaches to a breast cancer dataset from the Surveillance, Epidemiology and End Results (SEER) Program.
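The nearest-neighbor imputing set described above can be sketched as follows. This is an illustrative reduction, not the authors' implementation: the two working-model scores are assumed pre-computed, and the function name, interface, and equal weighting of the two scores are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def nn_impute(score1, score2, x, missing, k=5, n_imputations=10):
    """Nearest-neighbor multiple imputation sketch.

    score1, score2 : working-model scores per subject (e.g. predicted
                     covariate value and predicted missingness probability)
    x              : covariate vector with missing entries
    missing        : boolean mask of missing entries
    For each missing entry, the k complete cases closest in the two
    standardized scores form the imputing set; one donor value is drawn
    at random per imputed data set.
    """
    scores = np.column_stack([score1, score2])
    scores = (scores - scores.mean(0)) / scores.std(0)   # equal weight per score
    complete = np.where(~missing)[0]
    datasets = []
    for _ in range(n_imputations):
        xi = x.copy()
        for i in np.where(missing)[0]:
            d = np.linalg.norm(scores[complete] - scores[i], axis=1)
            donors = complete[np.argsort(d)[:k]]
            xi[i] = x[donors[rng.integers(len(donors))]]
        datasets.append(xi)
    return datasets
```

In the full method, Cox regression would then be fit to each imputed data set and the coefficient estimates combined across imputations.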
Lucyshyn, Joseph M; Fossett, Brenda; Bakeman, Roger; Cheremshynski, Christy; Miller, Lynn; Lohrmann, Sharon; Binnendyk, Lauren; Khan, Sophia; Chinn, Stephen; Kwon, Samantha; Irvin, Larry K
2015-12-01
The efficacy and consequential validity of an ecological approach to behavioral intervention with families of children with developmental disabilities were examined. The approach aimed to transform coercive into constructive parent-child interaction in family routines. Ten families participated, including 10 mothers and fathers and 10 children 3-8 years old with developmental disabilities. Thirty-six family routines were selected (2 to 4 per family). Dependent measures included child problem behavior, routine steps completed, and coercive and constructive parent-child interaction. For each family, a single-case, multiple-baseline design was employed with three phases: baseline, intervention, and follow-up. Visual analysis evaluated the functional relation between intervention and improvements in child behavior and routine participation. Nonparametric tests across families evaluated the statistical significance of these improvements. Sequential analyses within families and univariate analyses across families examined changes from baseline to intervention in the percentage and odds ratio of coercive and constructive parent-child interaction. Multiple-baseline results documented functional or basic effects for 8 of 10 families. Nonparametric tests showed these changes to be significant. Follow-up showed durability at 11 to 24 months postintervention. Sequential analyses documented the transformation of coercive into constructive processes for 9 of 10 families. Univariate analyses across families showed significant improvements in 2- and 4-step coercive and constructive processes but not in odds ratio. Results offer evidence of the efficacy of the approach and the consequential validity of the ecological unit of analysis, parent-child interaction in family routines. Future studies should improve efficiency and outcomes for families experiencing family systems challenges.
Learning Circulant Sensing Kernels
2014-03-01
Furthermore, we test learning the circulant sensing matrix/operator and the nonparametric dictionary altogether and obtain even better performance. …matrices, Tropp et al. [28] describe a random filter for acquiring a signal x̄; Haupt et al. [12] describe a channel estimation problem to identify a
NASA Astrophysics Data System (ADS)
Bugała, Artur; Bednarek, Karol; Kasprzyk, Leszek; Tomczewski, Andrzej
2017-10-01
The paper presents the most representative characteristics, drawn from a three-year measurement period, of daily and monthly electricity production from photovoltaic conversion, using modules installed in fixed and 2-axis tracking constructions. Results are presented for selected summer, autumn, spring and winter days. The analyzed measuring stand is located on the roof of the Faculty of Electrical Engineering building at Poznan University of Technology. The basic parameters of statistical analysis, such as mean value, standard deviation, skewness, kurtosis, median, range, and coefficient of variation, were used. It was found that the asymmetry factor can be useful in the analysis of daily electricity production from photovoltaic conversion. In order to determine the repeatability of monthly electricity production, occurring between the summer months and between the summer and winter months, the non-parametric Mann-Whitney U test was used. In order to analyze the repeatability of daily peak hours, describing the largest hourly electricity production, the non-parametric Kruskal-Wallis test was applied as an extension of the Mann-Whitney U test. Based on the analysis of the electric energy distribution from the prepared monitoring system, it was found that traditional methods of forecasting electricity production from photovoltaic conversion, such as multiple regression models, should not be the preferred methods of analysis.
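The Kruskal-Wallis test applied above compares k independent samples via the statistic H = 12/(N(N+1)) · Σ R_j²/n_j − 3(N+1), where R_j is the rank sum of group j. A minimal numpy sketch (my own implementation, without the tie correction factor used by full statistical packages):

```python
import numpy as np

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for k independent samples
    (no tie-correction factor applied)."""
    data = np.concatenate([np.asarray(g, float) for g in groups])
    n = data.size
    order = np.argsort(data, kind="mergesort")
    ranks = np.empty(n)
    ranks[order] = np.arange(1, n + 1)          # 1-based ranks
    for v in np.unique(data):                   # ties share the average rank
        mask = data == v
        if mask.sum() > 1:
            ranks[mask] = ranks[mask].mean()
    total, start = 0.0, 0
    for g in groups:
        r = ranks[start:start + len(g)]
        total += r.sum() ** 2 / len(g)          # R_j^2 / n_j
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)
```

Under the null, H is approximately chi-squared with k−1 degrees of freedom, which is how the monthly production groups would be compared for significance.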
Estimating survival of radio-tagged birds
Bunck, C.M.; Pollock, K.H.; Lebreton, J.-D.; North, P.M.
1993-01-01
Parametric and nonparametric methods for estimating survival of radio-tagged birds are described. The general assumptions of these methods are reviewed. An estimate based on the assumption of constant survival throughout the period is emphasized in the overview of parametric methods. Two nonparametric methods, the Kaplan-Meier estimate of the survival function and the log-rank test, are explained in detail. The link between these nonparametric methods and traditional capture-recapture models is discussed along with considerations in designing studies that use telemetry techniques to estimate survival.
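The Kaplan-Meier product-limit estimator discussed above multiplies, at each distinct event time, the conditional probability of surviving that time given the number still at risk. A compact sketch (interface and variable names are my own; censored animals, e.g. radio failures, simply leave the risk set):

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate of the survival function.

    times  : observation time for each animal
    events : 1 if death observed, 0 if censored
    Returns (distinct event times, survival estimate just after each).
    """
    times = np.asarray(times, float)
    events = np.asarray(events, int)
    out_t, out_s = [], []
    s = 1.0
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)                     # n_i
        deaths = np.sum((times == t) & (events == 1))    # d_i
        s *= 1.0 - deaths / at_risk                      # S(t) = prod(1 - d_i/n_i)
        out_t.append(t)
        out_s.append(s)
    return np.array(out_t), np.array(out_s)
```
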
Extending the Distributed Lag Model framework to handle chemical mixtures.
Bello, Ghalib A; Arora, Manish; Austin, Christine; Horton, Megan K; Wright, Robert O; Gennings, Chris
2017-07-01
Distributed Lag Models (DLMs) are used in environmental health studies to analyze the time-delayed effect of an exposure on an outcome of interest. Given the increasing need for analytical tools for evaluation of the effects of exposure to multi-pollutant mixtures, this study attempts to extend the classical DLM framework to accommodate and evaluate multiple longitudinally observed exposures. We introduce 2 techniques for quantifying the time-varying mixture effect of multiple exposures on an outcome of interest. Lagged WQS, the first technique, is based on Weighted Quantile Sum (WQS) regression, a penalized regression method that estimates mixture effects using a weighted index. We also introduce Tree-based DLMs, a nonparametric alternative for assessment of lagged mixture effects. This technique is based on the Random Forest (RF) algorithm, a nonparametric, tree-based estimation technique that has shown excellent performance in a wide variety of domains. In a simulation study, we tested the feasibility of these techniques and evaluated their performance in comparison to standard methodology. Both methods exhibited relatively robust performance, accurately capturing pre-defined non-linear functional relationships in different simulation settings. Further, we applied these techniques to data on perinatal exposure to environmental metal toxicants, with the goal of evaluating the effects of exposure on neurodevelopment. Our methods identified critical neurodevelopmental windows showing significant sensitivity to metal mixtures.
Nonparametric tests for interaction and group differences in a two-way layout.
Fisher, A C; Wallenstein, S
1991-01-01
Nonparametric tests of group differences and interaction across strata are developed in which the null hypotheses for these tests are expressed as functions of rho_i = P(X > Y) + (1/2)P(X = Y), where X refers to a random observation from one group and Y refers to a random observation from the other group within stratum i. The estimator r of the parameter rho is shown to be a useful way to summarize and examine ordinal and continuous data.
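The parameter rho_i above has a direct plug-in estimator: the fraction of (X, Y) pairs with X > Y plus half the fraction of ties, which equals the Mann-Whitney U statistic divided by the number of pairs. A one-function sketch:

```python
import numpy as np

def rho_hat(x, y):
    """Estimate rho = P(X > Y) + 0.5 * P(X = Y) from two samples.
    Equivalent to the Mann-Whitney U statistic divided by len(x) * len(y).
    """
    x = np.asarray(x)[:, None]      # shape (m, 1)
    y = np.asarray(y)[None, :]      # shape (1, n) -> all m*n pairs by broadcasting
    return (x > y).mean() + 0.5 * (x == y).mean()
```

A value near 0.5 indicates no stochastic ordering between the groups; values near 0 or 1 indicate one group tends to dominate the other.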
The Probability of Exceedance as a Nonparametric Person-Fit Statistic for Tests of Moderate Length
ERIC Educational Resources Information Center
Tendeiro, Jorge N.; Meijer, Rob R.
2013-01-01
To classify an item score pattern as not fitting a nonparametric item response theory (NIRT) model, the probability of exceedance (PE) of an observed response vector x can be determined as the sum of the probabilities of all response vectors that are, at most, as likely as x, conditional on the test's total score. Vector x is to be considered…
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses like linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), hoping to contribute to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and has general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved. The reader is strongly encouraged to check the SOCR site for the most updated information and newly added models.
Reliability of Test Scores in Nonparametric Item Response Theory.
ERIC Educational Resources Information Center
Sijtsma, Klaas; Molenaar, Ivo W.
1987-01-01
Three methods for estimating reliability are studied within the context of nonparametric item response theory. Two were proposed originally by Mokken and a third is developed in this paper. Using a Monte Carlo strategy, these three estimation methods are compared with four "classical" lower bounds to reliability. (Author/JAZ)
ERIC Educational Resources Information Center
Si, Yajuan; Reiter, Jerome P.
2013-01-01
In many surveys, the data comprise a large number of categorical variables that suffer from item nonresponse. Standard methods for multiple imputation, like log-linear models or sequential regression imputation, can fail to capture complex dependencies and can be difficult to implement effectively in high dimensions. We present a fully Bayesian,…
Asymptotics of nonparametric L-1 regression models with dependent data
ZHAO, ZHIBIAO; WEI, YING; LIN, DENNIS K.J.
2013-01-01
We investigate asymptotic properties of least-absolute-deviation or median quantile estimates of the location and scale functions in nonparametric regression models with dependent data from multiple subjects. Under a general dependence structure that allows for longitudinal data and some spatially correlated data, we establish uniform Bahadur representations for the proposed median quantile estimates. The obtained Bahadur representations provide deep insights into the asymptotic behavior of the estimates. Our main theoretical development is based on studying the modulus of continuity of kernel weighted empirical process through a coupling argument. Progesterone data is used for an illustration. PMID:24955016
A Statistical Approach for Testing Cross-Phenotype Effects of Rare Variants
Broadaway, K. Alaine; Cutler, David J.; Duncan, Richard; Moore, Jacob L.; Ware, Erin B.; Jhun, Min A.; Bielak, Lawrence F.; Zhao, Wei; Smith, Jennifer A.; Peyser, Patricia A.; Kardia, Sharon L.R.; Ghosh, Debashis; Epstein, Michael P.
2016-01-01
Increasing empirical evidence suggests that many genetic variants influence multiple distinct phenotypes. When cross-phenotype effects exist, multivariate association methods that consider pleiotropy are often more powerful than univariate methods that model each phenotype separately. Although several statistical approaches exist for testing cross-phenotype effects for common variants, there is a lack of similar tests for gene-based analysis of rare variants. In order to fill this important gap, we introduce a statistical method for cross-phenotype analysis of rare variants using a nonparametric distance-covariance approach that compares similarity in multivariate phenotypes to similarity in rare-variant genotypes across a gene. The approach can accommodate both binary and continuous phenotypes and further can adjust for covariates. Our approach yields a closed-form test whose significance can be evaluated analytically, thereby improving computational efficiency and permitting application on a genome-wide scale. We use simulated data to demonstrate that our method, which we refer to as the Gene Association with Multiple Traits (GAMuT) test, provides increased power over competing approaches. We also illustrate our approach using exome-chip data from the Genetic Epidemiology Network of Arteriopathy. PMID:26942286
A Bayesian Beta-Mixture Model for Nonparametric IRT (BBM-IRT)
ERIC Educational Resources Information Center
Arenson, Ethan A.; Karabatsos, George
2017-01-01
Item response models typically assume that the item characteristic (step) curves follow a logistic or normal cumulative distribution function, which are strictly monotone functions of person test ability. Such assumptions can be overly-restrictive for real item response data. We propose a simple and more flexible Bayesian nonparametric IRT model…
Nonparametric Item Response Curve Estimation with Correction for Measurement Error
ERIC Educational Resources Information Center
Guo, Hongwen; Sinharay, Sandip
2011-01-01
Nonparametric or kernel regression estimation of item response curves (IRCs) is often used in item analysis in testing programs. These estimates are biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. Accuracy of this estimation is a concern theoretically and operationally.…
A Unifying Framework for Teaching Nonparametric Statistical Tests
ERIC Educational Resources Information Center
Bargagliotti, Anna E.; Orrison, Michael E.
2014-01-01
Increased importance is being placed on statistics at both the K-12 and undergraduate level. Research divulging effective methods to teach specific statistical concepts is still widely sought after. In this paper, we focus on best practices for teaching topics in nonparametric statistics at the undergraduate level. To motivate the work, we…
Teaching Nonparametric Statistics Using Student Instrumental Values.
ERIC Educational Resources Information Center
Anderson, Jonathan W.; Diddams, Margaret
Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with that of parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
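The core idea of a rank-based scan can be illustrated in one dimension: for each candidate window, compare values inside against values outside with a standardized rank-sum statistic, and keep the window maximizing its magnitude. This is a heavily simplified stand-in (my own code, no ties handled, no Monte Carlo significance testing of the maximum, and no true spatial windows, all of which the actual method requires):

```python
import numpy as np

def ranksum_z(inside, outside):
    """Standardized Wilcoxon rank-sum statistic (distinct values assumed)."""
    n1, n2 = len(inside), len(outside)
    both = np.concatenate([inside, outside])
    ranks = both.argsort().argsort() + 1.0       # 1-based ranks
    w = ranks[:n1].sum()                         # rank sum of the window
    mean = n1 * (n1 + n2 + 1) / 2.0
    var = n1 * n2 * (n1 + n2 + 1) / 12.0
    return (w - mean) / np.sqrt(var)

def scan_max_window(values, min_size=2, max_size=None):
    """Return (z, (start, stop)) for the contiguous window with largest |z|."""
    values = np.asarray(values, float)
    n = len(values)
    max_size = max_size or n - 1
    best = (0.0, None)
    for size in range(min_size, max_size + 1):
        for start in range(0, n - size + 1):
            idx = np.arange(start, start + size)
            z = ranksum_z(values[idx], np.delete(values, idx))
            if abs(z) > abs(best[0]):
                best = (z, (start, start + size))
    return best
```

In the real scan statistic, the observed maximum would be compared against maxima from permuted data to obtain a cluster p-value.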
Nonparametric evaluation of birth cohort trends in disease rates.
Tarone, R E; Chu, K C
2000-01-01
Although interpretation of age-period-cohort analyses is complicated by the non-identifiability of maximum likelihood estimates, changes in the slope of the birth-cohort effect curve are identifiable and have potential aetiologic significance. A nonparametric test for a change in the slope of the birth-cohort trend has been developed. The test is a generalisation of the sign test and is based on permutational distributions. A method for identifying interactions between age and calendar-period effects is also presented. The nonparametric method is shown to be powerful in detecting changes in the slope of the birth-cohort trend, although its power can be reduced considerably by calendar-period patterns of risk. The method identifies a previously unidentified decrease in the birth-cohort risk of lung-cancer mortality from 1912 to 1919, which appears to reflect a reduction in the initiation of smoking by young men at the beginning of the Great Depression (1930s). The method also detects an interaction between age and calendar period in leukemia mortality rates, reflecting the better response of children to chemotherapy. The proposed nonparametric method provides a data analytic approach, which is a useful adjunct to log-linear Poisson analysis of age-period-cohort models, either in the initial model building stage, or in the final interpretation stage.
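The test above is described as a generalisation of the sign test based on permutational distributions. Its elementary building block, the exact two-sided sign test, can be sketched as follows (this shows only the underlying idea, not the authors' permutational generalisation; applied here it would test whether changes between adjacent birth cohorts are balanced in sign):

```python
import math

def sign_test_p(diffs):
    """Exact two-sided sign test on the signs of `diffs`.
    Zeros are dropped, as in the classical sign test.
    """
    pos = sum(d > 0 for d in diffs)
    neg = sum(d < 0 for d in diffs)
    n, k = pos + neg, min(pos, neg)
    # binomial tail probability at p = 1/2, doubled for two sides
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```
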
A Nonparametric, Multiple Imputation-Based Method for the Retrospective Integration of Data Sets.
Carrig, Madeline M; Manrique-Vallier, Daniel; Ranby, Krista W; Reiter, Jerome P; Hoyle, Rick H
2015-01-01
Complex research questions often cannot be addressed adequately with a single data set. One sensible alternative to the high cost and effort associated with the creation of large new data sets is to combine existing data sets containing variables related to the constructs of interest. The goal of the present research was to develop a flexible, broadly applicable approach to the integration of disparate data sets that is based on nonparametric multiple imputation and the collection of data from a convenient, de novo calibration sample. We demonstrate proof of concept for the approach by integrating three existing data sets containing items related to the extent of problematic alcohol use and associations with deviant peers. We discuss both necessary conditions for the approach to work well and potential strengths and weaknesses of the method compared to other data set integration approaches.
Astigmatism and early academic readiness in preschool children.
Orlansky, Gale; Wilmer, Jeremy; Taub, Marc B; Rutner, Daniella; Ciner, Elise; Gryczynski, Jan
2015-03-01
This study investigated the relationship between uncorrected astigmatism and early academic readiness in at-risk preschool-aged children. A vision screening and academic records review were performed on 122 three- to five-year-old children enrolled in the Philadelphia Head Start program. Vision screening results were related to two measures of early academic readiness, the teacher-reported Work Sampling System (WSS) and the parent-reported Ages and Stages Questionnaire (ASQ). Both measures assess multiple developmental and skill domains thought to be related to academic readiness. Children with astigmatism (defined as >|-0.25| in either eye) were compared with children who had no astigmatism. Associations between astigmatism and specific subscales of the WSS and ASQ were examined using parametric and nonparametric bivariate statistics and regression analyses controlling for age and spherical refractive error. Presence of astigmatism was negatively associated with multiple domains of academic readiness. Children with astigmatism had significantly lower mean scores on Personal and Social Development, Language and Literacy, and Physical Development domains of the WSS, and on Personal/Social, Communication, and Fine Motor domains of the ASQ. These differences between children with astigmatism and children with no astigmatism persisted after statistically adjusting for age and magnitude of spherical refractive error. Nonparametric tests corroborated these findings for the Language and Literacy and Physical Health and Development domains of the WSS and the Communication domain of the ASQ. The presence of astigmatism detected in a screening setting was associated with a pattern of reduced academic readiness in multiple developmental and educational domains among at-risk preschool-aged children. 
This study may help to establish the role of early vision screenings, comprehensive vision examinations, and the need for refractive correction to improve academic success in preschool children.
Zhao, Zhibiao
2011-06-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise.
Ozarda, Yesim; Ichihara, Kiyoshi; Aslan, Diler; Aybek, Hulya; Ari, Zeki; Taneli, Fatma; Coker, Canan; Akan, Pinar; Sisman, Ali Riza; Bahceci, Onur; Sezgin, Nurzen; Demir, Meltem; Yucel, Gultekin; Akbas, Halide; Ozdem, Sebahat; Polat, Gurbuz; Erbagci, Ayse Binnur; Orkmez, Mustafa; Mete, Nuriye; Evliyaoglu, Osman; Kiyici, Aysel; Vatansev, Husamettin; Ozturk, Bahadir; Yucel, Dogan; Kayaalp, Damla; Dogan, Kubra; Pinar, Asli; Gurbilek, Mehmet; Cetinkaya, Cigdem Damla; Akin, Okhan; Serdar, Muhittin; Kurt, Ismail; Erdinc, Selda; Kadicesme, Ozgur; Ilhan, Necip; Atali, Dilek Sadak; Bakan, Ebubekir; Polat, Harun; Noyan, Tevfik; Can, Murat; Bedir, Abdulkerim; Okuyucu, Ali; Deger, Orhan; Agac, Suret; Ademoglu, Evin; Kaya, Ayşem; Nogay, Turkan; Eren, Nezaket; Dirican, Melahat; Tuncer, GulOzlem; Aykus, Mehmet; Gunes, Yeliz; Ozmen, Sevda Unalli; Kawano, Reo; Tezcan, Sehavet; Demirpence, Ozlem; Degirmen, Elif
2014-12-01
A nationwide multicenter study was organized to establish reference intervals (RIs) in the Turkish population for 25 commonly tested biochemical analytes and to explore sources of variation in reference values, including regionality. Blood samples were collected nationwide in 28 laboratories from the seven regions (≥400 samples/region, 3066 in all). The sera were collectively analyzed in Uludag University in Bursa using Abbott reagents and analyzer. Reference materials were used for standardization of test results. After secondary exclusion using the latent abnormal values exclusion method, RIs were derived by a parametric method employing the modified Box-Cox formula and compared with the RIs by the non-parametric method. Three-level nested ANOVA was used to evaluate variations among sexes, ages and regions. Associations between test results and age, body mass index (BMI) and region were determined by multiple regression analysis (MRA). By ANOVA, differences of reference values among seven regions were significant in none of the 25 analytes. Significant sex-related and age-related differences were observed for 10 and seven analytes, respectively. MRA revealed BMI-related changes in results for uric acid, glucose, triglycerides, high-density lipoprotein (HDL)-cholesterol, alanine aminotransferase, and γ-glutamyltransferase. Their RIs were thus derived by applying stricter criteria excluding individuals with BMI >28 kg/m2. Ranges of RIs by non-parametric method were wider than those by parametric method especially for those analytes affected by BMI. With the lack of regional differences and the well-standardized status of test results, the RIs derived from this nationwide study can be used for the entire Turkish population.
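The nonparametric reference intervals compared above are, by convention, the central 95% of the reference distribution, i.e. the 2.5th and 97.5th sample percentiles. A minimal sketch (the study's parametric RIs instead used a modified Box-Cox transform; this shows only the nonparametric benchmark, with a hypothetical function name):

```python
import numpy as np

def nonparametric_ri(values, lower=2.5, upper=97.5):
    """Nonparametric reference interval: central 95% of the reference
    sample, taken directly as percentiles (no distributional assumption)."""
    values = np.asarray(values, float)
    return np.percentile(values, [lower, upper])
```

Because extreme percentiles are estimated directly from the tails, nonparametric RIs are typically wider and noisier than parametric ones for skewed analytes, consistent with the study's observation.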
POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES.
Peña, Edsel A; Habiger, Joshua D; Wu, Wensong
2011-02-01
Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied. The improvement over existing procedures such as the Šidák procedure for FWER control and the Benjamini-Hochberg (BH) procedure for FDR control is achieved by exploiting possible differences in the powers of the individual tests. Results signal the need to take into account the powers of the individual tests and to have multiple hypotheses decision functions which are not limited to simply using the individual p-values, as is the case, for example, with the Šidák, Bonferroni, or BH procedures. They also enhance understanding of the role of the powers of individual tests, or more precisely the receiver operating characteristic (ROC) functions of decision processes, in the search for better multiple hypotheses testing procedures. A decision-theoretic framework is utilized, and through auxiliary randomizers the procedures could be used with discrete or mixed-type data or with rank-based nonparametric tests. This is in contrast to existing p-value based procedures whose theoretical validity is contingent on each of these p-value statistics being stochastically equal to or greater than a standard uniform variable under the null hypothesis. Proposed procedures are relevant in the analysis of high-dimensional "large M, small n" data sets arising in the natural, physical, medical, economic and social sciences, whose generation and creation is accelerated by advances in high-throughput technology, notably, but not limited to, microarray technology.
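For context, the two standard p-value-based baselines named above can be sketched in their textbook forms (these are the procedures the paper improves upon, not its power-enhanced decision functions):

```python
import numpy as np

def sidak_reject(pvals, alpha=0.05):
    """Sidak single-step procedure for FWER control:
    reject H_i if p_i <= 1 - (1 - alpha)**(1/m)."""
    m = len(pvals)
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    return np.asarray(pvals) <= threshold

def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure for FDR control."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order]
    # Find the largest k with p_(k) <= (k/m)*alpha; reject the k smallest.
    below = ranked <= (np.arange(1, m + 1) / m) * alpha
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        reject[order[: k + 1]] = True
    return reject

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.3, 0.9]
print(f"Sidak rejections: {sidak_reject(pvals).sum()}, "
      f"BH rejections: {bh_reject(pvals).sum()}")
# prints: Sidak rejections: 1, BH rejections: 2
```

Both procedures use only the individual p-values, which is exactly the limitation the paper's decision-theoretic framework addresses by incorporating the powers (ROC functions) of the individual tests.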
NASA Astrophysics Data System (ADS)
Romero, C.; McWilliam, M.; Macías-Pérez, J.-F.; Adam, R.; Ade, P.; André, P.; Aussel, H.; Beelen, A.; Benoît, A.; Bideaud, A.; Billot, N.; Bourrion, O.; Calvo, M.; Catalano, A.; Coiffard, G.; Comis, B.; de Petris, M.; Désert, F.-X.; Doyle, S.; Goupy, J.; Kramer, C.; Lagache, G.; Leclercq, S.; Lestrade, J.-F.; Mauskopf, P.; Mayet, F.; Monfardini, A.; Pascale, E.; Perotto, L.; Pisano, G.; Ponthieu, N.; Revéret, V.; Ritacco, A.; Roussel, H.; Ruppin, F.; Schuster, K.; Sievers, A.; Triqueneaux, S.; Tucker, C.; Zylka, R.
2018-04-01
Context. In the past decade, sensitive, resolved Sunyaev-Zel'dovich (SZ) studies of galaxy clusters have become common. Whereas many previous SZ studies have parameterized the pressure profiles of galaxy clusters, non-parametric reconstructions will provide insights into the thermodynamic state of the intracluster medium. Aims. We seek to recover the non-parametric pressure profiles of the high redshift (z = 0.89) galaxy cluster CLJ 1226.9+3332 as inferred from SZ data from the MUSTANG, NIKA, Bolocam, and Planck instruments, which all probe different angular scales. Methods: Our non-parametric algorithm makes use of logarithmic interpolation, which under the assumption of ellipsoidal symmetry is analytically integrable. For MUSTANG, NIKA, and Bolocam we derive a non-parametric pressure profile independently and find good agreement among the instruments. In particular, we find that the non-parametric profiles are consistent with a fitted generalized Navarro-Frenk-White (gNFW) profile. Given the ability of Planck to constrain the total signal, we include a prior on the integrated Compton Y parameter as determined by Planck. Results: For a given instrument, constraints on the pressure profile diminish rapidly beyond the field of view. The overlap in spatial scales probed by these four datasets is therefore critical in checking for consistency between instruments. By using multiple instruments, our analysis of CLJ 1226.9+3332 covers a large radial range, from the central regions to the cluster outskirts: 0.05 R500 < r < 1.1 R500. This is a wider range of spatial scales than is typically recovered by SZ instruments. Similar analyses will be possible with the new generation of SZ instruments such as NIKA2 and MUSTANG2.
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES
He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.
2017-01-01
Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
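The complete-case univariate building block, the Kruskal-Wallis test, can be run with SciPy on simulated non-normal groups (illustrative data only; the paper's multivariate, missing-data extension of MKW is not implemented here):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Three illustrative groups from non-normal (exponential) distributions,
# with a location shift in the third group.
g1 = rng.exponential(1.0, size=30)
g2 = rng.exponential(1.0, size=30)
g3 = rng.exponential(1.0, size=30) + 1.0

# Kruskal-Wallis rank-based "analysis of variance"; requires complete cases.
stat, p = stats.kruskal(g1, g2, g3)
print(f"H = {stat:.2f}, p = {p:.4f}")
```

`stats.kruskal` silently assumes every subject contributes a value; for multivariate responses with missing components, cases would have to be dropped entirely, which is the inefficiency the proposed extension avoids.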
Robust non-parametric one-sample tests for the analysis of recurrent events.
Rebora, Paola; Galimberti, Stefania; Valsecchi, Maria Grazia
2010-12-30
One-sample non-parametric tests are proposed here for inference on recurring events. The focus is on the marginal mean function of events and the basis for inference is the standardized distance between the observed and the expected number of events under a specified reference rate. Different weights are considered in order to account for various types of alternative hypotheses on the mean function of the recurrent events process. A robust version and a stratified version of the test are also proposed. The performance of these tests was investigated through simulation studies under various underlying event generation processes, such as homogeneous and nonhomogeneous Poisson processes, autoregressive and renewal processes, with and without frailty effects. The robust versions of the test have been shown to be suitable in a wide variety of event generating processes. The motivating context is a study on gene therapy in a very rare immunodeficiency in children, where a major end-point is the recurrence of severe infections. Robust non-parametric one-sample tests for recurrent events can be useful to assess efficacy and especially safety in non-randomized studies or in epidemiological studies for comparison with a standard population. Copyright © 2010 John Wiley & Sons, Ltd.
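A minimal sketch of the basic idea, i.e. a standardized distance between observed and expected event counts under a reference rate. The constant rate, the Poisson-type variance, and the function name are illustrative assumptions; the published tests are more general (weighting schemes, robust variance, stratification):

```python
import math

def one_sample_event_test(observed_events, followup_times, ref_rate):
    """Hypothetical sketch: standardized distance between observed and
    expected event counts under a constant reference rate, using a
    Poisson-type variance approximation."""
    O = sum(observed_events)             # total events over all subjects
    E = ref_rate * sum(followup_times)   # expected events under the reference rate
    return (O - E) / math.sqrt(E)

# Ten illustrative children followed for 2 years each, 31 infections in all,
# tested against a reference rate of 1 infection per person-year.
z = one_sample_event_test([3, 2, 4, 1, 5, 3, 2, 6, 2, 3], [2.0] * 10, ref_rate=1.0)
print(f"z = {z:.2f}")  # z = 2.46
```

A large positive z would suggest an event rate in excess of the reference population, which is the kind of safety signal the motivating gene-therapy study looks for.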
NASA Astrophysics Data System (ADS)
Schiemann, R.; Erdin, R.; Willi, M.; Frei, C.; Berenguer, M.; Sempere-Torres, D.
2011-05-01
Modelling spatial covariance is an essential part of all geostatistical methods. Traditionally, parametric semivariogram models are fit from available data. More recently, it has been suggested to use nonparametric correlograms obtained from spatially complete data fields. Here, both estimation techniques are compared. Nonparametric correlograms are shown to have a substantial negative bias. Nonetheless, when combined with the sample variance of the spatial field under consideration, they yield an estimate of the semivariogram that is unbiased for small lag distances. This justifies the use of this estimation technique in geostatistical applications. Various formulations of geostatistical combination (Kriging) methods are used here for the construction of hourly precipitation grids for Switzerland based on data from a sparse realtime network of raingauges and from a spatially complete radar composite. Two variants of Ordinary Kriging (OK) are used to interpolate the sparse gauge observations. In both OK variants, the radar data are only used to determine the semivariogram model. One variant relies on a traditional parametric semivariogram estimate, whereas the other variant uses the nonparametric correlogram. The variants are tested for three cases and the impact of the semivariogram model on the Kriging prediction is illustrated. For the three test cases, the method using nonparametric correlograms performs equally well or better than the traditional method, and at the same time offers great practical advantages. Furthermore, two variants of Kriging with external drift (KED) are tested, both of which use the radar data to estimate nonparametric correlograms, and as the external drift variable. The first KED variant has been used previously for geostatistical radar-raingauge merging in Catalonia (Spain). The second variant is newly proposed here and is an extension of the first. 
Both variants are evaluated for the three test cases as well as an extended evaluation period. It is found that both methods yield merged fields of better quality than the original radar field or fields obtained by OK of gauge data. The newly suggested KED formulation is shown to be beneficial, in particular in mountainous regions where the quality of the Swiss radar composite is comparatively low. An analysis of the Kriging variances shows that none of the methods tested here provides a satisfactory uncertainty estimate. A suitable variable transformation is expected to improve this.
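The key estimation idea above, i.e. combining the (biased) nonparametric correlogram with the sample variance to obtain a semivariogram estimate, can be sketched on a simulated 1-D field. The exponential correlation model and all names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = np.arange(n, dtype=float)
# Illustrative 1-D field with exponential spatial correlation (range ~ 10).
cov = np.exp(-np.abs(x[:, None] - x[None, :]) / 10.0)
field = rng.multivariate_normal(np.zeros(n), cov)

def empirical_semivariogram(z, lag):
    """Classical estimator: half the mean squared difference at a given lag."""
    d = z[lag:] - z[:-lag]
    return 0.5 * np.mean(d ** 2)

def correlogram_semivariogram(z, lag):
    """gamma(h) = s^2 * (1 - rho(h)): sample variance combined with the
    nonparametric correlogram, as discussed in the abstract."""
    rho = np.corrcoef(z[lag:], z[:-lag])[0, 1]
    return np.var(z) * (1.0 - rho)

for lag in (1, 5, 20):
    print(lag,
          round(empirical_semivariogram(field, lag), 3),
          round(correlogram_semivariogram(field, lag), 3))
```

Both estimates should rise with lag toward the sill, and agree closely at small lags, which is the regime that matters for Kriging weights.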
Testing the Hypothesis of a Homoscedastic Error Term in Simple, Nonparametric Regression
ERIC Educational Resources Information Center
Wilcox, Rand R.
2006-01-01
Consider the nonparametric regression model Y = m(X) + τ(X)ε, where X and ε are independent random variables, ε has a median of zero and variance σ², τ is some unknown function used to model heteroscedasticity, and m(X) is an unknown function reflecting some conditional measure of location associated…
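The model Y = m(X) + τ(X)ε can be simulated to see how τ(X) induces heteroscedasticity; the particular m, τ, and uniform design below are illustrative choices, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.uniform(0, 1, size=n)
eps = rng.normal(size=n)      # median zero, independent of X

def m(t):
    """Unknown regression function (illustrative choice)."""
    return np.sin(2 * np.pi * t)

def tau(t):
    """Heteroscedasticity function: residual spread grows with X."""
    return 0.2 + t

y = m(x) + tau(x) * eps
resid = y - m(x)  # equals tau(x)*eps; uses the true m for illustration

# Residual spread in the lower vs. upper half of the X range; under
# homoscedasticity (constant tau) these would agree up to sampling error.
print(round(resid[x < 0.5].std(), 2), round(resid[x >= 0.5].std(), 2))
```

A test of the homoscedasticity hypothesis amounts to asking whether such spread differences exceed what sampling error alone would produce.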
Investigation of a Nonparametric Procedure for Assessing Goodness-of-Fit in Item Response Theory
ERIC Educational Resources Information Center
Wells, Craig S.; Bolt, Daniel M.
2008-01-01
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
Measurement Error in Nonparametric Item Response Curve Estimation. Research Report. ETS RR-11-28
ERIC Educational Resources Information Center
Guo, Hongwen; Sinharay, Sandip
2011-01-01
Nonparametric, or kernel, estimation of item response curve (IRC) is a concern theoretically and operationally. Accuracy of this estimation, often used in item analysis in testing programs, is biased when the observed scores are used as the regressor because the observed scores are contaminated by measurement error. In this study, we investigate…
A Comparative Study of Test Data Dimensionality Assessment Procedures Under Nonparametric IRT Models
ERIC Educational Resources Information Center
van Abswoude, Alexandra A. H.; van der Ark, L. Andries; Sijtsma, Klaas
2004-01-01
In this article, an overview of nonparametric item response theory methods for determining the dimensionality of item response data is provided. Four methods were considered: MSP, DETECT, HCA/CCPROX, and DIMTEST. First, the methods were compared theoretically. Second, a simulation study was done to compare the effectiveness of MSP, DETECT, and…
[Do we always correctly interpret the results of statistical nonparametric tests].
Moczko, Jerzy A
2014-01-01
Mann-Whitney, Wilcoxon, Kruskal-Wallis and Friedman tests form a group of tests commonly used to analyze the results of clinical and laboratory data. These tests are considered extremely flexible, and their asymptotic relative efficiency exceeds 95 percent. Compared with the corresponding parametric tests, they do not require checking assumptions such as normality of the data distribution, homogeneity of variance, or the lack of correlation between means and standard deviations, and they can be used with both interval and ordinal scales. Using the Mann-Whitney test as an example, the article shows that treating the choice of these four nonparametric tests as a kind of gold standard does not in every case lead to correct inference.
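One such interpretation pitfall can be sketched with SciPy's Mann-Whitney implementation: two simulated samples with (nearly) equal medians but different shapes. A rejection here would reflect a difference in distribution shape or spread, not in medians, which is how the test result is often misread (the data below are illustrative, not from the article):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Symmetric sample vs. skewed sample, both with median ~ 0:
a = rng.normal(loc=0.0, scale=1.0, size=200)
b = rng.exponential(scale=1.44, size=200) - 1.0  # median of Exp(1.44) ~ 1

u, p = stats.mannwhitneyu(a, b, alternative="two-sided")
print(f"U = {u:.0f}, p = {p:.4f}, "
      f"medians: {np.median(a):.2f}, {np.median(b):.2f}")
```

Under unequal shapes, Mann-Whitney tests stochastic ordering of the two distributions; reading it as a pure "test of medians" is only safe when the distributions differ by a location shift.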
Season of birth and multiple sclerosis in Tunisia.
Sidhom, Youssef; Kacem, Imen; Bayoudh, Lamia; Ben Djebara, Mouna; Hizem, Yosr; Ben Abdelfettah, Sami; Gargouri, Amina; Gouider, Riadh
2015-11-01
Recent studies on date of birth of multiple sclerosis (MS) patients have shown an association between month of birth and the risk of developing MS. This association has not been investigated in an African country. We aimed to determine whether the risk of MS is associated with month of birth in Tunisia. Data on date of birth for MS patients in Tunisia (n = 1912) were obtained. Birth rates of MS patients were compared with all births in Tunisia matched by year of birth (n = 11,615,912). We used a chi-squared analysis and Hewitt's non-parametric test for seasonality. The distribution of births among MS patients compared with the control population was not different when tested by the chi-squared test. Hewitt's test for seasonality showed an excess of births between May and October among MS patients (p = 0.03). The peak of births of MS patients in Tunisia was in July and the nadir was in December. Our data support the seasonality hypothesis of month of birth as a risk factor for MS in Tunisia. Low vitamin D levels during pregnancy could be a possible explanation that needs further investigation. Copyright © 2015 Elsevier B.V. All rights reserved.
Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Goodness-Of-Fit Test for Nonparametric Regression Models: Smoothing Spline ANOVA Models as Example.
Teran Hidalgo, Sebastian J; Wu, Michael C; Engel, Stephanie M; Kosorok, Michael R
2018-06-01
Nonparametric regression models do not require the specification of the functional form between the outcome and the covariates. Despite their popularity, the number of diagnostic statistics available for them, in comparison to their parametric counterparts, is small. We propose a goodness-of-fit test for nonparametric regression models with a linear smoother form. In particular, we apply this testing framework to smoothing spline ANOVA models. The test can consider two sources of lack-of-fit: whether covariates that are not currently in the model need to be included, and whether the current model fits the data well. The proposed method derives estimated residuals from the model. Then, statistical dependence is assessed between the estimated residuals and the covariates using the Hilbert-Schmidt Independence Criterion (HSIC). If dependence exists, the model does not capture all the variability in the outcome associated with the covariates; otherwise the model fits the data well. The bootstrap is used to obtain p-values. Application of the method is demonstrated with a neonatal mental development data analysis. We demonstrate correct type I error as well as power performance through simulations.
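The residual-dependence idea can be sketched with a standard biased HSIC estimate using Gaussian kernels. The bandwidth choice and the simulated lack-of-fit scenario are assumptions for illustration, not the paper's exact estimator or bootstrap calibration:

```python
import numpy as np

def hsic(x, y, sigma=1.0):
    """Biased HSIC estimate with Gaussian kernels: trace(K H L H) / n^2,
    where H is the centering matrix. Larger values indicate dependence."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    y = np.asarray(y, dtype=float).reshape(-1, 1)
    n = len(x)
    K = np.exp(-((x - x.T) ** 2) / (2 * sigma ** 2))
    L = np.exp(-((y - y.T) ** 2) / (2 * sigma ** 2))
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(6)
x = rng.uniform(-1, 1, 300)
# Model fits: residuals independent of the covariate.
resid_indep = rng.normal(0, 0.1, 300)
# Lack of fit: residuals carry structure in x (e.g. an omitted x^2 term).
resid_dep = x ** 2 - np.mean(x ** 2) + rng.normal(0, 0.1, 300)

print(hsic(x, resid_indep) < hsic(x, resid_dep))  # prints: True
```

In the paper's framework, the null distribution of such a statistic is approximated by the bootstrap to yield a p-value for lack of fit.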
Bayard, David S.; Neely, Michael
2016-01-01
An experimental design approach is presented for individualized therapy in the special case where the prior information is specified by a nonparametric (NP) population model. Here, a nonparametric model refers to a discrete probability model characterized by a finite set of support points and their associated weights. An important question arises as to how to best design experiments for this type of model. Many experimental design methods are based on Fisher Information or other approaches originally developed for parametric models. While such approaches have been used with some success across various applications, it is interesting to note that they largely fail to address the fundamentally discrete nature of the nonparametric model. Specifically, the problem of identifying an individual from a nonparametric prior is more naturally treated as a problem of classification, i.e., to find a support point that best matches the patient’s behavior. This paper studies the discrete nature of the NP experiment design problem from a classification point of view. Several new insights are provided including the use of Bayes Risk as an information measure, and new alternative methods for experiment design. One particular method, denoted as MMopt (Multiple-Model Optimal), will be examined in detail and shown to require minimal computation while having distinct advantages compared to existing approaches. Several simulated examples, including a case study involving oral voriconazole in children, are given to demonstrate the usefulness of MMopt in pharmacokinetics applications. PMID:27909942
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2010-07-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root- n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2013-01-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286
NASA Astrophysics Data System (ADS)
Anees, Asim; Aryal, Jagannath; O'Reilly, Małgorzata M.; Gale, Timothy J.; Wardlaw, Tim
2016-12-01
A robust non-parametric framework, based on multiple Radial Basis Function (RBF) kernels, is proposed in this study for detecting land/forest cover changes using Landsat 7 ETM+ images. One widely used framework is to find change vectors (a difference image) and use a supervised classifier to differentiate between change and no-change. Bayesian classifiers, e.g. the Maximum Likelihood Classifier (MLC) and Naive Bayes (NB), are widely used probabilistic classifiers which assume parametric models, e.g. a Gaussian function, for the class conditional distributions. However, their performance can be limited if the data set deviates from the assumed model. The proposed framework exploits the useful properties of the Least Squares Probabilistic Classifier (LSPC) formulation, i.e. its non-parametric and probabilistic nature, to model class posterior probabilities of the difference image using a linear combination of a large number of Gaussian kernels. To this end, a simple technique based on 10-fold cross-validation is also proposed for tuning model parameters automatically instead of selecting a (possibly) suboptimal combination from pre-specified lists of values. The proposed framework has been tested and compared with the Support Vector Machine (SVM) and NB for detection of defoliation, caused by leaf beetles (Paropsisterna spp.) in Eucalyptus nitens and Eucalyptus globulus plantations of two test areas in Tasmania, Australia, using raw bands and band combination indices of Landsat 7 ETM+. It was observed that due to its multi-kernel non-parametric formulation and probabilistic nature, the LSPC outperforms the parametric NB with Gaussian assumption in the change detection framework, with Overall Accuracy (OA) ranging from 93.6% (κ = 0.87) to 97.4% (κ = 0.94) against 85.3% (κ = 0.69) to 93.4% (κ = 0.85), and is more robust to changing data distributions.
Its performance was comparable to SVM, with added advantages of being probabilistic and capable of handling multi-class problems naturally with its original formulation.
Confidence intervals for single-case effect size measures based on randomization test inversion.
Michiels, Bart; Heyvaert, Mieke; Meulders, Ann; Onghena, Patrick
2017-02-01
In the current paper, we present a method to construct nonparametric confidence intervals (CIs) for single-case effect size measures in the context of various single-case designs. We use the relationship between a two-sided statistical hypothesis test at significance level α and a 100(1 − α)% two-sided CI to construct CIs for any effect size measure θ that contain all point null hypothesis θ values that cannot be rejected by the hypothesis test at significance level α. This method of hypothesis test inversion (HTI) can be employed using a randomization test as the statistical hypothesis test in order to construct a nonparametric CI for θ. We will refer to this procedure as randomization test inversion (RTI). We illustrate RTI in a situation in which θ is the unstandardized and the standardized difference in means between two treatments in a completely randomized single-case design. Additionally, we demonstrate how RTI can be extended to other types of single-case designs. Finally, we discuss a few challenges for RTI as well as possibilities when using the method with other effect size measures, such as rank-based nonoverlap indices. Supplementary to this paper, we provide easy-to-use R code, which allows the user to construct nonparametric CIs according to the proposed method.
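A minimal sketch of RTI for the unstandardized mean difference in a completely randomized two-condition design, assuming samples small enough that all reassignments can be enumerated. The data, the grid of candidate θ values, and the shift construction are illustrative (the authors' supplementary R code is the reference implementation):

```python
import numpy as np
from itertools import combinations

def randomization_pvalue(a, b, theta):
    """Two-sided randomization test of H0: mean(a) - mean(b) = theta.
    Shift condition B by theta, then recompute the absolute mean difference
    over all reassignments of observations to the two conditions."""
    data = np.concatenate([np.asarray(a, float), np.asarray(b, float) + theta])
    n, na = len(data), len(a)
    observed = abs(data[:na].mean() - data[na:].mean())
    perms = list(combinations(range(n), na))
    count = 0
    for comb in perms:
        mask = np.zeros(n, dtype=bool)
        mask[list(comb)] = True
        if abs(data[mask].mean() - data[~mask].mean()) >= observed - 1e-12:
            count += 1
    return count / len(perms)

def rti_confidence_interval(a, b, alpha=0.05, grid=None):
    """Nonparametric CI: all theta values not rejected at level alpha."""
    if grid is None:
        grid = np.linspace(-10, 10, 201)
    kept = [t for t in grid if randomization_pvalue(a, b, t) > alpha]
    return min(kept), max(kept)

a = [8.0, 7.5, 9.0, 8.5, 7.0]   # illustrative condition-A measurements
b = [5.0, 4.5, 6.0, 5.5, 4.0]   # illustrative condition-B measurements
ci = rti_confidence_interval(a, b)
print(ci)
```

The grid scan makes the inversion explicit: each candidate θ is tested, and the CI is the set of θ values that survive, exactly mirroring the HTI relationship described in the abstract.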
Frömke, Cornelia; Hothorn, Ludwig A; Kropf, Siegfried
2008-01-27
In many research areas it is necessary to find differences between treatment groups with several variables. For example, studies of microarray data seek to find a significant difference in location parameters from zero or one for ratios thereof for each variable. However, in some studies a significant deviation of the difference in locations from zero (or 1 in terms of the ratio) is biologically meaningless. A relevant difference or ratio is sought in such cases. This article addresses the use of relevance-shifted tests on ratios for a multivariate parallel two-sample group design. Two empirical procedures are proposed which embed the relevance-shifted test on ratios. As both procedures test a hypothesis for each variable, the resulting multiple testing problem has to be considered. Hence, the procedures include a multiplicity correction. Both procedures are extensions of available procedures for point null hypotheses achieving exact control of the familywise error rate. Whereas the shift of the null hypothesis alone would give straight-forward solutions, the problems that are the reason for the empirical considerations discussed here arise by the fact that the shift is considered in both directions and the whole parameter space in between these two limits has to be accepted as null hypothesis. The first algorithm to be discussed uses a permutation algorithm, and is appropriate for designs with a moderately large number of observations. However, many experiments have limited sample sizes. Then the second procedure might be more appropriate, where multiplicity is corrected according to a concept of data-driven order of hypotheses.
Nonparametric Statistics Test Software Package.
1983-09-01
…statistics because of their acceptance in the academic world, the availability of computer support, and flexibility in model building. Nonparametric…
Nonparametric model validations for hidden Markov models with applications in financial econometrics
Zhao, Zhibiao
2011-01-01
We address the nonparametric model validation problem for hidden Markov models with partially observable variables and hidden states. We achieve this goal by constructing a nonparametric simultaneous confidence envelope for transition density function of the observable variables and checking whether the parametric density estimate is contained within such an envelope. Our specification test procedure is motivated by a functional connection between the transition density of the observable variables and the Markov transition kernel of the hidden states. Our approach is applicable for continuous time diffusion models, stochastic volatility models, nonlinear time series models, and models with market microstructure noise. PMID:21750601
2011-01-01
Background The identification of genes or quantitative trait loci that are expressed in response to different environmental factors such as temperature and light, through functional mapping, critically relies on precise modeling of the covariance structure. Previous work used separable parametric covariance structures, such as a Kronecker product of autoregressive one [AR(1)] matrices, that do not account for interaction effects of different environmental factors. Results We implement a more robust nonparametric covariance estimator to model these interactions within the framework of functional mapping of reaction norms to two signals. Our results from Monte Carlo simulations show that this estimator can be useful in modeling interactions that exist between two environmental signals. The interactions are simulated using nonseparable covariance models with spatio-temporal structural forms that mimic interaction effects. Conclusions The nonparametric covariance estimator has an advantage over separable parametric covariance estimators in the detection of QTL location, thus extending the breadth of use of functional mapping in practical settings. PMID:21269481
Assessment of Communications-related Admissions Criteria in a Three-year Pharmacy Program
Tejada, Frederick R.; Lang, Lynn A.; Purnell, Miriam; Acedera, Lisa; Ngonga, Ferdinand
2015-01-01
Objective. To determine if there is a correlation between TOEFL and other admissions criteria that assess communications skills (ie, PCAT variables: verbal, reading, essay, and composite), interview, and observational scores and to evaluate TOEFL and these admissions criteria as predictors of academic performance. Methods. Statistical analyses included two sample t tests, multiple regression and Pearson’s correlations for parametric variables, and Mann-Whitney U for nonparametric variables, which were conducted on the retrospective data of 162 students, 57 of whom were foreign-born. Results. The multiple regression model of the other admissions criteria on TOEFL was significant. There was no significant correlation between TOEFL scores and academic performance. However, significant correlations were found between the other admissions criteria and academic performance. Conclusion. Since TOEFL is not a significant predictor of either communication skills or academic success of foreign-born PharmD students in the program, it may be eliminated as an admissions criterion. PMID:26430273
Assessment of Communications-related Admissions Criteria in a Three-year Pharmacy Program.
Parmar, Jayesh R; Tejada, Frederick R; Lang, Lynn A; Purnell, Miriam; Acedera, Lisa; Ngonga, Ferdinand
2015-08-25
To determine if there is a correlation between TOEFL and other admissions criteria that assess communications skills (ie, PCAT variables: verbal, reading, essay, and composite), interview, and observational scores and to evaluate TOEFL and these admissions criteria as predictors of academic performance. Statistical analyses included two sample t tests, multiple regression and Pearson's correlations for parametric variables, and Mann-Whitney U for nonparametric variables, which were conducted on the retrospective data of 162 students, 57 of whom were foreign-born. The multiple regression model of the other admissions criteria on TOEFL was significant. There was no significant correlation between TOEFL scores and academic performance. However, significant correlations were found between the other admissions criteria and academic performance. Since TOEFL is not a significant predictor of either communication skills or academic success of foreign-born PharmD students in the program, it may be eliminated as an admissions criterion.
Kerschbamer, Rudolf
2015-05-01
This paper proposes a geometric delineation of distributional preference types and a non-parametric approach for their identification in a two-person context. It starts with a small set of assumptions on preferences and shows that this set (i) naturally results in a taxonomy of distributional archetypes that nests all empirically relevant types considered in previous work; and (ii) gives rise to a clean experimental identification procedure - the Equality Equivalence Test - that discriminates between archetypes according to core features of preferences rather than properties of specific modeling variants. As a by-product the test yields a two-dimensional index of preference intensity.
Multicategorical Spline Model for Item Response Theory.
ERIC Educational Resources Information Center
Abrahamowicz, Michal; Ramsay, James O.
1992-01-01
A nonparametric multicategorical model for multiple-choice data is proposed as an extension of the binary spline model of J. O. Ramsay and M. Abrahamowicz (1989). Results of two Monte Carlo studies illustrate the model, which approximates probability functions by rational splines. (SLD)
Parametric vs. non-parametric statistics of low resolution electromagnetic tomography (LORETA).
Thatcher, R W; North, D; Biver, C
2005-01-01
This study compared the relative statistical sensitivity of non-parametric and parametric statistics of 3-dimensional current sources as estimated by the EEG inverse solution Low Resolution Electromagnetic Tomography (LORETA). One would expect approximately 5% false positives (classification of a normal as abnormal) at the P < .025 level of probability (two tailed test) and approximately 1% false positives at the P < .005 level. EEG digital samples (2 second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) from 43 normal adult subjects were imported into the Key Institute's LORETA program. We then used the Key Institute's cross-spectrum and the Key Institute's LORETA output files (*.lor) as the 2,394 gray matter pixel representation of 3-dimensional currents at different frequencies. The mean and standard deviation *.lor files were computed for each of the 2,394 gray matter pixels for each of the 43 subjects. Tests of Gaussianity and different transforms were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of parametric vs. non-parametric statistics was compared using a "leave-one-out" cross validation method in which individual normal subjects were withdrawn and then statistically classified as being either normal or abnormal based on the remaining subjects. Log10 transforms approximated a Gaussian distribution in the range of 95% to 99% accuracy. Parametric Z score tests at P < .05 cross-validation demonstrated an average misclassification rate of approximately 4.25%, ranging from 27.66% to 0.11% over the 2,394 gray matter pixels. At P < .01, parametric Z score cross-validation showed an average false positive rate of 0.26%, ranging from 6.65% to 0%. The non-parametric Key Institute's t-max statistic at P < .05 had an average misclassification rate of 7.64%, with false positives ranging from 43.37% to 0.04%.
The nonparametric t-max at P < .01 had an average misclassification rate of 6.67% and ranged from 41.34% to 0% false positives of the 2,394 gray matter pixels for any cross-validated normal subject. In conclusion, adequate approximation to Gaussian distribution and high cross-validation can be achieved by the Key Institute's LORETA programs by using a log10 transform and parametric statistics, and parametric normative comparisons had lower false positive rates than the non-parametric tests.
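The leave-one-out cross-validation scheme described above can be sketched as follows. The simulated log10 values, sample size, and Z threshold are illustrative assumptions, not the Key Institute's implementation:

```python
import numpy as np

def loo_z_misclassification(values, z_threshold=1.96):
    """Leave-one-out Z-score test: each subject is scored against the
    mean and SD of the remaining subjects; a normal subject whose |Z|
    exceeds the threshold counts as a false positive."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    false_pos = 0
    for i in range(n):
        rest = np.delete(values, i)
        z = (values[i] - rest.mean()) / rest.std(ddof=1)
        if abs(z) > z_threshold:
            false_pos += 1
    return false_pos / n

# Simulated log10-transformed current densities for one gray matter pixel
# across 43 normal subjects (the transform makes the values near-Gaussian).
rng = np.random.default_rng(0)
pixel_values = np.log10(rng.lognormal(mean=1.0, sigma=0.3, size=43))
rate = loo_z_misclassification(pixel_values)
```

With approximately Gaussian values and a |Z| > 1.96 threshold, the false positive rate should hover near the nominal 5%, which is the calibration property the study checks pixel by pixel.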
ERIC Educational Resources Information Center
Wagstaff, David A.; Elek, Elvira; Kulis, Stephen; Marsiglia, Flavio
2009-01-01
A nonparametric bootstrap was used to obtain an interval estimate of Pearson's "r," and test the null hypothesis that there was no association between 5th grade students' positive substance use expectancies and their intentions to not use substances. The students were participating in a substance use prevention program in which the unit of…
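The bootstrap interval estimate described above can be sketched as follows; the data and effect size are simulated stand-ins, not the study's actual 5th-grade survey data:

```python
import numpy as np

def bootstrap_pearson_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for Pearson's r:
    resample (x, y) pairs with replacement and take the empirical
    quantiles of the resampled correlations."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    rs = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)          # resample paired indices
        rs[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return np.quantile(rs, [alpha / 2, 1 - alpha / 2])

# Illustrative data: expectancy scores vs. intention scores.
rng = np.random.default_rng(1)
expectancy = rng.normal(size=120)
intention = 0.4 * expectancy + rng.normal(size=120)
lo, hi = bootstrap_pearson_ci(expectancy, intention)
# The null hypothesis of no association is rejected when the
# interval excludes zero.
```

A cluster bootstrap (resampling classrooms rather than students) would be needed when, as in the study, the unit of assignment is a group rather than an individual.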
A nonparametric smoothing method for assessing GEE models with longitudinal binary data.
Lin, Kuo-Chin; Chen, Yi-Ju; Shyr, Yu
2008-09-30
Studies involving longitudinal binary responses are widely applied in health and biomedical sciences research and frequently analyzed by the generalized estimating equations (GEE) method. This article proposes an alternative goodness-of-fit test based on a nonparametric smoothing approach for assessing the adequacy of GEE-fitted models, which can be regarded as an extension of the goodness-of-fit test of le Cessie and van Houwelingen (Biometrics 1991; 47:1267-1282). The expectation and approximate variance of the proposed test statistic are derived. The asymptotic distribution of the proposed test statistic, in terms of a scaled chi-squared distribution, and the power performance of the proposed test are assessed through simulation studies. The testing procedure is demonstrated with two real data sets. Copyright (c) 2008 John Wiley & Sons, Ltd.
SOCR Analyses – an Instructional Java Web-based Statistical Analysis Toolkit
Chu, Annie; Cui, Jenny; Dinov, Ivo D.
2011-01-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include commonly used models in undergraduate statistics courses, such as linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank-sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), in the hope of contributing to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least-squares solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is ongoing and more functions and tools are being added, these resources are continually improved.
The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models. PMID:21546994
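The sample-comparison tests listed above are standard; a minimal sketch of the same analyses using Python's SciPy (shown only by analogy, since SOCR Analyses itself is a Java toolkit, and the data here are simulated):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(10.0, 2.0, size=30)
group_b = rng.normal(11.0, 2.0, size=30)
group_c = rng.normal(10.5, 2.0, size=30)

# Parametric two-sample comparison: t-test.
t_stat, t_p = stats.ttest_ind(group_a, group_b)

# Non-parametric analogues: Wilcoxon rank-sum for two samples,
# Kruskal-Wallis for three or more.
u_stat, u_p = stats.ranksums(group_a, group_b)
h_stat, h_p = stats.kruskal(group_a, group_b, group_c)
```

Each call returns a test statistic and a p-value, mirroring the pairing of parametric and non-parametric procedures that SOCR Analyses presents to students.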
NASA Astrophysics Data System (ADS)
Bono, R. K.; Dare, M. S.; Tarduno, J. A.; Cottrell, R. D.
2016-12-01
Magnetic directions from coarse clastic rocks are typically highly scattered, to the point that the null hypothesis that they are drawn from a random distribution, using the iconic test of Watson (1956), cannot be rejected at a high confidence level (e.g. 95%). Here, we use an alternative approach of searching for directional clusters on a sphere. When applied to a new data set of directions from quartzites from the Jack Hills of Western Australia, we find evidence for distinct and meaningful magnetic directions at low (200 to 300 degrees C) and intermediate ( 350 to 450 degrees C) unblocking temperatures, whereas the test of Watson (1956) fails to draw a distinction from random distributions for the ensemble of directions at these unblocking temperature ranges. The robustness of the directional groups identified by the cluster analysis is confirmed by non-parametric resampling tests. The lowest unblocking temperature directional mode appears related to the present day field, perhaps contaminated by viscous magnetizations. The intermediate temperature magnetization matches an overprint recorded by the secondary mineral fuchsite (Cottrell et al., 2016) acquired at ca. 2.65 Ga. These data thus indicate that the Jack Hills carry an overprint at intermediate unblocking temperatures of Archean age. We find no evidence for a 1 Ga remagnetization. In general, the application of cluster analysis on a sphere, with directions confirmed by nonparametric tests, represents a new approach that should be applied when evaluating data with high dispersion, such as those that typically come from weak coarse-grained clastic sedimentary rocks, and/or rocks that have seen several tectonic events that could have imparted multiple magnetic overprints.
Regression analysis of current-status data: an application to breast-feeding.
Grummer-strawn, L M
1993-09-01
"Although techniques for calculating mean survival time from current-status data are well known, their use in multiple regression models is somewhat troublesome. Using data on current breast-feeding behavior, this article considers a number of techniques that have been suggested in the literature, including parametric, nonparametric, and semiparametric models as well as the application of standard schedules. Models are tested in both proportional-odds and proportional-hazards frameworks....I fit [the] models to current status data on breast-feeding from the Demographic and Health Survey (DHS) in six countries: two African (Mali and Ondo State, Nigeria), two Asian (Indonesia and Sri Lanka), and two Latin American (Colombia and Peru)." excerpt
A Study of Specific Fracture Energy at Percussion Drilling
NASA Astrophysics Data System (ADS)
Shadrina, A; Kabanova, T; Krets, V; Saruev, L
2014-08-01
The paper presents experimental studies of rock failure in percussion drilling. Quantitative and qualitative analyses were carried out to estimate critical values of rock failure depending on the hammer pre-impact velocity, type of drill bit, cylindrical hammer parameters (weight, length, diameter), and turn angle of the drill bit. Data obtained in this work were compared with results reported by other researchers. The particle-size distribution in granite-cutting sludge was also analyzed. A statistical approach (Spearman's rank-order correlation, multiple regression analysis with dummy variables, and the Kruskal-Wallis nonparametric test) was used to analyze the drilling process. The experimental data will be useful for specialists engaged in the simulation and illustration of rock failure.
Bayesian non-parametric inference for stochastic epidemic models using Gaussian Processes.
Xu, Xiaoguang; Kypraios, Theodore; O'Neill, Philip D
2016-10-01
This paper considers novel Bayesian non-parametric methods for stochastic epidemic models. Many standard modeling and data analysis methods use underlying assumptions (e.g. concerning the rate at which new cases of disease will occur) which are rarely challenged or tested in practice. To relax these assumptions, we develop a Bayesian non-parametric approach using Gaussian Processes, specifically to estimate the infection process. The methods are illustrated with both simulated and real data sets, the former illustrating that the methods can recover the true infection process quite well in practice, and the latter illustrating that the methods can be successfully applied in different settings. © The Author 2016. Published by Oxford University Press.
Transformation-invariant and nonparametric monotone smooth estimation of ROC curves.
Du, Pang; Tang, Liansheng
2009-01-30
When a new diagnostic test is developed, it is of interest to evaluate its accuracy in distinguishing diseased subjects from non-diseased subjects. The accuracy of the test is often evaluated by receiver operating characteristic (ROC) curves. Smooth ROC estimates are often preferable for continuous test results when the underlying ROC curves are in fact continuous. Nonparametric and parametric methods have been proposed by various authors to obtain smooth ROC curve estimates. However, there are certain drawbacks with the existing methods. Parametric methods need specific model assumptions. Nonparametric methods do not always satisfy the inherent properties of the ROC curves, such as monotonicity and transformation invariance. In this paper we propose a monotone spline approach to obtain smooth monotone ROC curves. Our method ensures important inherent properties of the underlying ROC curves, which include monotonicity, transformation invariance, and boundary constraints. We compare the finite sample performance of the newly proposed ROC method with other ROC smoothing methods in large-scale simulation studies. We illustrate our method through a real life example. Copyright (c) 2008 John Wiley & Sons, Ltd.
Spectral decompositions of multiple time series: a Bayesian non-parametric approach.
Macaro, Christian; Prado, Raquel
2014-01-01
We consider spectral decompositions of multiple time series that arise in studies where the interest lies in assessing the influence of two or more factors. We write the spectral density of each time series as a sum of the spectral densities associated with the different levels of the factors. We then use Whittle's approximation to the likelihood function and follow a Bayesian non-parametric approach to obtain posterior inference on the spectral densities based on Bernstein-Dirichlet prior distributions. The prior is strategically important as it carries identifiability conditions for the models and allows us to quantify our degree of confidence in such conditions. A Markov chain Monte Carlo (MCMC) algorithm for posterior inference within this class of frequency-domain models is presented. We illustrate the approach by analyzing simulated and real data via spectral one-way and two-way models. In particular, we present an analysis of functional magnetic resonance imaging (fMRI) brain responses measured in individuals who participated in a designed experiment to study pain perception in humans.
Testing independence of bivariate interval-censored data using modified Kendall's tau statistic.
Kim, Yuneung; Lim, Johan; Park, DoHwan
2015-11-01
In this paper, we study a nonparametric procedure to test independence of bivariate interval-censored data, covering both current status data (case 1 interval-censored data) and case 2 interval-censored data. To do so, we propose a score-based modification of Kendall's tau statistic for bivariate interval-censored data. Our modification defines the Kendall's tau statistic in terms of the expected numbers of concordant and discordant pairs. The performance of the modified approach is illustrated by simulation studies and an application to an AIDS study. We compare our method to alternative approaches such as the two-stage estimation method of Sun et al. (Scandinavian Journal of Statistics, 2006) and the multiple imputation method of Betensky and Finkelstein (Statistics in Medicine, 1999b). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
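The modification above starts from the classical count-based definition of Kendall's tau; for uncensored data that definition can be sketched directly (a didactic illustration, not the authors' score-based estimator for interval-censored data):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Classical Kendall's tau-a: (concordant - discordant) pairs
    divided by the total number of pairs (assumes no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1      # the pair is ordered the same way in x and y
        elif s < 0:
            discordant += 1      # the pair is ordered oppositely
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.5, 2.1, 4.0, 3.2, 5.5]
tau = kendall_tau(x, y)  # one discordant pair out of ten -> 0.8
```

Under interval censoring the concordance of a pair cannot always be observed, which is why the paper replaces the observed counts with their expected values given the censoring intervals.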
Lin, Lawrence; Pan, Yi; Hedayat, A S; Barnhart, Huiman X; Haber, Michael
2016-01-01
Total deviation index (TDI) captures a prespecified quantile of the absolute deviation of paired observations from raters, observers, methods, assays, instruments, etc. We compare the performance of TDI using nonparametric quantile regression to the TDI assuming normality (Lin, 2000). This simulation study considers three distributions: normal, Poisson, and uniform, at quantile levels of 0.8 and 0.9, for cases with and without contamination. Study endpoints include the bias of TDI estimates (compared with their respective theoretical values), the standard error of TDI estimates (compared with their true simulated standard errors), test size (compared with 0.05), and power. Nonparametric TDI using quantile regression works satisfactorily under all simulated cases, even for moderate (say, ≥40) sample sizes, although it slightly underestimates the TDI and delivers slightly less power for data without contamination. The performance of the TDI based on a quantile of 0.8 is in general superior to that of 0.9. The performances of nonparametric and parametric TDI methods are compared with a real data example. Nonparametric TDI can be very useful when the underlying distribution of the difference is not normal, especially when it has a heavy tail.
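The two TDI estimators being compared can be sketched as follows. The folded-normal formulation of the parametric TDI and the simulated rater differences are illustrative assumptions, not Lin's (2000) exact procedure:

```python
import numpy as np
from scipy import stats

def tdi_nonparametric(d, p=0.9):
    """Non-parametric TDI: the p-th empirical quantile of the
    absolute paired differences."""
    return np.quantile(np.abs(d), p)

def tdi_normal(d, p=0.9):
    """TDI assuming normal differences: the p-th quantile of the
    folded normal with the sample mean and SD of the differences."""
    mu, sd = np.mean(d), np.std(d, ddof=1)
    return stats.foldnorm(c=abs(mu) / sd, scale=sd).ppf(p)

rng = np.random.default_rng(3)
diff = rng.normal(0.2, 1.0, size=500)   # paired differences between two raters
np_tdi = tdi_nonparametric(diff)
par_tdi = tdi_normal(diff)
# For normal data the two estimates should be close; for heavy-tailed
# differences the parametric version can be badly miscalibrated.
```

A quantile regression version additionally lets the TDI depend on covariates (e.g. the magnitude of the measurements), which the plain empirical quantile above ignores.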
Nonparametric analysis of bivariate gap time with competing risks.
Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng
2016-09-01
This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrent disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall's tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article. © 2016, The International Biometric Society.
Metzger, Aude; Le Bars, Emmanuelle; Deverdun, Jeremy; Molino, François; Maréchal, Bénédicte; Picot, Marie-Christine; Ayrignac, Xavier; Carra, Clarisse; Bauchet, Luc; Krainik, Alexandre; Labauge, Pierre; Menjot de Champfleur, Nicolas
2018-03-01
The link between cerebral vasoreactivity and cognitive status in multiple sclerosis remains unclear. The aim of the present study was to investigate a potential decrease of cerebral vasoreactivity in multiple sclerosis patients and correlate it with cognitive status. Thirty-three patients with multiple sclerosis (nine progressive and 24 remitting forms, median age: 39 years, 12 males) and 22 controls underwent MRI with a hypercapnic challenge to assess cerebral vasoreactivity and a neuropsychological assessment. Cerebral vasoreactivity, measured as the cerebral blood flow percent increase normalised by end-tidal carbon dioxide variation, was assessed globally and by regions of interest using the blood oxygen level-dependent technique. Non-parametric statistical tests were used to assess differences between groups, and associations were estimated using linear models. Cerebral vasoreactivity was lower in patients with cognitive impairment than in cognitively normal patients (p = 0.004) and was associated with education level in patients (R² = 0.35; p = 0.047). There was no decrease in cerebral vasoreactivity between patients and controls. Cognitive impairment in multiple sclerosis may be mediated through decreased cerebral vasoreactivity. Cerebral vasoreactivity could therefore be considered as a marker of cognitive decline in multiple sclerosis. • Cerebral vasoreactivity does not differ between multiple sclerosis patients and controls. • Cerebral vasoreactivity measure is linked to cognitive impairment in multiple sclerosis. • Cerebral vasoreactivity is linked to level of education in multiple sclerosis.
Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure
Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas
2015-01-01
Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
Simple F Test Reveals Gene-Gene Interactions in Case-Control Studies
Chen, Guanjie; Yuan, Ao; Zhou, Jie; Bentley, Amy R.; Adeyemo, Adebowale; Rotimi, Charles N.
2012-01-01
Missing heritability is still a challenge for Genome Wide Association Studies (GWAS). Gene-gene interactions may partially explain this residual genetic influence and contribute broadly to complex disease. To analyze the gene-gene interactions in case-control studies of complex disease, we propose a simple, non-parametric method that utilizes the F-statistic. This approach consists of three steps. First, we examine the joint distribution of a pair of SNPs in cases and controls separately. Second, an F-test is used to evaluate the ratio of dependence in cases to that of controls. Finally, results are adjusted for multiple tests. This method was used to evaluate gene-gene interactions that are associated with risk of Type 2 Diabetes among African Americans in the Howard University Family Study. We identified 18 gene-gene interactions (P < 0.0001). Compared with the commonly-used logistical regression method, we demonstrate that the F-ratio test is an efficient approach to measuring gene-gene interactions, especially for studies with limited sample size. PMID:22837643
HBCU Efficiency and Endowments: An Exploratory Analysis
ERIC Educational Resources Information Center
Coupet, Jason; Barnum, Darold
2010-01-01
Discussions of efficiency among Historically Black Colleges and Universities (HBCUs) are often missing in academic conversations. This article seeks to assess efficiency of individual HBCUs using Data Envelopment Analysis (DEA), a non-parametric technique that can synthesize multiple inputs and outputs to determine a single efficiency score for…
Goudriaan, Marije; Van den Hauwe, Marleen; Simon-Martinez, Cristina; Huenaerts, Catherine; Molenaers, Guy; Goemans, Nathalie; Desloovere, Kaat
2018-04-30
Prolonged ambulation is considered important in children with Duchenne muscular dystrophy (DMD). However, previous studies analyzing DMD gait were sensitive to false positive outcomes, caused by uncorrected multiple comparisons, regional focus bias, and inter-component covariance bias. Also, while muscle weakness is often suggested to be the main cause for the altered gait pattern in DMD, this was never verified. Our research question was twofold: 1) are we able to confirm the sagittal kinematic and kinetic gait alterations described in a previous review with statistical non-parametric mapping (SnPM)? And 2) are these gait deviations related to lower limb weakness? We compared gait kinematics and kinetics of 15 children with DMD and 15 typically developing (TD) children (5-17 years), with a two-sample Hotelling's T² test and post-hoc two-tailed, two-sample t-tests. We used canonical correlation analyses to study the relationship between weakness and altered gait parameters. For all analyses, the α-level was corrected for multiple comparisons, resulting in α = 0.005. We only found one of the previously reported kinematic deviations: the children with DMD had an increased knee flexion angle during swing (p = 0.0006). Observed gait deviations that were not reported in the review were an increased hip flexion angle during stance (p = 0.0009) and swing (p = 0.0001), altered combined knee and ankle torques (p = 0.0002), and decreased power absorption during stance (p = 0.0001). No relationships between weakness and these gait deviations were found. We were not able to replicate the gait deviations in DMD previously reported in the literature, thus DMD gait remains undefined. Further, weakness does not seem to be linearly related to altered gait features. The progressive nature of the disease requires larger study populations and longitudinal analyses to gain more insight into DMD gait and its underlying causes. Copyright © 2018 Elsevier B.V. All rights reserved.
Local kernel nonparametric discriminant analysis for adaptive extraction of complex structures
NASA Astrophysics Data System (ADS)
Li, Quanbao; Wei, Fajie; Zhou, Shenghan
2017-05-01
Linear discriminant analysis (LDA) is one of the most popular methods for linear feature extraction. It usually performs well when the global data structure is consistent with the local data structure. Other frequently used feature-extraction approaches typically require linearity, independence, or large-sample conditions. However, in real-world applications, these assumptions are not always satisfied or cannot be tested. In this paper, we introduce an adaptive method, local kernel nonparametric discriminant analysis (LKNDA), which integrates conventional discriminant analysis with nonparametric statistics. LKNDA is adept at identifying both complex nonlinear structures and ad hoc rules. Six simulation cases demonstrate that LKNDA has the advantages of both parametric and nonparametric algorithms, together with higher classification accuracy. A quartic unilateral kernel function may provide better prediction robustness than other functions. LKNDA offers an alternative solution for discriminant cases of complex nonlinear feature extraction or unknown feature extraction. Finally, an application of LKNDA to the complex feature extraction of financial market activities is presented.
A comparative study of nonparametric methods for pattern recognition
NASA Technical Reports Server (NTRS)
Hahn, S. F.; Nelson, G. D.
1972-01-01
The applied research discussed in this report determines and compares the correct classification percentage of the nonparametric sign test, Wilcoxon's signed rank test, and K-class classifier with the performance of the Bayes classifier. The performance is determined for data which have Gaussian, Laplacian and Rayleigh probability density functions. The correct classification percentage is shown graphically for differences in modes and/or means of the probability density functions for four, eight and sixteen samples. The K-class classifier performed very well with respect to the other classifiers used. Since the K-class classifier is a nonparametric technique, it usually performed better than the Bayes classifier, which assumes the data to be Gaussian even though they may not be. The K-class classifier has the advantage over the Bayes classifier that it works well with non-Gaussian data without having to determine the probability density function of the data. It should be noted that the data in this experiment were always unimodal.
Using exogenous variables in testing for monotonic trends in hydrologic time series
Alley, William M.
1988-01-01
One approach that has been used in performing a nonparametric test for monotonic trend in a hydrologic time series consists of a two-stage analysis. First, a regression equation is estimated for the variable being tested as a function of an exogenous variable. A nonparametric trend test such as the Kendall test is then performed on the residuals from the equation. By analogy to stagewise regression and through Monte Carlo experiments, it is demonstrated that this approach will tend to underestimate the magnitude of the trend and to result in some loss in power as a result of ignoring the interaction between the exogenous variable and time. An alternative approach, referred to as the adjusted variable Kendall test, is demonstrated to generally have increased statistical power and to provide more reliable estimates of the trend slope. In addition, the utility of including an exogenous variable in a trend test is examined under selected conditions.
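The two-stage procedure described at the start of the abstract can be sketched as follows (simulated data; note the abstract's caveat that this simple version ignores the interaction between the exogenous variable and time, which the adjusted variable Kendall test addresses):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
t = np.arange(40, dtype=float)                 # years
precip = rng.normal(100.0, 10.0, size=40)      # exogenous variable
# Tested variable: driven by precipitation plus a monotonic time trend.
flow = 0.8 * precip + 0.5 * t + rng.normal(0.0, 5.0, size=40)

# Stage 1: regress the tested variable on the exogenous variable.
slope, intercept, *_ = stats.linregress(precip, flow)
residuals = flow - (intercept + slope * precip)

# Stage 2: Kendall test of the residuals against time
# (the non-parametric Mann-Kendall idea).
tau, p_value = stats.kendalltau(t, residuals)
```

When the exogenous variable itself trends over time, stage 1 absorbs part of the trend into the regression, which is exactly the underestimation the abstract demonstrates.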
Location tests for biomarker studies: a comparison using simulations for the two-sample case.
Scheinhardt, M O; Ziegler, A
2013-01-01
Gene, protein, or metabolite expression levels are often non-normally distributed, heavy-tailed, and contain outliers. Standard statistical approaches may fail as location tests in this situation. In three Monte-Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O'Gorman, Can J Stat 1997; 25: 269-279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267-293; Szymczak et al., Stat Med 2013; 32: 524-537] for a wide range of distributions. We simulated two-sample scenarios using the g-and-k-distribution family to systematically vary tail length and skewness with identical and varying variability between groups. All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in the case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. The standard U-test was a powerful and robust location test for most of the simulated scenarios except for very heavy-tailed or heavily skewed data, and it is thus to be recommended except for these cases. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy-tailed distributions.
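A minimal sketch of the recommended U-test in a heavy-tailed two-sample setting; the Student-t data here stand in for the g-and-k distributions used in the actual simulations:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Heavy-tailed "expression levels"; group B is shifted in location.
group_a = rng.standard_t(3, size=50)
group_b = rng.standard_t(3, size=50) + 1.0

# Standard non-parametric location test: Mann-Whitney U.
u_stat, p_value = stats.mannwhitneyu(group_a, group_b,
                                     alternative='two-sided')

# Parametric comparison point: Welch's t-test, which is more
# sensitive to the heavy tails than the rank-based U-test.
t_stat, t_p = stats.ttest_ind(group_a, group_b, equal_var=False)
```

Because the U-test operates on ranks, single extreme observations cannot dominate the statistic, which is the robustness property the study exploits.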
Nonparametric Trajectory Analysis of CMAPS Data
As part of the Cleveland Multiple Air Pollutant Study (CMAPS), 30-minute average concentrations of the elemental composition of PM2.5 were made at two sites during the months of August 2009 and February 2010. The elements measured were: Al, As, Ba, Be, Ca, Cd, Ce, Co, Cr, Cs, Cu...
Efforts are increasingly being made to classify the world’s wetland resources, an important ecosystem and habitat that is diminishing in abundance. There are multiple remote sensing classification methods, including a suite of nonparametric classifiers such as decision-tree...
NASA Astrophysics Data System (ADS)
Machiwal, Deepesh; Kumar, Sanjay; Dayal, Devi
2016-05-01
This study aimed to characterize rainfall dynamics in a hot arid region of Gujarat, India by employing time-series modeling techniques and a sustainability approach. Five characteristics of the 34-year (1980-2013) annual rainfall time series of ten stations, i.e., normality, stationarity, homogeneity, presence/absence of trend, and persistence, were identified/detected by applying multiple parametric and non-parametric statistical tests. Furthermore, the study introduces a sustainability concept for evaluating rainfall time series and demonstrates it, for the first time, by identifying the most sustainable rainfall series following the reliability (Ry), resilience (Re), and vulnerability (Vy) approach. Box-whisker plots, normal probability plots, and histograms indicated that the annual rainfall of Mandvi and Dayapar stations is relatively more positively skewed and non-normal compared with that of other stations, owing to the presence of severe outliers and extremes. Results of the Shapiro-Wilk and Lilliefors tests revealed that the annual rainfall series of all stations deviated significantly from a normal distribution. Two parametric t tests and the non-parametric Mann-Whitney test indicated significant non-stationarity in the annual rainfall of Rapar station, where the rainfall was also found to be non-homogeneous based on the results of four parametric homogeneity tests. Four trend tests indicated significantly increasing rainfall trends at Rapar and Gandhidham stations. Autocorrelation analysis suggested the presence of statistically significant persistence in the rainfall series of Bhachau (3-year time lag), Mundra (1- and 9-year time lags), Nakhatrana (9-year time lag), and Rapar (3- and 4-year time lags).
Results of the sustainability approach indicated that the annual rainfall of Mundra and Naliya stations (Ry = 0.50 and 0.44; Re = 0.47 and 0.47; Vy = 0.49 and 0.46, respectively) is the most sustainable and dependable compared with that of the other stations. The highest values of the sustainability index at Mundra (0.120) and Naliya (0.112) stations confirmed the earlier findings of the Ry-Re-Vy approach. In general, annual rainfall of the study area is less reliable, less resilient, and moderately vulnerable, which emphasizes the need to develop suitable strategies for managing the water resources of the area on a sustainable basis. Finally, it is recommended that multiple statistical tests (at least two) be used in time-series modeling for making reliable decisions. Moreover, the methodology and findings of the sustainability concept for rainfall time series can easily be adopted in other arid regions of the world.
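As an illustration of the kind of screening described above, the sketch below applies a Shapiro-Wilk normality test and a split-sample Mann-Whitney test to a simulated 34-year annual rainfall series. The gamma parameters and the half-series split are illustrative assumptions, not the study's data or procedure.

```python
# Sketch (not the authors' code): two of the tests named in the abstract,
# applied to a simulated, skewed annual rainfall series.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
rainfall = rng.gamma(shape=2.0, scale=150.0, size=34)  # mm/year, skewed

# Normality: Shapiro-Wilk on the raw series
w_stat, p_norm = stats.shapiro(rainfall)

# Stationarity (location shift): Mann-Whitney U on first vs second half
first, second = rainfall[:17], rainfall[17:]
u_stat, p_stat = stats.mannwhitneyu(first, second, alternative="two-sided")

print(f"Shapiro-Wilk p = {p_norm:.3f}")   # p < 0.05 would suggest non-normality
print(f"Mann-Whitney p = {p_stat:.3f}")   # p < 0.05 would suggest non-stationarity
```

In practice each station's series would be run through several such tests, consistent with the abstract's recommendation to base decisions on at least two tests.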
A non-parametric consistency test of the ΛCDM model with Planck CMB data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aghamousa, Amir; Shafieloo, Arman; Hamann, Jan, E-mail: amir@aghamousa.com, E-mail: jan.hamann@unsw.edu.au, E-mail: shafieloo@kasi.re.kr
Non-parametric reconstruction methods, such as Gaussian process (GP) regression, provide a model-independent way of estimating an underlying function and its uncertainty from noisy data. We demonstrate how GP-reconstruction can be used as a consistency test between a given data set and a specific model by looking for structures in the residuals of the data with respect to the model's best-fit. Applying this formalism to the Planck temperature and polarisation power spectrum measurements, we test their global consistency with the predictions of the base ΛCDM model. Our results do not show any serious inconsistencies, lending further support to the interpretation of the base ΛCDM model as cosmology's gold standard.
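A minimal sketch of the general idea (not the Planck pipeline): fit a GP to the residuals of data about a model's best fit; if the reconstructed residual function is compatible with zero everywhere, the model passes the consistency check. The data, kernel choice, and noise level below are illustrative assumptions.

```python
# Illustrative GP consistency test on residuals (assumed toy setup).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 80)[:, None]
data = np.sin(x).ravel() + rng.normal(0, 0.1, 80)  # "observations"
model_best_fit = np.sin(x).ravel()                 # the model under test
residuals = data - model_best_fit

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel).fit(x, residuals)
mean, std = gp.predict(x, return_std=True)

# For a consistent model, |mean| should be small relative to std everywhere.
print("max |reconstructed residual| =", float(np.abs(mean).max()))
```

Structure in the GP posterior mean (regions where |mean| clearly exceeds the posterior uncertainty) would flag a data-model inconsistency.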
Parametric and nonparametric Granger causality testing: Linkages between international stock markets
NASA Astrophysics Data System (ADS)
De Gooijer, Jan G.; Sivarajasingham, Selliah
2008-04-01
This study investigates long-term linear and nonlinear causal linkages among eleven stock markets: six industrialized markets and five emerging markets of South-East Asia. We cover the period 1987-2006, taking into account the onset of the Asian financial crisis of 1997. We first apply a test for the presence of general nonlinearity in vector time series. Substantial differences exist between the pre- and post-crisis period in terms of the total number of significant nonlinear relationships. We then examine both periods, using a new nonparametric test for Granger noncausality and the conventional parametric Granger noncausality test. One major finding is that the Asian stock markets have become more internationally integrated after the Asian financial crisis. An exception is the Sri Lankan market, with almost no significant long-term linear and nonlinear causal linkages with other markets. To ensure that any causality is strictly nonlinear in nature, we also examine the nonlinear causal relationships of VAR filtered residuals and VAR filtered squared residuals for the post-crisis sample. We find quite a few remaining significant bi- and uni-directional causal nonlinear relationships in these series. Finally, after filtering the VAR-residuals with GARCH-BEKK models, we show that the nonparametric test statistics are substantially smaller in both magnitude and statistical significance than those before filtering. This indicates that nonlinear causality can, to a large extent, be explained by simple volatility effects.
Rajwa, Bartek; Wallace, Paul K.; Griffiths, Elizabeth A.; Dundar, Murat
2017-01-01
Objective: Flow cytometry (FC) is a widely acknowledged technology in diagnosis of acute myeloid leukemia (AML) and has been indispensable in determining progression of the disease. Although FC plays a key role as a post-therapy prognosticator and evaluator of therapeutic efficacy, the manual analysis of cytometry data is a barrier to optimization of reproducibility and objectivity. This study investigates the utility of our recently introduced non-parametric Bayesian framework in accurately predicting the direction of change in disease progression in AML patients using FC data. Methods: The highly flexible non-parametric Bayesian model based on the infinite mixture of infinite Gaussian mixtures is used for jointly modeling data from multiple FC samples to automatically identify functionally distinct cell populations and their local realizations. Phenotype vectors are obtained by characterizing each sample by the proportions of recovered cell populations, which are in turn used to predict the direction of change in disease progression for each patient. Results: We used 200 diseased and non-diseased immunophenotypic panels for training and tested the system with 36 additional AML cases collected at multiple time points. The proposed framework identified the change in direction of disease progression with accuracies of 90% (9 out of 10) for relapsing cases and 100% (26 out of 26) for the remaining cases. Conclusions: We believe that these promising results are an important first step towards the development of automated predictive systems for disease monitoring and continuous response evaluation. Significance: Automated measurement and monitoring of therapeutic response is critical not only for objective evaluation of disease status prognosis but also for timely assessment of treatment strategies. PMID:27416585
Inference in a Synchronization Game with Social Interactions *
de Paula, Áureo
2009-01-01
This paper studies inference in a continuous time game where an agent's decision to quit an activity depends on the participation of other players. In equilibrium, similar actions can be explained not only by direct influences but also by correlated factors. Our model can be seen as a simultaneous duration model with multiple decision makers and interdependent durations. We study the problem of determining the existence and uniqueness of equilibrium stopping strategies in this setting. This paper provides results and conditions for the detection of these endogenous effects. First, we show that the presence of such effects is a necessary and sufficient condition for simultaneous exits. This allows us to set up a nonparametric test for the presence of such influences which is robust to multiple equilibria. Second, we provide conditions under which parameters in the game are identified. Finally, we apply the model to data on desertion in the Union Army during the American Civil War and find evidence of endogenous influences. PMID:20046804
SPICE: exploration and analysis of post-cytometric complex multivariate datasets.
Roederer, Mario; Nozzi, Joshua L; Nason, Martha C
2011-02-01
Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.
A bias-corrected estimator in multiple imputation for missing data.
Tomita, Hiroaki; Fujisawa, Hironori; Henmi, Masayuki
2018-05-29
Multiple imputation (MI) is one of the most popular methods to deal with missing data, and its use has been rapidly increasing in medical studies. Although MI is rather appealing in practice since it is possible to use ordinary statistical methods for a complete data set once the missing values are fully imputed, the method of imputation is still problematic. If the missing values are imputed from some parametric model, the validity of imputation is not necessarily ensured, and the final estimate for a parameter of interest can be biased unless the parametric model is correctly specified. Nonparametric methods have also been proposed for MI, but it is not straightforward to produce imputation values from nonparametrically estimated distributions. In this paper, we propose a new method for MI to obtain a consistent (or asymptotically unbiased) final estimate even if the imputation model is misspecified. The key idea is to use an imputation model from which the imputation values are easily produced and to make a proper correction in the likelihood function after the imputation by using the density ratio between the imputation model and the true conditional density function for the missing variable as a weight. Although the conditional density must be nonparametrically estimated, it is not used for the imputation. The performance of our method is evaluated by both theory and simulation studies. A real data analysis is also conducted to illustrate our method by using the Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset. Copyright © 2018 John Wiley & Sons, Ltd.
A Semiparametric Approach for Composite Functional Mapping of Dynamic Quantitative Traits
Yang, Runqing; Gao, Huijiang; Wang, Xin; Zhang, Ji; Zeng, Zhao-Bang; Wu, Rongling
2007-01-01
Functional mapping has emerged as a powerful tool for mapping quantitative trait loci (QTL) that control developmental patterns of complex dynamic traits. Original functional mapping has been constructed within the context of simple interval mapping, without consideration of separate multiple linked QTL for a dynamic trait. In this article, we present a statistical framework for mapping QTL that affect dynamic traits by capitalizing on the strengths of functional mapping and composite interval mapping. Within this so-called composite functional-mapping framework, functional mapping models the time-dependent genetic effects of a QTL tested within a marker interval using a biologically meaningful parametric function, whereas composite interval mapping models the time-dependent genetic effects of the markers outside the test interval to control the genome background using a flexible nonparametric approach based on Legendre polynomials. Such a semiparametric framework was formulated by a maximum-likelihood model and implemented with the EM algorithm, allowing for the estimation and the test of the mathematical parameters that define the QTL effects and the regression coefficients of the Legendre polynomials that describe the marker effects. Simulation studies were performed to investigate the statistical behavior of composite functional mapping and its advantage in separating multiple linked QTL compared with functional mapping. We used the new mapping approach to analyze a genetic mapping example in rice, leading to the identification of multiple QTL, some of which are linked on the same chromosome, that control the developmental trajectory of leaf age. PMID:17947431
Statistical testing and power analysis for brain-wide association study.
Gong, Weikang; Wan, Lin; Lu, Wenlian; Ma, Liang; Cheng, Fan; Cheng, Wei; Grünewald, Stefan; Feng, Jianfeng
2018-04-05
The identification of connexel-wise associations, which involves examining functional connectivities between pairwise voxels across the whole brain, is both statistically and computationally challenging. Although such a connexel-wise methodology has recently been adopted by brain-wide association studies (BWAS) to identify connectivity changes in several mental disorders, such as schizophrenia, autism and depression, multiple-comparison correction and power analysis methods designed specifically for connexel-wise analysis are still lacking. Therefore, we herein report the development of a rigorous statistical framework for connexel-wise significance testing based on the Gaussian random field theory. It includes controlling the family-wise error rate (FWER) of multiple hypothesis tests using topological inference methods, and calculating power and sample size for a connexel-wise study. Our theoretical framework can control the false-positive rate accurately, as validated empirically using two resting-state fMRI datasets. Compared with Bonferroni correction and false discovery rate (FDR), it can reduce the false-positive rate and increase statistical power by appropriately utilizing the spatial information of fMRI data. Importantly, our method bypasses the need for non-parametric permutation to correct for multiple comparisons; thus, it can efficiently tackle large datasets with high-resolution fMRI images. The utility of our method is shown in a case-control study. Our approach can identify altered functional connectivities in a major depression disorder dataset, whereas existing methods fail. A software package is available at https://github.com/weikanggong/BWAS. Copyright © 2018 Elsevier B.V. All rights reserved.
Normality of raw data in general linear models: The most widespread myth in statistics
Kery, Marc; Hatfield, Jeff S.
2003-01-01
In years of statistical consulting for ecologists and wildlife biologists, by far the most common misconception we have come across has been the one about normality in general linear models. These comprise a very large part of the statistical models used in ecology and include t tests, simple and multiple linear regression, polynomial regression, and analysis of variance (ANOVA) and covariance (ANCOVA). There is a widely held belief that the normality assumption pertains to the raw data rather than to the model residuals. We suspect that this error may also occur in countless published studies, whenever the normality assumption is tested prior to analysis. This may lead to the use of nonparametric alternatives (if there are any), when parametric tests would indeed be appropriate, or to use of transformations of raw data, which may introduce hidden assumptions such as multiplicative effects on the natural scale in the case of log-transformed data. Our aim here is to dispel this myth. We very briefly describe relevant theory for two cases of general linear models to show that the residuals need to be normally distributed if tests requiring normality are to be used, such as t and F tests. We then give two examples demonstrating that the distribution of the response variable may be nonnormal, and yet the residuals are well behaved. We do not go into the issue of how to test normality; instead we display the distributions of response variables and residuals graphically.
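The authors' point is easy to demonstrate on simulated data: a response with a strong group effect fails a normality test outright, while the residuals of the correct linear model behave well. The sample size, effect size, and seed below are illustrative assumptions.

```python
# Normality pertains to residuals, not raw data: a simulated ANOVA-style example.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group = rng.integers(0, 2, 300).astype(float)     # two groups
y = 3.0 + 10.0 * group + rng.normal(0, 1, 300)    # strongly bimodal response

# Raw response fails Shapiro-Wilk badly (it is a two-component mixture)
_, p_raw = stats.shapiro(y)

# Residuals of the model y ~ group are approximately normal
design = np.column_stack([np.ones(300), group])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ beta
_, p_resid = stats.shapiro(resid)

print("raw y      Shapiro-Wilk p:", p_raw)
print("residuals  Shapiro-Wilk p:", p_resid)
```

Testing normality on the raw response here would wrongly push an analyst toward a nonparametric alternative or a transformation, exactly the myth the abstract describes.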
Statistical methods used in articles published by the Journal of Periodontal and Implant Science.
Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young
2014-12-01
The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus, concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.
NASA Astrophysics Data System (ADS)
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions on the model, whereas nonparametric regression requires no assumption about the model's form. Time series data are observations of a variable recorded over time, so if a time series is to be modeled by regression, the response and predictor variables must first be determined. In a time series, the response variable is the value at time t (yt), while the predictor variables are the significant lags. One developing approach in nonparametric regression modeling is the Fourier series approach. An advantage of nonparametric regression using a Fourier series is its ability to handle data with a trigonometric (periodic) pattern. Modeling with a Fourier series requires a parameter K, whose value can be determined by the Generalized Cross Validation (GCV) method. In modeling inflation for the transportation, communication, and financial services sector, the Fourier series approach yields an optimal K of 120 parameters with an R-squared of 99%, whereas multiple linear regression yields an R-squared of 90%.
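A bare-bones sketch of the approach described above, using our own simplified GCV criterion rather than the authors' implementation: build a Fourier basis of order K and pick K by Generalized Cross Validation. The periodic signal and noise level are assumptions for demonstration.

```python
# Fourier-basis nonparametric regression with K chosen by GCV (toy sketch).
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * 3 * t) + 0.3 * rng.normal(size=200)  # frequency-3 signal

def fourier_design(t, K):
    cols = [np.ones_like(t)]
    for k in range(1, K + 1):
        cols += [np.cos(2 * np.pi * k * t), np.sin(2 * np.pi * k * t)]
    return np.column_stack(cols)

def gcv(t, y, K):
    X = fourier_design(t, K)
    H = X @ np.linalg.pinv(X)                    # hat matrix of the LS fit
    resid = y - H @ y
    n, tr = len(y), np.trace(H)
    return (np.sum(resid**2) / n) / (1 - tr / n) ** 2

best_K = min(range(1, 11), key=lambda K: gcv(t, y, K))
print("K selected by GCV:", best_K)
```

GCV trades off residual error against effective model size (the trace of the hat matrix, 2K + 1 here), so it stops adding harmonics once the signal's frequencies are captured.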
Comparison of four approaches to a rock facies classification problem
Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.
2007-01-01
In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and a feed-forward back-propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core), with each sample having either four or five measured properties (wire-line log curves) and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated) and the feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.
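In the spirit of the comparison above, the toy sketch below pits a k-nearest-neighbor classifier against a small feed-forward neural network on synthetic multi-class data. The dataset, network size, and split are illustrative assumptions, not the Panoma well-log data.

```python
# Toy kNN vs feed-forward ANN comparison on synthetic multi-class data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=6, n_informative=4,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
net = MLPClassifier(hidden_layer_sizes=(20,), max_iter=2000,
                    random_state=0).fit(X_tr, y_tr)

acc_knn = knn.score(X_te, y_te)
acc_net = net.score(X_te, y_te)
print("kNN accuracy:", round(acc_knn, 3))
print("ANN accuracy:", round(acc_net, 3))
```

As in the study, a held-out test set is used to judge how well each trained classifier assigns classes to unseen samples.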
ERIC Educational Resources Information Center
Yorke, Mantz
2017-01-01
When analysing course-level data by subgroups based upon some demographic characteristics, the numbers in analytical cells are often too small to allow inferences to be drawn that might help in the enhancement of practices. However, relatively simple analyses can provide useful pointers. This article draws upon a study involving a partnership with…
Development of welding emission factors for Cr and Cr(VI) with a confidence level.
Serageldin, Mohamed; Reeves, David W
2009-05-01
Knowledge of the emission rate and release characteristics is necessary for estimating pollutant fate and transport. Because emission measurements at a facility's fence line are generally not readily available, environmental agencies in many countries are using emission factors (EFs) to indicate the quantity of certain pollutants released into the atmosphere from operations such as welding. The amount of fumes and metals generated from a welding process is dependent on many parameters, such as electrode composition, voltage, and current. Because test reports on fume generation provide different levels of detail, a common approach was used to give a test report a quality rating on the basis of several highly subjective criteria; however, weighted average EFs generated in this way are not meant to reflect data precision or to be used for a refined risk analysis. The 95% upper confidence limit (UCL) of the unknown population mean was used in this study to account for the uncertainty in the EF test data. Several parametric UCLs were computed and compared for multiple welding EFs associated with several mild, stainless, and alloy steels. Also, several nonparametric statistical methods, including several bootstrap procedures, were used to compute 95% UCLs. For the nonparametric methods, a distribution for calculating the mean, standard deviation, and other statistical parameters for a dataset does not need to be assumed. There were instances when the sample size was small and instances when EFs for an electrode/process combination were not found. Those two points are addressed in this paper. Finally, this paper is an attempt to deal with the uncertainty in the value of a mean EF for an electrode/process combination that is based on test data from several laboratories. Welding EFs developed with a defined level of confidence may be used as input parameters for risk assessment.
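One of the nonparametric options discussed above, a bootstrap 95% UCL on a mean emission factor, can be sketched as follows; the eight EF values below are hypothetical stand-ins, not measured data.

```python
# Bootstrap 95% upper confidence limit (UCL) on a mean emission factor.
import numpy as np

rng = np.random.default_rng(0)
ef = np.array([2.1, 3.4, 1.8, 5.2, 2.9, 4.1, 3.3, 6.0])  # hypothetical EFs

# Resample with replacement, recompute the mean each time
boot_means = np.array([rng.choice(ef, size=ef.size, replace=True).mean()
                       for _ in range(10_000)])
ucl95 = np.percentile(boot_means, 95)   # one-sided 95% UCL on the mean

print(f"sample mean = {ef.mean():.2f}, bootstrap 95% UCL = {ucl95:.2f}")
```

The appeal noted in the abstract is that no distribution needs to be assumed for the EF data, which matters when samples are small and skewed.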
An Item Response Theory Model for Test Bias.
ERIC Educational Resources Information Center
Shealy, Robin; Stout, William
This paper presents a conceptualization of test bias for standardized ability tests which is based on multidimensional, non-parametric, item response theory. An explanation of how individually-biased items can combine through a test score to produce test bias is provided. It is contended that bias, although expressed at the item level, should be…
Uchida, K; Nishimura, T; Takesada, H; Morioka, T; Hagawa, N; Yamamoto, T; Kaga, S; Terada, T; Shinyama, N; Yamamoto, H; Mizobata, Y
2017-08-01
The purpose of this study was to assess the effects of recent surgical rib fixation and establish its indications not only for flail chest but also for multiple rib fractures. Between 2007 and 2015, 187 patients were diagnosed with multiple rib fractures in our institution. After propensity score matching, ten patients who had undergone surgical rib fixation and ten patients who had been treated with non-operative management were included. Categorical variables were analyzed with Fisher's exact test, and non-parametric numerical data were compared using the Mann-Whitney U test. The Wilcoxon signed-rank test was used to compare pre- and postoperative variables. All statistical data are presented as median (25-75% interquartile range [IQR]) or number. The surgically treated patients were extubated significantly earlier than the non-operative management patients (5.5 [1-8] vs 9 [7-12] days; p = 0.019). The duration of continuous intravenous narcotic infusion (4.5 [3-6] vs 12 [9-14] days; p = 0.002) and the duration of intensive care unit stay (6.5 [3-9] vs 12 [8-14] days; p = 0.008) were also significantly shorter in the surgically treated patients. Under the same ventilator settings, the postoperative values of tidal volume and respiratory rate improved significantly compared with those measured just before surgery. The incidence of pneumonia as a complication was significantly higher in the non-operative management group (p = 0.05). From the viewpoints of early respiratory stabilization and intensive care unit disposition without complications, surgical rib fixation is an acceptable procedure not only for flail chest but also for repair of severe multiple rib fractures.
Spectral analysis method for detecting an element
Blackwood, Larry G [Idaho Falls, ID; Edwards, Andrew J [Idaho Falls, ID; Jewell, James K [Idaho Falls, ID; Reber, Edward L [Idaho Falls, ID; Seabury, Edward H [Idaho Falls, ID
2008-02-12
A method for detecting an element is described and which includes the steps of providing a gamma-ray spectrum which has a region of interest which corresponds with a small amount of an element to be detected; providing nonparametric assumptions about a shape of the gamma-ray spectrum in the region of interest, and which would indicate the presence of the element to be detected; and applying a statistical test to the shape of the gamma-ray spectrum based upon the nonparametric assumptions to detect the small amount of the element to be detected.
Normative data for distal line bisection and baking tray task.
Facchin, Alessio; Beschin, Nicoletta; Pisano, Alessia; Reverberi, Cristina
2016-09-01
Line bisection is one of the tests used to diagnose unilateral spatial neglect (USN). Despite its wide application, no procedure or norms were available for the distal variant when the task was performed at distance with a laser pointer. Furthermore, the baking tray task was an ecological test aimed at diagnosing USN in a more natural context. The aim of this study was to collect normative values for these two tests in an Italian population. We recruited a sample of 191 healthy subjects with ages ranging from 20 to 89 years. They performed line bisection with a laser pointer on three different line lengths (1, 1.5, and 2 m) at a distance of 3 m. After this task, the subjects performed the baking tray task and a second repetition of line bisection to test the reliability of measurement. Multiple regression analysis revealed no significant effects of demographic variables on the performance of both tests. Normative cut-off values for the two tests were developed using non-parametric tolerance intervals. The results formed the basis for clinical use of these two tools for assessing lateralized performance of patients with brain injury and for diagnosing USN.
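The nonparametric tolerance intervals mentioned above can be built from order statistics. The sketch below computes a one-sided cut-off covering 95% of the population with 95% confidence; the simulated bisection errors and the particular order-statistic rule are our illustrative assumptions, not the authors' exact procedure.

```python
# One-sided nonparametric tolerance bound from order statistics.
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(5)
n = 191                                  # mimics the normative sample size
errors = rng.normal(0, 5, n)             # simulated bisection errors (mm)

p, gamma = 0.95, 0.95                    # cover 95% of population, 95% confidence
# smallest r such that the r-th order statistic exceeds the p-quantile
# with confidence >= gamma: P(Binomial(n, p) <= r - 1) >= gamma
r = next(r for r in range(1, n + 1) if binom.cdf(r - 1, n, p) >= gamma)
cutoff = np.sort(errors)[r - 1]

print(f"order statistic r = {r} of {n}, upper cut-off = {cutoff:.2f}")
```

Because the rule relies only on ranks, the resulting normative cut-off requires no distributional assumption about the healthy subjects' errors.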
Penalized nonparametric scalar-on-function regression via principal coordinates
Reiss, Philip T.; Miller, David L.; Wu, Pei-Shien; Hua, Wen-Yu
2016-01-01
A number of classical approaches to nonparametric regression have recently been extended to the case of functional predictors. This paper introduces a new method of this type, which extends intermediate-rank penalized smoothing to scalar-on-function regression. In the proposed method, which we call principal coordinate ridge regression, one regresses the response on leading principal coordinates defined by a relevant distance among the functional predictors, while applying a ridge penalty. Our publicly available implementation, based on generalized additive modeling software, allows for fast optimal tuning parameter selection and for extensions to multiple functional predictors, exponential family-valued responses, and mixed-effects models. In an application to signature verification data, principal coordinate ridge regression, with dynamic time warping distance used to define the principal coordinates, is shown to outperform a functional generalized linear model. PMID:29217963
Nonparametric Combinatorial Sequence Models
NASA Astrophysics Data System (ADS)
Wauthier, Fabian L.; Jordan, Michael I.; Jojic, Nebojsa
This work considers biological sequences that exhibit combinatorial structures in their composition: groups of positions of the aligned sequences are "linked" and covary as one unit across sequences. If multiple such groups exist, complex interactions can emerge between them. Sequences of this kind arise frequently in biology but methodologies for analyzing them are still being developed. This paper presents a nonparametric prior on sequences which allows combinatorial structures to emerge and which induces a posterior distribution over factorized sequence representations. We carry out experiments on three sequence datasets which indicate that combinatorial structures are indeed present and that combinatorial sequence models can more succinctly describe them than simpler mixture models. We conclude with an application to MHC binding prediction which highlights the utility of the posterior distribution induced by the prior. By integrating out the posterior our method compares favorably to leading binding predictors.
A Bayesian Nonparametric Approach to Image Super-Resolution.
Polatkan, Gungor; Zhou, Mingyuan; Carin, Lawrence; Blei, David; Daubechies, Ingrid
2015-02-01
Super-resolution methods form high-resolution images from low-resolution images. In this paper, we develop a new Bayesian nonparametric model for super-resolution. Our method uses a beta-Bernoulli process to learn a set of recurring visual patterns, called dictionary elements, from the data. Because it is nonparametric, the number of elements found is also determined from the data. We test the results on both benchmark and natural images, comparing with several other models from the research literature. We perform large-scale human evaluation experiments to assess the visual quality of the results. In a first implementation, we use Gibbs sampling to approximate the posterior. However, this algorithm is not feasible for large-scale data. To circumvent this, we then develop an online variational Bayes (VB) algorithm. This algorithm finds high quality dictionaries in a fraction of the time needed by the Gibbs sampler.
Applications of non-parametric statistics and analysis of variance on sample variances
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Nonparametric methods that are available for NASA-type applications are discussed. An attempt is made here to survey what can be used, to offer recommendations as to when each method is applicable, and to compare the methods, when possible, with the usual normal-theory procedures available for the Gaussian analog. It is important to point out the hypotheses being tested, the assumptions being made, and the limitations of the nonparametric procedures. The appropriateness of performing analysis of variance on sample variances is also discussed and studied. This procedure is followed in several NASA simulation projects. On the surface it would appear to be a reasonably sound procedure; however, the difficulties involved center around the normality problem and the basic homogeneous-variance assumption that is made in usual analysis of variance problems. These difficulties are discussed and guidelines are given for using the methods.
Order-restricted inference for means with missing values.
Wang, Heng; Zhong, Ping-Shou
2017-09-01
Missing values appear very often in many applications, but the problem of missing values has not received much attention in testing order-restricted alternatives. Under the missing at random (MAR) assumption, we impute the missing values nonparametrically using kernel regression. For data with imputation, the classical likelihood ratio test designed for testing the order-restricted means is no longer applicable since the likelihood does not exist. This article proposes a novel method for constructing test statistics for assessing means with an increasing order or a decreasing order based on jackknife empirical likelihood (JEL) ratio. It is shown that the JEL ratio statistic evaluated under the null hypothesis converges to a chi-bar-square distribution, whose weights depend on missing probabilities and nonparametric imputation. Simulation study shows that the proposed test performs well under various missing scenarios and is robust for normally and nonnormally distributed data. The proposed method is applied to an Alzheimer's disease neuroimaging initiative data set for finding a biomarker for the diagnosis of the Alzheimer's disease. © 2017, The International Biometric Society.
Total recognition discriminability in Huntington's and Alzheimer's disease.
Graves, Lisa V; Holden, Heather M; Delano-Wood, Lisa; Bondi, Mark W; Woods, Steven Paul; Corey-Bloom, Jody; Salmon, David P; Delis, Dean C; Gilbert, Paul E
2017-03-01
Both the original and second editions of the California Verbal Learning Test (CVLT) provide an index of total recognition discriminability (TRD) but respectively utilize nonparametric and parametric formulas to compute the index. However, the degree to which population differences in TRD may vary across applications of these nonparametric and parametric formulas has not been explored. We evaluated individuals with Huntington's disease (HD), individuals with Alzheimer's disease (AD), healthy middle-aged adults, and healthy older adults who were administered the CVLT-II. Yes/no recognition memory indices were generated, including raw nonparametric TRD scores (as used in CVLT-I) and raw and standardized parametric TRD scores (as used in CVLT-II), as well as false positive (FP) rates. Overall, the patient groups had significantly lower TRD scores than their comparison groups. The application of nonparametric and parametric formulas resulted in comparable effect sizes for all group comparisons on raw TRD scores. Relative to the HD group, the AD group showed comparable standardized parametric TRD scores (despite lower raw nonparametric and parametric TRD scores), whereas the previous CVLT literature has shown that standardized TRD scores are lower in AD than in HD. Possible explanations for the similarity in standardized parametric TRD scores in the HD and AD groups in the present study are discussed, with an emphasis on the importance of evaluating TRD scores in the context of other indices such as FP rates in an effort to fully capture recognition memory function using the CVLT-II.
Impact of Business Cycles on US Suicide Rates, 1928–2007
Florence, Curtis S.; Quispe-Agnoli, Myriam; Ouyang, Lijing; Crosby, Alexander E.
2011-01-01
Objectives. We examined the associations of overall and age-specific suicide rates with business cycles from 1928 to 2007 in the United States. Methods. We conducted a graphical analysis of changes in suicide rates during business cycles, used nonparametric analyses to test associations between business cycles and suicide rates, and calculated correlations between the national unemployment rate and suicide rates. Results. Graphical analyses showed that the overall suicide rate generally rose during recessions and fell during expansions. Age-specific suicide rates responded differently to recessions and expansions. Nonparametric tests indicated that the overall suicide rate and the suicide rates of the groups aged 25 to 34 years, 35 to 44 years, 45 to 54 years, and 55 to 64 years rose during contractions and fell during expansions. Suicide rates of the groups aged 15 to 24 years, 65 to 74 years, and 75 years and older did not exhibit this behavior. Correlation results were concordant with all nonparametric results except for the group aged 65 to 74 years. Conclusions. Business cycles may affect suicide rates, although different age groups responded differently. Our findings suggest that public health responses are a necessary component of suicide prevention during recessions. PMID:21493938
ERIC Educational Resources Information Center
Aßmann, Christian; Würbach, Ariane; Goßmann, Solange; Geissler, Ferdinand; Bela, Anika
2017-01-01
Large-scale surveys typically exhibit data structures characterized by rich mutual dependencies between surveyed variables and individual-specific skip patterns. Despite high efforts in fieldwork and questionnaire design, missing values inevitably occur. One approach for handling missing values is to provide multiply imputed data sets, thus…
ERIC Educational Resources Information Center
Fidalgo, Angel M.
2011-01-01
Mantel-Haenszel (MH) methods constitute one of the most popular nonparametric differential item functioning (DIF) detection procedures. GMHDIF has been developed to provide an easy-to-use program for conducting DIF analyses. Some of the advantages of this program are that (a) it performs two-stage DIF analyses in multiple groups simultaneously;…
Multi-frequency Phase Unwrap from Noisy Data: Adaptive Least Squares Approach
NASA Astrophysics Data System (ADS)
Katkovnik, Vladimir; Bioucas-Dias, José
2010-04-01
Multiple frequency interferometry is, basically, a phase acquisition strategy aimed at reducing or eliminating the ambiguity of the wrapped phase observations or, equivalently, reducing or eliminating the fringe ambiguity order. In multiple frequency interferometry, the phase measurements are acquired at different frequencies (or wavelengths) and recorded using the corresponding sensors (measurement channels). Assuming that the absolute phase to be reconstructed is piecewise smooth, we use a nonparametric regression technique for the phase reconstruction. The nonparametric estimates are derived from a local least squares criterion, which, when applied to the multifrequency data, yields denoised (filtered) phase estimates with extended ambiguity (periodized), compared with the phase ambiguities inherent to each measurement frequency. The filtering algorithm is based on local polynomial approximation (LPA) for the design of nonlinear filters (estimators) and adaptation of these filters to the unknown smoothness of the spatially varying absolute phase [9]. For phase unwrapping from the filtered periodized data, we apply the recently introduced robust (in the sense of discontinuity preserving) PUMA unwrapping algorithm [1]. Simulations give evidence that the proposed algorithm yields state-of-the-art performance for continuous as well as for discontinuous phase surfaces, enabling phase unwrapping in extraordinarily difficult situations where all other algorithms fail.
Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.
2016-01-01
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic ("z-score") of the SNP. We use a multiple logistic regression on z-scores to combine the auxiliary information into a "relative enrichment score" for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences between the scale-mixture model and the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both are genome-wide significant (p < 5x10-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
A functional U-statistic method for association analysis of sequencing data.
Jadhav, Sneha; Tong, Xiaoran; Lu, Qing
2017-11-01
Although sequencing studies hold great promise for uncovering novel variants predisposing to human diseases, the high dimensionality of the sequencing data brings tremendous challenges to data analysis. Moreover, for many complex diseases (e.g., psychiatric disorders) multiple related phenotypes are collected. These phenotypes can be different measurements of an underlying disease, or measurements characterizing multiple related diseases for studying common genetic mechanisms. Although jointly analyzing these phenotypes could potentially increase the power of identifying disease-associated genes, the different types of phenotypes pose challenges for association analysis. To address these challenges, we propose a nonparametric method, the functional U-statistic method (FU), for multivariate analysis of sequencing data. It first constructs smooth functions from individuals' sequencing data, and then tests the association of these functions with multiple phenotypes by using a U-statistic. The method provides a general framework for analyzing various types of phenotypes (e.g., binary and continuous phenotypes) with unknown distributions. Fitting the genetic variants within a gene using a smoothing function also allows us to capture complexities of gene structure (e.g., linkage disequilibrium, LD), which could potentially increase the power of association analysis. Through simulations, we compared our method to the multivariate outcome score test (MOST), and found that our test attained better performance than MOST. In a real data application, we applied our method to the sequencing data from the Minnesota Twin Study (MTS) and found potential associations of several nicotine receptor subunit (CHRN) genes, including CHRNB3, with nicotine dependence and/or alcohol dependence. © 2017 WILEY PERIODICALS, INC.
A Systematic Review of Global Drivers of Ant Elevational Diversity
Szewczyk, Tim; McCain, Christy M.
2016-01-01
Ant diversity shows a variety of patterns across elevational gradients, though the patterns and drivers have not been evaluated comprehensively. In this systematic review and reanalysis, we use published data on ant elevational diversity to detail the observed patterns and to test the predictions and interactions of four major diversity hypotheses: thermal energy, the mid-domain effect, area, and the elevational climate model. Of sixty-seven published datasets from the literature, only those with standardized, comprehensive sampling were used. Datasets included both local and regional ant diversity and spanned 80° in latitude across six biogeographical provinces. We used a combination of simulations, linear regressions, and non-parametric statistics to test multiple quantitative predictions of each hypothesis. We used an environmentally and geometrically constrained model as well as multiple regression to test their interactions. Ant diversity showed three distinct patterns across elevations: most common were hump-shaped mid-elevation peaks in diversity, followed by low-elevation plateaus and monotonic decreases in the number of ant species. The elevational climate model, which proposes that temperature and precipitation jointly drive diversity, and area were partially supported as independent drivers. Thermal energy and the mid-domain effect were not supported as primary drivers of ant diversity globally. The interaction models supported the influence of multiple drivers, though not a consistent set. In contrast to many vertebrate taxa, global ant elevational diversity patterns appear more complex, with the best environmental model contingent on precipitation levels. Differences in ecology and natural history among taxa may be crucial to the processes influencing broad-scale diversity patterns. PMID:27175999
Linear combination methods to improve diagnostic/prognostic accuracy on future observations
Kang, Le; Liu, Aiyi; Tian, Lili
2014-01-01
Multiple diagnostic tests or biomarkers can be combined to improve diagnostic accuracy. The problem of finding the optimal linear combinations of biomarkers to maximise the area under the receiver operating characteristic curve has been extensively addressed in the literature. The purpose of this article is threefold: (1) to provide an extensive review of the existing methods for biomarker combination; (2) to propose a new combination method, namely, the nonparametric stepwise approach; (3) to use leave-one-pair-out cross-validation method, instead of re-substitution method, which is overoptimistic and hence might lead to wrong conclusion, to empirically evaluate and compare the performance of different linear combination methods in yielding the largest area under receiver operating characteristic curve. A data set of Duchenne muscular dystrophy was analysed to illustrate the applications of the discussed combination methods. PMID:23592714
Langer, S L; Vargas, V M F; Flores-Lopes, F; Malabarba, L R
2009-05-01
Manifestation of infectious pathologies in fishes usually increases in environments where organic wastes are disposed. Specimens of Mugil platanus Günther, 1880 and water samples collected at three points of the Tramandaí river were analyzed over a one-year period. Macroscopic observation revealed ulcerations in the caudal peduncle area covered with a mass of amorphous, whitened tissue. Histopathologic analysis showed the presence of gram-negative bacteria, probably responsible for alterations of the normal structure of the epidermal tissues. Non-parametric statistical analysis of ammonia concentration showed significant variation among the three collection sites, as well as in the pairwise multiple comparisons between sites. In this study, we describe cutaneous lesions observed in Mugil platanus specimens and test their correlation with environmental ammonia concentration.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brown, C; Adcock, A; Azevedo, S
2010-12-28
Some diagnostics at the National Ignition Facility (NIF), including the Gamma Reaction History (GRH) diagnostic, require multiple channels of data to achieve the required dynamic range. These channels need to be stitched together into a single time series, and they may have non-uniform and redundant time samples. We chose to apply the popular cubic smoothing spline technique to our stitching problem because we needed a general non-parametric method. We adapted one of the algorithms in the literature, by Hutchinson and deHoog, to our needs. The modified algorithm and the resulting code perform a cubic smoothing spline fit to multiple data channels with redundant time samples and missing data points. The data channels can have different, time-varying, zero-mean white noise characteristics. The method we employ automatically determines an optimal smoothing level by minimizing the Generalized Cross Validation (GCV) score. In order to automatically validate the smoothing level selection, the Weighted Sum-Squared Residual (WSSR) and zero-mean tests are performed on the residuals. Further, confidence intervals, both analytical and Monte Carlo, are also calculated. In this paper, we describe the derivation of our cubic smoothing spline algorithm. We outline the algorithm and test it with simulated and experimental data.
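The GCV criterion used above can be illustrated with a much simpler linear smoother than a cubic spline. The sketch below, with entirely hypothetical data, selects the half-width of a moving-average smoother by minimizing GCV(h) = n·RSS / (n − tr(S))²; the spline-specific machinery of the Hutchinson-deHoog algorithm is not reproduced here.

```python
import math
import random

def moving_average_smooth(y, h):
    """Smooth y with a symmetric moving average of half-width h.
    Returns (fitted values, trace of the linear smoother matrix S):
    each diagonal entry of S is 1/window_size for that point."""
    n = len(y)
    fit, trace = [], 0.0
    for i in range(n):
        lo, hi = max(0, i - h), min(n, i + h + 1)
        fit.append(sum(y[lo:hi]) / (hi - lo))
        trace += 1.0 / (hi - lo)
    return fit, trace

def gcv_score(y, h):
    """Generalized Cross Validation score: n * RSS / (n - tr(S))^2."""
    fit, trace = moving_average_smooth(y, h)
    rss = sum((a - b) ** 2 for a, b in zip(y, fit))
    n = len(y)
    return n * rss / (n - trace) ** 2

random.seed(0)
# Hypothetical noisy signal standing in for a stitched diagnostic trace.
y = [math.sin(0.1 * t) + random.gauss(0, 0.3) for t in range(200)]

# Pick the smoothing level that minimizes the GCV score.
best_h = min(range(1, 30), key=lambda h: gcv_score(y, h))
print("GCV-selected half-width:", best_h)
```

The same recipe carries over to splines: only the smoother matrix S changes, so RSS and tr(S) are computed from the spline fit instead of a windowed mean.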
Computational Identification of Genomic Features That Influence 3D Chromatin Domain Formation.
Mourad, Raphaël; Cuvier, Olivier
2016-05-01
Recent advances in long-range Hi-C contact mapping have revealed the importance of the 3D structure of chromosomes in gene expression. A current challenge is to identify the key molecular drivers of this 3D structure. Several genomic features, such as architectural proteins and functional elements, were shown to be enriched at topological domain borders using classical enrichment tests. Here we propose multiple logistic regression to identify those genomic features that positively or negatively influence domain border establishment or maintenance. The model is flexible, and can account for statistical interactions among multiple genomic features. Using both simulated and real data, we show that our model outperforms enrichment test and non-parametric models, such as random forests, for the identification of genomic features that influence domain borders. Using Drosophila Hi-C data at a very high resolution of 1 kb, our model suggests that, among architectural proteins, BEAF-32 and CP190 are the main positive drivers of 3D domain borders. In humans, our model identifies well-known architectural proteins CTCF and cohesin, as well as ZNF143 and Polycomb group proteins as positive drivers of domain borders. The model also reveals the existence of several negative drivers that counteract the presence of domain borders including P300, RXRA, BCL11A and ELK1.
Using R to Simulate Permutation Distributions for Some Elementary Experimental Designs
ERIC Educational Resources Information Center
Eudey, T. Lynn; Kerr, Joshua D.; Trumbo, Bruce E.
2010-01-01
Null distributions of permutation tests for two-sample, paired, and block designs are simulated using the R statistical programming language. For each design and type of data, permutation tests are compared with standard normal-theory and nonparametric tests. These examples (often using real data) provide for classroom discussion use of metrics…
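The permutation tests described above are simulated in R; the same idea can be sketched in Python for the two-sample case. With small samples the null distribution can be enumerated exactly rather than simulated (the group scores below are hypothetical, not from the article's examples):

```python
from itertools import combinations

def perm_test_two_sample(x, y):
    """Exact two-sample permutation test for a difference in means.
    Enumerates every relabeling of the pooled data (feasible for small
    samples) and returns the two-sided p-value."""
    pooled = x + y
    n, m = len(x), len(pooled)
    observed = abs(sum(x) / n - sum(y) / len(y))
    extreme = 0
    total = 0
    for idx in combinations(range(m), n):
        chosen = set(idx)
        gx = [pooled[i] for i in idx]
        gy = [pooled[i] for i in range(m) if i not in chosen]
        diff = abs(sum(gx) / n - sum(gy) / len(gy))
        if diff >= observed - 1e-12:  # tolerance for float round-off
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical scores for two small treatment groups.
treatment = [12.1, 14.3, 13.8, 15.0]
control = [10.2, 11.1, 12.5, 10.9]
p = perm_test_two_sample(treatment, control)
print(round(p, 4))  # 4 of the C(8,4)=70 relabelings are as extreme
```

For paired and block designs the relabeling scheme changes (sign flips within pairs, permutations within blocks), but the counting logic is the same.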
NASA Astrophysics Data System (ADS)
Yang, Yang; Peng, Zhike; Dong, Xingjian; Zhang, Wenming; Clifton, David A.
2018-03-01
A challenge in analysing non-stationary multi-component signals is to isolate nonlinearly time-varying signals, especially when they overlap in the time-frequency plane. In this paper, a framework integrating time-frequency analysis-based demodulation and a non-parametric Gaussian latent feature model is proposed to isolate and recover the components of such signals. The former aims to remove high-order frequency modulation (FM) so that the latter is able to infer the demodulated components while simultaneously discovering the number of target components. The proposed method is effective in isolating multiple components that have the same FM behavior. In addition, the results show that the proposed method is superior to the generalised-demodulation method based on singular-value decomposition, the parametric time-frequency analysis method based on filtering, and the empirical-mode-decomposition-based method in recovering the amplitude and phase of superimposed components.
Constrained multiple indicator kriging using sequential quadratic programming
NASA Astrophysics Data System (ADS)
Soltani-Mohammadi, Saeed; Erhan Tercan, A.
2012-11-01
Multiple indicator kriging (MIK) is a nonparametric method used to estimate conditional cumulative distribution functions (CCDF). Indicator estimates produced by MIK may not satisfy the order relations of a valid CCDF, which is ordered and bounded between 0 and 1. In this paper a new method is presented that guarantees the order relations of the cumulative distribution functions estimated by multiple indicator kriging. The method is based on minimizing the sum of kriging variances for each cutoff under unbiasedness and order-relations constraints, and solving the constrained indicator kriging system by sequential quadratic programming. A computer code written in the Matlab environment implements the developed algorithm, and the method is applied to thickness data.
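For contrast with the constrained-optimization approach above, the classical post-processing correction of order relations (clip to [0, 1], then average an upward running-maximum pass with a downward running-minimum pass) can be sketched as follows; the cutoff estimates are hypothetical:

```python
def correct_order_relations(ccdf):
    """Post-process CCDF estimates from indicator kriging so they are
    bounded in [0, 1] and non-decreasing across cutoffs. This is the
    classical averaged upward/downward correction, shown as a simple
    alternative to solving the constrained kriging system directly."""
    clipped = [min(1.0, max(0.0, p)) for p in ccdf]
    # Upward pass: running maximum forces monotonicity from the left.
    up, cur = [], 0.0
    for p in clipped:
        cur = max(cur, p)
        up.append(cur)
    # Downward pass: running minimum forces monotonicity from the right.
    down, cur = [], 1.0
    for p in reversed(clipped):
        cur = min(cur, p)
        down.append(cur)
    down.reverse()
    return [(a + b) / 2.0 for a, b in zip(up, down)]

# Hypothetical MIK output at five cutoffs, with order-relation violations.
raw = [0.10, 0.35, 0.30, -0.02, 0.80]
fixed = correct_order_relations(raw)
print(fixed)
assert all(fixed[i] <= fixed[i + 1] for i in range(len(fixed) - 1))
```

Unlike the paper's sequential-quadratic-programming formulation, this correction ignores the kriging variances; it simply projects the estimates onto the set of valid CCDFs.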
A Method for Assessing Change in Attitude: The McNemar Test.
ERIC Educational Resources Information Center
Ciechalski, Joseph C.; Pinkney, James W.; Weaver, Florence S.
This paper illustrates the use of the McNemar Test, using a hypothetical problem. The McNemar Test is a nonparametric statistical test that is a type of chi square test using dependent, rather than independent, samples to assess before-after designs in which each subject is used as his or her own control. Results of the McNemar test make it…
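A minimal sketch of the McNemar statistic described above, using hypothetical before/after counts rather than the paper's example. Only the discordant pairs enter the test; the continuity-corrected chi-square form with 1 df is used here:

```python
import math

def mcnemar(b, c):
    """McNemar test for paired before/after dichotomous responses.
    b = subjects who changed yes->no, c = subjects who changed no->yes
    (concordant pairs do not enter the statistic). Returns the
    continuity-corrected chi-square statistic and its approximate
    p-value from the chi-square distribution with 1 df."""
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # For 1 df: P(chi2 > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat/2)).
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# Hypothetical attitude-change counts: 15 shifted favorable->unfavorable,
# 35 shifted unfavorable->favorable.
stat, p = mcnemar(15, 35)
print(round(stat, 3), round(p, 4))  # → 7.22 0.0072
```

With small discordant counts an exact binomial version (testing b against Binomial(b + c, 0.5)) is preferred over the chi-square approximation.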
Borai, Anwar; Ichihara, Kiyoshi; Al Masaud, Abdulaziz; Tamimi, Waleed; Bahijri, Suhad; Armbuster, David; Bawazeer, Ali; Nawajha, Mustafa; Otaibi, Nawaf; Khalil, Haitham; Kawano, Reo; Kaddam, Ibrahim; Abdelaal, Mohamed
2016-05-01
This study is a part of the IFCC global study to derive reference intervals (RIs) for 28 chemistry analytes in Saudis. Healthy individuals (n=826) aged ≥18 years were recruited using the global study protocol. All specimens were measured using an Architect analyzer. RIs were derived by both parametric and non-parametric methods for comparative purposes. The need for secondary exclusion of reference values based on the latent abnormal values exclusion (LAVE) method was examined. The magnitude of variation attributable to gender, age and region was calculated by the standard deviation ratio (SDR). Sources of variation (age, BMI, physical exercise and smoking level) were investigated using multiple regression analysis. SDRs for gender, age and regional differences were significant for 14, 8 and 2 analytes, respectively. BMI-related changes in test results were noted conspicuously for CRP. For some metabolic parameters the RIs derived by the non-parametric method were wider than those derived by the parametric method, and RIs derived using the LAVE method differed significantly from those derived without it. RIs were derived with and without gender partition (BMI, drugs and supplements were considered). RIs applicable to Saudis were established for the majority of chemistry analytes, whereas gender, regional and age partitioning of RIs was required for some analytes. The elevated upper limits of the metabolic analytes reflect the high prevalence of metabolic syndrome in the Saudi population.
NASA Astrophysics Data System (ADS)
Dhitareka, P. H.; Firman, H.; Rusyati, L.
2018-05-01
This research compares a science virtual test and a paper-based test for measuring grade 7 students' critical thinking, based on multiple intelligences and gender. A quasi-experimental method with a within-subjects design was used to obtain the data. The population was all seventh-grade students in ten classes of one public secondary school in Bandung; 71 students in two randomly selected classes formed the sample. The data were obtained through 28 questions on the topic of living things and environmental sustainability, constructed according to the eight critical-thinking elements proposed by Inch, with the questions administered in both science virtual and paper-based formats. The data were analysed using the paired-samples t test when parametric assumptions held and the Wilcoxon signed-ranks test otherwise. In the overall comparison, the p-value for the difference between science virtual and paper-based test scores was 0.506, indicating no significant difference between the two formats. This result is further supported by the students' attitude score of 3.15 on a scale from 1 to 4, indicating positive attitudes towards the science virtual test.
Zhao, Ni; Chen, Jun; Carroll, Ian M.; Ringel-Kulka, Tamar; Epstein, Michael P.; Zhou, Hua; Zhou, Jin J.; Ringel, Yehuda; Li, Hongzhe; Wu, Michael C.
2015-01-01
High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Distance-based analysis is a popular strategy for evaluating the overall association between microbiome diversity and outcome, wherein the phylogenetic distance between individuals’ microbiome profiles is computed and tested for association via permutation. Despite their practical popularity, distance-based approaches suffer from important challenges, especially in selecting the best distance and extending the methods to alternative outcomes, such as survival outcomes. We propose the microbiome regression-based kernel association test (MiRKAT), which directly regresses the outcome on the microbiome profiles via the semi-parametric kernel machine regression framework. MiRKAT allows for easy covariate adjustment and extension to alternative outcomes while non-parametrically modeling the microbiome through a kernel that incorporates phylogenetic distance. It uses a variance-component score statistic to test for the association with analytical p value calculation. The model also allows simultaneous examination of multiple distances, alleviating the problem of choosing the best distance. Our simulations demonstrated that MiRKAT provides correctly controlled type I error and adequate power in detecting overall association. “Optimal” MiRKAT, which considers multiple candidate distances, is robust in that it suffers from little power loss in comparison to when the best distance is used and can achieve tremendous power gain in comparison to when a poor distance is chosen. Finally, we applied MiRKAT to real microbiome datasets to show that microbial communities are associated with smoking and with fecal protease levels after confounders are controlled for. PMID:25957468
Basagni, Benedetta; Luzzatti, Claudio; Navarrete, Eduardo; Caputo, Marina; Scrocco, Gessica; Damora, Alessio; Giunchi, Laura; Gemignani, Paola; Caiazzo, Annarita; Gambini, Maria Grazia; Avesani, Renato; Mancuso, Mauro; Trojano, Luigi; De Tanti, Antonio
2017-04-01
Verbal reasoning is a complex, multicomponent function, which involves activation of functional processes and neural circuits distributed in both brain hemispheres. Thus, this ability is often impaired after brain injury. The aim of the present study is to describe the construction of a new verbal reasoning test (VRT) for patients with brain injury and to provide normative values in a sample of healthy Italian participants. Three hundred and eighty healthy Italian subjects (193 women and 187 men) of different ages (range 16-75 years) and educational level (primary school to postgraduate degree) underwent the VRT. VRT is composed of seven subtests, investigating seven different domains. Multiple linear regression analysis revealed a significant effect of age and education on the participants' performance in terms of both VRT total score and all seven subtest scores. No gender effect was found. A correction grid for raw scores was built from the linear equation derived from the scores. Inferential cut-off scores were estimated using a non-parametric technique, and equivalent scores were computed. We also provided a grid for the correction of results by z scores.
Long-range dismount activity classification: LODAC
NASA Astrophysics Data System (ADS)
Garagic, Denis; Peskoe, Jacob; Liu, Fang; Cuevas, Manuel; Freeman, Andrew M.; Rhodes, Bradley J.
2014-06-01
Continuous classification of dismount types (including gender, age, ethnicity) and their activities (such as walking, running) evolving over space and time is challenging. Limited sensor resolution (often exacerbated as a function of platform standoff distance) and clutter from shadows in dense target environments, unfavorable environmental conditions, and the normal properties of real data all contribute to the challenge. The unique and innovative aspect of our approach is a synthesis of multimodal signal processing with incremental non-parametric, hierarchical Bayesian machine learning methods to create a new kind of target classification architecture. This architecture is designed from the ground up to optimally exploit correlations among the multiple sensing modalities (multimodal data fusion) and rapidly and continuously learns (online self-tuning) patterns of distinct classes of dismounts given little a priori information. This increases classification performance in the presence of challenges posed by anti-access/area denial (A2/AD) sensing. To fuse multimodal features, Long-range Dismount Activity Classification (LODAC) develops a novel statistical information theoretic approach for multimodal data fusion that jointly models multimodal data (i.e., a probabilistic model for cross-modal signal generation) and discovers the critical cross-modal correlations by identifying components (features) with maximal mutual information (MI) which is efficiently estimated using non-parametric entropy models. LODAC develops a generic probabilistic pattern learning and classification framework based on a new class of hierarchical Bayesian learning algorithms for efficiently discovering recurring patterns (classes of dismounts) in multiple simultaneous time series (sensor modalities) at multiple levels of feature granularity.
Lu, Tao
2016-01-01
Gene regulatory networks (GRNs) capture the interactions between genes, and models of these networks describe gene expression behavior. Such models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRNs. In this article, we propose a nonparametric ordinary differential equation (ODE) model for nonlinear dynamic GRNs. Specifically, we address the following questions simultaneously: (i) extract information from noisy time-course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of the experimental duration. We illustrate the usefulness of the model and the associated statistical methods through a simulation and a real-data example.
Harlander, Niklas; Rosenkranz, Tobias; Hohmann, Volker
2012-08-01
Single-channel noise reduction has been well investigated and seems to have reached its limits in terms of speech intelligibility improvement; however, the quality of such schemes can still be advanced. This study tests to what extent novel model-based processing schemes might improve performance, in particular for non-stationary noise conditions. Two prototype model-based algorithms, a speech-model-based and an auditory-model-based algorithm, were compared to a state-of-the-art non-parametric minimum statistics algorithm. A speech intelligibility test, preference rating, and listening effort scaling were performed. Additionally, three objective quality measures for the signal, background, and overall distortions were applied. For a better comparison of all algorithms, particular attention was given to the usage of the same Wiener-based gain rule. The perceptual investigation was performed with fourteen hearing-impaired subjects. The results revealed that the non-parametric algorithm and the auditory-model-based algorithm did not affect speech intelligibility, whereas the speech-model-based algorithm slightly decreased intelligibility. In terms of subjective quality, both model-based algorithms performed better than the unprocessed condition and the reference, in particular for highly non-stationary noise environments. The data support the hypothesis that model-based algorithms are promising for improving performance in non-stationary noise conditions.
Out-of-Sample Extensions for Non-Parametric Kernel Methods.
Pan, Binbin; Chen, Wen-Sheng; Chen, Bo; Xu, Chen; Lai, Jianhuang
2017-02-01
Choosing suitable kernels plays an important role in the performance of kernel methods. Recently, a number of studies were devoted to developing nonparametric kernels. Without assuming any parametric form of the target kernel, nonparametric kernel learning offers a flexible scheme to utilize the information of the data, which may potentially characterize the data similarity better. The kernel methods using nonparametric kernels are referred to as nonparametric kernel methods. However, many nonparametric kernel methods are restricted to transductive learning, where the prediction function is defined only over the data points given beforehand. They have no straightforward extension for the out-of-sample data points, and thus cannot be applied to inductive learning. In this paper, we show how to make the nonparametric kernel methods applicable to inductive learning. The key problem of out-of-sample extension is how to extend the nonparametric kernel matrix to the corresponding kernel function. A regression approach in the hyper reproducing kernel Hilbert space is proposed to solve this problem. Empirical results indicate that the out-of-sample performance is comparable to the in-sample performance in most cases. Experiments on face recognition demonstrate the superiority of our nonparametric kernel method over the state-of-the-art parametric kernel methods.
ERIC Educational Resources Information Center
Stephens, Torrance; Braithwaite, Harold; Johnson, Larry; Harris, Catrell; Katkowsky, Steven; Troutman, Adewale
2008-01-01
Objective: To examine impact of CVD risk reduction intervention for African-American men in the Atlanta Empowerment Zone (AEZ) designed to target anger management. Design: Wilcoxon Signed-Rank Test was employed as a non-parametric alternative to the t-test for independent samples. This test was employed because the data used in this analysis…
Can Percentiles Replace Raw Scores in the Statistical Analysis of Test Data?
ERIC Educational Resources Information Center
Zimmerman, Donald W.; Zumbo, Bruno D.
2005-01-01
Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…
A Comparison of Bias Correction Adjustments for the DETECT Procedure
ERIC Educational Resources Information Center
Nandakumar, Ratna; Yu, Feng; Zhang, Yanwei
2011-01-01
DETECT is a nonparametric methodology to identify the dimensional structure underlying test data. The associated DETECT index, D_max, denotes the degree of multidimensionality in data. Conditional covariances (CCOV) are the building blocks of this index. In specifying population CCOVs, the latent test composite θ_TT…
Rio, Daniel E.; Rawlings, Robert R.; Woltz, Lawrence A.; Gilman, Jodi; Hommer, Daniel W.
2013-01-01
A linear time-invariant model based on statistical time series analysis in the Fourier domain for single subjects is further developed and applied to functional MRI (fMRI) blood-oxygen level-dependent (BOLD) multivariate data. This methodology was originally developed to analyze multiple stimulus input evoked response BOLD data. However, to analyze clinical data generated using a repeated measures experimental design, the model has been extended to handle multivariate time series data and demonstrated on control and alcoholic subjects taken from data previously analyzed in the temporal domain. Analysis of BOLD data is typically carried out in the time domain where the data has a high temporal correlation. These analyses generally employ parametric models of the hemodynamic response function (HRF) where prewhitening of the data is attempted using autoregressive (AR) models for the noise. However, this data can be analyzed in the Fourier domain. Here, assumptions made on the noise structure are less restrictive, and hypothesis tests can be constructed based on voxel-specific nonparametric estimates of the hemodynamic transfer function (HRF in the Fourier domain). This is especially important for experimental designs involving multiple states (either stimulus or drug induced) that may alter the form of the response function. PMID:23840281
Spinal cord atrophy in anterior-posterior direction reflects impairment in multiple sclerosis.
Lundell, H; Svolgaard, O; Dogonowski, A-M; Romme Christensen, J; Selleberg, F; Soelberg Sørensen, P; Blinkenberg, M; Siebner, H R; Garde, E
2017-10-01
To investigate how atrophy is distributed over the cross section of the upper cervical spinal cord and how this relates to functional impairment in multiple sclerosis (MS). We analysed the structural brain MRI scans of 54 patients with relapsing-remitting MS (n=22), primary progressive MS (n=9), secondary progressive MS (n=23) and 23 age- and sex-matched healthy controls. We measured the cross-sectional area (CSA), left-right width (LRW) and anterior-posterior width (APW) of the spinal cord at the segmental level C2. We tested for a nonparametric linear relationship between these atrophy measures and clinical impairments as reflected by the Expanded Disability Status Scale (EDSS) and Multiple Sclerosis Impairment Scale (MSIS). In patients with MS, CSA and APW but not LRW were reduced compared to healthy controls (P<.02) and showed significant correlations with EDSS, MSIS and specific MSIS subscores. In patients with MS, atrophy of the upper cervical cord is most evident in the anterior-posterior direction. As APW of the cervical cord can be readily derived from standard structural MRI of the brain, APW constitutes a clinically useful neuroimaging marker of disease-related neurodegeneration in MS. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Sarkar, Rajarshi
2013-07-01
The validity of renal function tests as a diagnostic tool depends substantially on the Biological Reference Interval (BRI) of urea. Establishment of the BRI of urea is difficult, partly because exclusion criteria for selection of reference data are quite rigid and partly due to compartmentalization considerations regarding the age and sex of the reference individuals. Moreover, construction of the Biological Reference Curve (BRC) of urea is imperative to highlight the partitioning requirements. This a priori study examines data collected by measuring serum urea in 3202 age- and sex-matched individuals, aged between 1 and 80 years, by a kinetic UV urease/GLDH method on a Roche Cobas 6000 auto-analyzer. The Mann-Whitney U test of the reference data confirmed the partitioning requirement by both age and sex. Further statistical analysis revealed the incompatibility of the data with a proposed parametric model; hence the data were analysed non-parametrically. The BRI was found to be identical for both sexes until the 2nd decade, and the BRI for males increased progressively from the 6th decade onwards. Four non-parametric models were postulated for construction of the BRC: Gaussian kernel, double kernel, local mean and local constant, of which the last generated the best-fitting curves. Clinical decision making should become easier, and the diagnostic implications of renal function tests more meaningful, if this BRI is followed and the BRC is used as a desktop tool in conjunction with similar data for serum creatinine.
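The rank-based percentile approach that underlies such a nonparametric reference interval can be sketched in a few lines of Python. This illustrates only the general technique (the central 95% of reference values, bounded by the 2.5th and 97.5th percentiles), not the paper's models; the urea values below are invented.

```python
# Sketch: nonparametric Biological Reference Interval (BRI) as the central
# 95% of reference values, i.e. the 2.5th and 97.5th percentiles computed
# directly from ranked data with no distributional assumption.
# Illustrative values only, not the study's data.

def percentile(sorted_vals, p):
    """Linear-interpolation percentile (0 <= p <= 100) on pre-sorted data."""
    n = len(sorted_vals)
    rank = p / 100 * (n - 1)
    lo = int(rank)
    hi = min(lo + 1, n - 1)
    frac = rank - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def reference_interval(values, low=2.5, high=97.5):
    s = sorted(values)
    return percentile(s, low), percentile(s, high)

urea = [18, 22, 25, 27, 30, 31, 33, 35, 38, 40, 42, 45]  # mg/dL, invented
lo, hi = reference_interval(urea)
```

In practice a much larger reference sample is needed before the percentile estimates stabilize, which is one reason studies of this kind recruit thousands of individuals.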
Generalized Correlation Coefficient for Non-Parametric Analysis of Microarray Time-Course Data.
Tan, Qihua; Thomassen, Mads; Burton, Mark; Mose, Kristian Fredløv; Andersen, Klaus Ejner; Hjelmborg, Jacob; Kruse, Torben
2017-06-06
Modeling complex time-course patterns is a challenging issue in microarray studies due to the complex gene expression patterns elicited by time-course experiments. We introduce the generalized correlation coefficient and propose a combinatory approach for detecting, testing and clustering heterogeneous time-course gene expression patterns. Application of the method identified nonlinear time-course patterns in high agreement with parametric analysis. We conclude that the non-parametric nature of the generalized correlation analysis makes it a useful and efficient tool for analyzing microarray time-course data and for exploring complex relationships in omics data to study their association with disease and health.
NASA Astrophysics Data System (ADS)
Kovalenko, I. D.; Doressoundiram, A.; Lellouch, E.; Vilenius, E.; Müller, T.; Stansberry, J.
2017-11-01
Context. Gravitationally bound multiple systems provide an opportunity to estimate the mean bulk density of the objects, whereas this characteristic is not available for single objects. Being a primitive population of the outer solar system, binary and multiple trans-Neptunian objects (TNOs) provide unique information about bulk density and internal structure, improving our understanding of their formation and evolution. Aims: The goal of this work is to analyse parameters of multiple trans-Neptunian systems, observed with Herschel and Spitzer space telescopes. Particularly, statistical analysis is done for radiometric size and geometric albedo, obtained from photometric observations, and for estimated bulk density. Methods: We use Monte Carlo simulation to estimate the real size distribution of TNOs. For this purpose, we expand the dataset of diameters by adopting the Minor Planet Center database list with available values of the absolute magnitude therein, and the albedo distribution derived from Herschel radiometric measurements. We use the 2-sample Anderson-Darling non-parametric statistical method for testing whether two samples of diameters, for binary and single TNOs, come from the same distribution. Additionally, we use the Spearman's coefficient as a measure of rank correlations between parameters. Uncertainties of estimated parameters together with lack of data are taken into account. Conclusions about correlations between parameters are based on statistical hypothesis testing. Results: We have found that the difference in size distributions of multiple and single TNOs is biased by small objects. The test on correlations between parameters shows that the effective diameter of binary TNOs strongly correlates with heliocentric orbital inclination and with magnitude difference between components of binary system. The correlation between diameter and magnitude difference implies that small and large binaries are formed by different mechanisms. 
Furthermore, the statistical test indicates, although not significant with the sample size, that a moderately strong correlation exists between diameter and bulk density. Herschel is an ESA space observatory with science instruments provided by European-led Principal Investigator consortia and with important participation from NASA.
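Spearman's rank coefficient used above for the correlation tests has a compact closed form when there are no ties. A minimal Python sketch with invented values (not the TNO measurements):

```python
# Spearman's rho via the classic no-ties formula:
#   rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
# where d is the per-item difference between the ranks of x and y.
# Tie handling (midranks) is omitted for brevity.

def spearman_rho(x, y):
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0] * len(v)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

diameter = [120, 250, 90, 400, 310]          # km, invented
inclination = [5.0, 12.0, 3.5, 25.0, 18.0]   # degrees, invented
rho = spearman_rho(diameter, inclination)
```

Because the coefficient depends only on ranks, it is robust to the large measurement uncertainties typical of radiometric sizes.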
Nonparametric tests for equality of psychometric functions.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2017-12-07
Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.
NASA Astrophysics Data System (ADS)
Constantinescu, C. C.; Yoder, K. K.; Kareken, D. A.; Bouman, C. A.; O'Connor, S. J.; Normandin, M. D.; Morris, E. D.
2008-03-01
We previously developed a model-independent technique (non-parametric ntPET) for extracting the transient changes in neurotransmitter concentration from paired (rest & activation) PET studies with a receptor ligand. To provide support for our method, we introduced three hypotheses of validation based on work by Endres and Carson (1998 J. Cereb. Blood Flow Metab. 18 1196-210) and Yoder et al (2004 J. Nucl. Med. 45 903-11), and tested them on experimental data. All three hypotheses describe relationships between the estimated free (synaptic) dopamine curves (FDA(t)) and the change in binding potential (ΔBP). The veracity of the FDA(t) curves recovered by nonparametric ntPET is supported when the data adhere to the following hypothesized behaviors: (1) ΔBP should decline with increasing DA peak time, (2) ΔBP should increase as the strength of the temporal correlation between FDA(t) and the free raclopride (FRAC(t)) curve increases, (3) ΔBP should decline linearly with the effective weighted availability of the receptor sites. We analyzed regional brain data from 8 healthy subjects who received two [11C]raclopride scans: one at rest, and one during which unanticipated IV alcohol was administered to stimulate dopamine release. For several striatal regions, nonparametric ntPET was applied to recover FDA(t), and binding potential values were determined. Kendall rank-correlation analysis confirmed that the FDA(t) data followed the expected trends for all three validation hypotheses. Our findings lend credence to our model-independent estimates of FDA(t). Application of nonparametric ntPET may yield important insights into how alterations in timing of dopaminergic neurotransmission are involved in the pathologies of addiction and other psychiatric disorders.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aghamousa, Amir; Shafieloo, Arman; Arjunwadkar, Mihir
2015-02-01
Estimation of the angular power spectrum is one of the important steps in Cosmic Microwave Background (CMB) data analysis. Here, we present a nonparametric estimate of the temperature angular power spectrum for the Planck 2013 CMB data. The method implemented in this work is model-independent, and allows the data, rather than the model, to dictate the fit. Since one of the main targets of our analysis is to test the consistency of the ΛCDM model with Planck 2013 data, we use the nuisance parameters associated with the best-fit ΛCDM angular power spectrum to remove foreground contributions from the data at multipoles ℓ ≥ 50. We thus obtain a combined angular power spectrum data set together with the full covariance matrix, appropriately weighted over frequency channels. Our subsequent nonparametric analysis resolves six peaks (and five dips) up to ℓ ∼ 1850 in the temperature angular power spectrum. We present uncertainties in the peak/dip locations and heights at the 95% confidence level. We further show how these reflect the harmonicity of acoustic peaks, and can be used for acoustic scale estimation. Based on this nonparametric formalism, we found the best-fit ΛCDM model to be at 36% confidence distance from the center of the nonparametric confidence set; this is considerably larger than the confidence distance (9%) derived earlier from a similar analysis of the WMAP 7-year data. Another interesting result of our analysis is that at low multipoles, the Planck data do not suggest any upturn, contrary to the expectation based on the integrated Sachs-Wolfe contribution in the best-fit ΛCDM cosmology.
Eisinga, Rob; Heskes, Tom; Pelzer, Ben; Te Grotenhuis, Manfred
2017-01-25
The Friedman rank sum test is a widely used nonparametric method in computational biology. In addition to examining the overall null hypothesis of no significant difference among any of the rank sums, it is typically of interest to conduct pairwise comparison tests. Current approaches to such tests rely on large-sample approximations, due to the numerical complexity of computing the exact distribution. These approximate methods lead to inaccurate estimates in the tail of the distribution, which is most relevant for p-value calculation. We propose an efficient, combinatorial exact approach for calculating the probability mass distribution of the rank sum difference statistic for pairwise comparison of Friedman rank sums, and compare exact results with recommended asymptotic approximations. Whereas the chi-squared approximation performs inferiorly to exact computation overall, others, particularly the normal, perform well, except for the extreme tail. Hence exact calculation offers an improvement when small p-values occur following multiple testing correction. Exact inference also enhances the identification of significant differences whenever the observed values are close to the approximate critical value. We illustrate the proposed method in the context of biological machine learning, where Friedman rank sum difference tests are commonly used for the comparison of classifiers over multiple datasets. We provide a computationally fast method to determine the exact p-value of the absolute rank sum difference of a pair of Friedman rank sums, making asymptotic tests obsolete. Calculation of exact p-values is easy to implement in statistical software and the implementation in R is provided in one of the Additional files and is also available at http://www.ru.nl/publish/pages/726696/friedmanrsd.zip .
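The large-sample normal approximation that such exact computation replaces can be sketched compactly. Under the null hypothesis the difference of two Friedman rank sums over n blocks and k treatments is approximately normal with variance n·k·(k+1)/6 (a standard result); the data below are invented and ties are ignored for brevity.

```python
# Asymptotic pairwise comparison of Friedman rank sums via the normal
# approximation. Not the paper's exact combinatorial method; a baseline sketch.
import math

def friedman_rank_sums(data):
    """Rank sums per treatment for data[block][treatment] (no-tie ranking)."""
    k = len(data[0])
    sums = [0.0] * k
    for block in data:
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):
            sums[j] += rank
    return sums

def pairwise_p(data, i, j):
    """Two-sided normal-approximation p-value for |R_i - R_j|."""
    n, k = len(data), len(data[0])
    r = friedman_rank_sums(data)
    z = abs(r[i] - r[j]) / math.sqrt(n * k * (k + 1) / 6)
    return math.erfc(z / math.sqrt(2))

data = [[1.2, 2.3, 3.1], [0.9, 2.0, 2.8], [1.5, 2.6, 3.3], [1.1, 1.9, 3.0]]
p = pairwise_p(data, 0, 2)
```

The abstract's point is that this normal tail value can be noticeably off for small p-values, which is precisely where multiple-testing corrections operate.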
Karakaya, Jale; Karabulut, Erdem; Yucel, Recai M.
2015-01-01
Modern statistical methods for incomplete data have been increasingly applied to a wide variety of substantive problems. Similarly, receiver operating characteristic (ROC) analysis, a method used to evaluate diagnostic tests or biomarkers in medical research, has seen increasingly popular development and application. While missing-data methods have been applied in ROC analysis, the impact of model mis-specification and/or assumptions (e.g. missing at random) underlying the missing data has not been thoroughly studied. In this work, we study the performance of multiple imputation (MI) inference in ROC analysis. Particularly, we investigate parametric and non-parametric techniques for MI inference under common missingness mechanisms. Depending on the coherency of the imputation model with the underlying data generation mechanism, our results show that MI generally leads to well-calibrated inferences under ignorable missingness mechanisms. PMID:26379316
Plante, David T.; Landsness, Eric C.; Peterson, Michael J.; Goldstein, Michael R.; Wanger, Tim; Guokas, Jeff J.; Tononi, Giulio; Benca, Ruth M.
2012-01-01
Hypersomnolence in major depressive disorder (MDD) plays an important role in the natural history of the disorder, but the basis of hypersomnia in MDD is poorly understood. Slow wave activity (SWA) has been associated with sleep homeostasis, as well as sleep restoration and maintenance, and may be altered in MDD. Therefore, we conducted a post-hoc study that utilized high density electroencephalography (hdEEG) to test the hypothesis that MDD subjects with hypersomnia (HYS+) would have decreased SWA relative to age and sex-matched MDD subjects without hypersomnia (HYS−) and healthy controls (n=7 for each group). After correcting for multiple comparisons using statistical non-parametric mapping, HYS+ subjects demonstrated significantly reduced parieto-occipital all-night SWA relative to HYS− subjects. Our results suggest hypersomnolence may be associated with topographic reductions in SWA in MDD. Further research using adequately powered prospective design is indicated to confirm these findings. PMID:22512951
Teaching Communication Skills to Medical and Pharmacy Students Through a Blended Learning Course.
Hess, Rick; Hagemeier, Nicholas E; Blackwelder, Reid; Rose, Daniel; Ansari, Nasar; Branham, Tandy
2016-05-25
Objective. To evaluate the impact of an interprofessional blended learning course on medical and pharmacy students' patient-centered interpersonal communication skills and to compare precourse and postcourse communication skills across first-year medical and second-year pharmacy student cohorts. Methods. Students completed ten 1-hour online modules and participated in five 3-hour group sessions over one semester. Objective structured clinical examinations (OSCEs) were administered before and after the course and were evaluated using the validated Common Ground Instrument. Nonparametric statistical tests were used to examine pre/postcourse domain scores within and across professions. Results. Performance in all communication skill domains increased significantly for all students. No additional significant pre/postcourse differences were noted across disciplines. Conclusion. Students' patient-centered interpersonal communication skills improved across multiple domains using a blended learning educational platform. Interview abilities were demonstrated similarly by medical and pharmacy students postcourse, suggesting both groups respond well to this form of instruction.
FORTRAN implementation of Friedman's test for several related samples
NASA Technical Reports Server (NTRS)
Davidson, S. A.
1982-01-01
The FRIEDMAN program is a FORTRAN-coded implementation of Friedman's nonparametric test for several related samples with one observation per treatment/block combination, or, as it is sometimes called, the two-way analysis of variance by ranks. The FRIEDMAN program is described, and a test data set and its results are presented to aid potential users of the program.
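For readers without access to FORTRAN, the statistic such a program computes is easy to reproduce in any language. A Python sketch of the Friedman chi-square statistic, without tie correction and with invented data:

```python
def friedman_statistic(data):
    """Friedman chi-square for data[block][treatment]; compare the result to
    a chi-square distribution with k-1 degrees of freedom. No tie correction."""
    n, k = len(data), len(data[0])
    rank_sums = [0] * k
    for block in data:
        order = sorted(range(k), key=lambda j: block[j])
        for rank, j in enumerate(order, start=1):  # rank 1 = smallest in block
            rank_sums[j] += rank
    return (12.0 / (n * k * (k + 1))) * sum(r * r for r in rank_sums) \
        - 3 * n * (k + 1)

# four blocks, three treatments, treatment 3 always ranked highest (invented)
scores = [[1, 2, 3], [2, 3, 4], [1, 3, 5], [2, 4, 6]]
chi2 = friedman_statistic(scores)
```

With perfectly consistent ordering across blocks the statistic attains its maximum, n(k-1), here 4 × 2 = 8.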
Xu, Maoqi; Chen, Liang
2018-01-01
The individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice the analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance the transcriptomics studies of cancers and other complex diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
A New Procedure for Detection of Students' Rapid Guessing Responses Using Response Time
ERIC Educational Resources Information Center
Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu
2016-01-01
Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…
Five-Point Likert Items: t Test versus Mann-Whitney-Wilcoxon
ERIC Educational Resources Information Center
de Winter, Joost C. F.; Dodou, Dimitra
2010-01-01
Likert questionnaires are widely used in survey research, but it is unclear whether the item data should be investigated by means of parametric or nonparametric procedures. This study compared the Type I and II error rates of the "t" test versus the Mann-Whitney-Wilcoxon (MWW) for five-point Likert items. Fourteen population…
The Impact of Conditional Scores on the Performance of DETECT.
ERIC Educational Resources Information Center
Zhang, Yanwei Oliver; Yu, Feng; Nandakumar, Ratna
DETECT is a nonparametric, conditional covariance-based procedure to identify dimensional structure and the degree of multidimensionality of test data. The ability composite or conditional score used to estimate conditional covariance plays a significant role in the performance of DETECT. The number correct score of all items in the test (T) and…
A Review of DIMPACK Version 1.0: Conditional Covariance-Based Test Dimensionality Analysis Package
ERIC Educational Resources Information Center
Deng, Nina; Han, Kyung T.; Hambleton, Ronald K.
2013-01-01
DIMPACK Version 1.0 for assessing test dimensionality based on a nonparametric conditional covariance approach is reviewed. This software was originally distributed by Assessment Systems Corporation and now can be freely accessed online. The software consists of Windows-based interfaces of three components: DIMTEST, DETECT, and CCPROX/HAC, which…
Conditional Covariance-Based Subtest Selection for DIMTEST
ERIC Educational Resources Information Center
Froelich, Amy G.; Habing, Brian
2008-01-01
DIMTEST is a nonparametric hypothesis-testing procedure designed to test the assumptions of a unidimensional and locally independent item response theory model. Several previous Monte Carlo studies have found that using linear factor analysis to select the assessment subtest for DIMTEST results in a moderate to severe loss of power when the exam…
Temporal changes and variability in temperature series over Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Suhaila, Jamaludin
2015-02-01
With the current concern over climate change, descriptions of how temperature series change over time are very useful. Annual mean temperature has been analyzed for several stations over Peninsular Malaysia. Non-parametric statistical techniques such as the Mann-Kendall test and Theil-Sen slope estimation are used primarily for assessing the significance and detection of trends, while the nonparametric Pettitt's test and the sequential Mann-Kendall test are adopted to detect any abrupt climate change. Statistically significant increasing trends in annual mean temperature are detected for almost all studied stations, with the magnitude of significant trends varying from 0.02°C to 0.05°C per year. The results show that the climate over Peninsular Malaysia is getting warmer. In addition, the results of the abrupt-change analyses using Pettitt's test and the sequential Mann-Kendall test reveal onsets of trends that can be related to El Niño episodes affecting Malaysia. In general, the analysis results can help local stakeholders and water managers to understand the risks and vulnerabilities related to climate change in terms of mean events in the region.
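The two trend tools named above are short enough to sketch directly: the Mann-Kendall S statistic sums the signs of all pairwise later-minus-earlier differences, and the Theil-Sen estimator takes the median of all pairwise slopes. The annual series below is invented and assumes equally spaced observations.

```python
def mann_kendall_s(x):
    """Mann-Kendall S: sum of signs of all pairwise later-minus-earlier
    differences. S >> 0 indicates an increasing trend."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = x[j] - x[i]
            s += (d > 0) - (d < 0)
    return s

def theil_sen_slope(x):
    """Median of all pairwise slopes, for x indexed by equally spaced time."""
    slopes = sorted((x[j] - x[i]) / (j - i)
                    for i in range(len(x) - 1) for j in range(i + 1, len(x)))
    m = len(slopes)
    mid = m // 2
    return slopes[mid] if m % 2 else (slopes[mid - 1] + slopes[mid]) / 2

temps = [26.1, 26.2, 26.2, 26.4, 26.5, 26.5, 26.7]  # °C, annual means, invented
```

The significance of S is usually judged against its null variance (with tie correction), which is the part that full Mann-Kendall implementations add to this core.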
Zornoza-Moreno, Matilde; Fuentes-Hernández, Silvia; Sánchez-Solis, Manuel; Rol, María Ángeles; Larqué, Elvira; Madrid, Juan Antonio
2011-05-01
The authors developed a method useful for home measurement of temperature, activity, and sleep rhythms in infants under normal-living conditions during their first 6 mos of life. In addition, parametric and nonparametric tests for assessing circadian system maturation in these infants were compared. Anthropometric parameters plus ankle skin temperature and activity were evaluated in 10 infants by means of two data loggers, Thermochron iButton (DS1291H, Maxim Integrated Products, Sunnyvale, CA) for temperature and HOBO Pendant G (Hobo Pendant G Acceleration, UA-004-64, Onset Computer Corporation, Bourne, MA) for motor activity, located in special baby socks specifically designed for the study. Skin temperature and motor activity were recorded over 3 consecutive days at 15 days, 1, 3, and 6 mos of age. Circadian rhythms of skin temperature and motor activity appeared at 3 mos in most babies. Mean skin temperature decreased significantly by 3 mos of life relative to previous measurements (p = .0001), whereas mean activity continued to increase during the first 6 mos. For most of the parameters analyzed, statistically significant changes occurred at 3-6 mos relative to 0.5-1 mo of age. Major differences were found using nonparametric tests. Intradaily variability in motor activity decreased significantly at 6 mos of age relative to previous measurements, and followed a similar trend for temperature; interdaily stability increased significantly at 6 mos of age relative to previous measurements for both variables; relative amplitude increased significantly at 6 mos for temperature and at 3 mos for activity, both with respect to previous measurements. A high degree of correlation was found between chronobiological parametric and nonparametric tests for mean and mesor and also for relative amplitude versus the cosinor-derived amplitude.
However, the correlation between parametric and nonparametric equivalent indices (acrophase and midpoint of M5, interdaily stability and Rayleigh test, or intradaily variability and P1/Pultradian), despite being significant, was lower for both temperature and activity. The circadian function index (CFI), based on the integrated variable temperature-activity, increased gradually with age and was statistically significant at 6 mos of age. At 6 mos, 90% of the infants' rest period coincided with the standard sleep period of their parents, defined from 23:00 to 07:00 h (dichotomic index I < O; when I < O = 100%, there is a complete coincidence between infant nocturnal rest period and the standard rest period), whereas at 15 days of life the coincidence was only 75%. The combination of thermometry and actimetry using data loggers placed in infants' socks is a reliable method for assessing both variables and also sleep rhythms in infants under ambulatory conditions, with minimal disturbance. Using this methodological approach, circadian rhythms of skin temperature and motor activity appeared by 3 mos in most babies. Nonparametric tests provided more reliable information than cosinor analysis for circadian rhythm assessment in infants.
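The nonparametric indices discussed here, interdaily stability (IS) and intradaily variability (IV), have simple standard formulas (Witting-style). The Python sketch below computes both from an hourly series spanning whole days, using a synthetic two-day sinusoid rather than infant recordings.

```python
# IS = [N * sum_h (mean_h - mean)^2] / [p * sum_i (x_i - mean)^2], 0..1,
#   higher = more day-to-day regularity (mean_h = mean over days at hour h).
# IV = [N * sum_i (x_i - x_{i-1})^2] / [(N-1) * sum_i (x_i - mean)^2],
#   higher = more fragmented rhythm. Standard nonparametric circadian indices.
import math

def is_iv(x, period=24):
    n = len(x)
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)
    days = n // period
    hourly = [sum(x[h::period]) / days for h in range(period)]  # per-hour means
    is_num = n * sum((h - mean) ** 2 for h in hourly)
    iv_num = n * sum((x[i] - x[i - 1]) ** 2 for i in range(1, n))
    return is_num / (period * ss), iv_num / ((n - 1) * ss)

# two identical sinusoid-like days -> a perfectly stable rhythm
day = [math.sin(2 * math.pi * h / 24) for h in range(24)]
series = day * 2
stability, fragmentation = is_iv(series)
```

With identical days the hourly means reproduce the series exactly, so IS reaches its maximum of 1; real recordings fall well below that.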
Statistical analysis of particle trajectories in living cells
NASA Astrophysics Data System (ADS)
Briane, Vincent; Kervrann, Charles; Vimond, Myriam
2018-06-01
Recent advances in molecular biology and fluorescence microscopy imaging have made possible the inference of the dynamics of molecules in living cells. Such inference allows us to understand and determine the organization and function of the cell. The trajectories of particles (e.g., biomolecules) in living cells, computed with the help of object tracking methods, can be modeled with diffusion processes. Three types of diffusion are considered: (i) free diffusion, (ii) subdiffusion, and (iii) superdiffusion. The mean-square displacement (MSD) is generally used to discriminate the three types of particle dynamics. We propose here a nonparametric three-decision test as an alternative to the MSD method. The rejection of the null hypothesis, i.e., free diffusion, is accompanied by claims of the direction of the alternative (subdiffusion or superdiffusion). We study the asymptotic behavior of the test statistic under the null hypothesis and under parametric alternatives which are currently considered in the biophysics literature. In addition, we adapt the multiple-testing procedure of Benjamini and Hochberg to fit with the three-decision-test setting, in order to apply the test procedure to a collection of independent trajectories. The performance of our procedure is much better than the MSD method as confirmed by Monte Carlo experiments. The method is demonstrated on real data sets corresponding to protein dynamics observed in fluorescence microscopy.
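The MSD baseline that the authors' three-decision test is compared against is commonly summarized by a single exponent: the time-averaged MSD is fit as MSD(τ) ∝ τ^α, with α ≈ 1 indicating free diffusion, α < 1 subdiffusion, and α > 1 superdiffusion. A minimal numpy sketch of that baseline (function names are illustrative, not from the paper):

```python
import numpy as np

def msd(traj, max_lag):
    """Time-averaged mean-square displacement of a (T, d) trajectory."""
    return np.array([np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
                     for lag in range(1, max_lag + 1)])

def anomalous_exponent(traj, max_lag=10):
    """Log-log slope of MSD vs lag: ~1 free, <1 sub-, >1 superdiffusion."""
    lags = np.arange(1, max_lag + 1)
    alpha, _ = np.polyfit(np.log(lags), np.log(msd(traj, max_lag)), 1)
    return alpha
```

For a purely directed (ballistic) trajectory the displacement grows linearly with lag, so the fitted exponent is 2, i.e., superdiffusive.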
Modeling seasonal variation of hip fracture in Montreal, Canada.
Modarres, Reza; Ouarda, Taha B M J; Vanasse, Alain; Orzanco, Maria Gabriela; Gosselin, Pierre
2012-04-01
Investigating the association between climate variables and hip fracture incidence is an important public health issue. This study examined and modeled the seasonal variation of monthly population-based hip fracture rate (HFr) time series. The seasonal ARIMA time series modeling approach was used to model monthly HFr time series of female and male patients aged 40-74 and 75+ in Montreal, Québec province, Canada, over the period 1993-2004. The correlation coefficients between HFr and meteorological variables such as temperature, snow depth, rainfall depth, and day length are significant. The nonparametric Mann-Kendall test for trend assessment, and the nonparametric Levene's test and Wilcoxon's test for checking the difference in HFr before and after a change point, were also used. The seasonality in HFr showed a sharp difference between winter and summer. The trend assessment showed decreasing trends in HFr for the female and male groups. The nonparametric test also indicated a significant change in mean HFr. A seasonal ARIMA model was applied to HFr time series without trend, and a time trend ARIMA model (TT-ARIMA) was developed and fitted to HFr time series with a significant trend. The multi-criteria evaluation showed the adequacy of the SARIMA and TT-ARIMA models for modeling seasonal hip fracture time series with and without significant trend. In the time series analysis of HFr in the Montreal region, the effects of the seasonal variation of climate variables on hip fracture are clear. The seasonal ARIMA model is useful for modeling HFr time series without trend; for time series with a significant trend, the TT-ARIMA model should be applied. Copyright © 2011 Elsevier Inc. All rights reserved.
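For readers unfamiliar with it, the Mann-Kendall trend test used above counts concordant minus discordant pairs over time and normalizes that count by its variance. A minimal sketch, assuming no ties in the series (tie corrections are omitted; this is illustrative, not the authors' code):

```python
import math

def mann_kendall_z(x):
    """Mann-Kendall trend test without tie correction: returns (S, z)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            s += (x[j] > x[i]) - (x[j] < x[i])  # +1 concordant, -1 discordant
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z
```

|z| > 1.96 indicates a significant monotonic trend at the 5% level.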
Shi, Yang; Chinnaiyan, Arul M; Jiang, Hui
2015-07-01
High-throughput sequencing of transcriptomes (RNA-Seq) has become a powerful tool to study gene expression. Here we present an R package, rSeqNP, which implements a non-parametric approach to test for differential expression and splicing from RNA-Seq data. rSeqNP uses permutation tests to assess statistical significance and can be applied to a variety of experimental designs. By combining information across isoforms, rSeqNP is able to detect more differentially expressed or spliced genes from RNA-Seq data. The R package, with its source code and documentation, is freely available at http://www-personal.umich.edu/∼jianghui/rseqnp/. Contact: jianghui@umich.edu. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
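The permutation logic behind packages like rSeqNP can be illustrated with a far simpler statistic than rSeqNP's gene-level scores: a two-group difference in means, with group labels shuffled to build the null distribution (names and statistic are illustrative only):

```python
import numpy as np

def perm_pvalue(a, b, n_perm=2000, rng=None):
    """Two-sided permutation p-value for a difference in group means."""
    rng = np.random.default_rng(0) if rng is None else rng
    obs = abs(np.mean(a) - np.mean(b))
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)          # shuffle group labels
        stat = abs(perm[:len(a)].mean() - perm[len(a):].mean())
        count += stat >= obs
    return (count + 1) / (n_perm + 1)           # add-one to avoid p = 0
```

Because only exchangeability is assumed, no distributional form is imposed on the data.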
More evidence of the plateau effect: a social perspective.
Pinto-Prades, J L; Lopez-Nicolás, A
1998-01-01
The purpose of this study was to test the existence of the plateau effect at the social level. The authors tried to confirm the preliminary conclusion that people may not be willing to trade off any longevity to improve the health state of a large number of people if the health states are mild enough. They tested this assumption using the person-tradeoff technique. They also used a parametric approach and a nonparametric approach to study the relationship between individual and social values. Results show the existence of the plateau effect in the context of resource allocation. Furthermore, with the nonparametric approach, a plateau effect in the middle part of the scale was also observed, suggesting that social preference may not be directly predicted from individual utilities. The authors caution against the possible framing effects that may be present in these kinds of questions.
On-Line Robust Modal Stability Prediction using Wavelet Processing
NASA Technical Reports Server (NTRS)
Brenner, Martin J.; Lind, Rick
1998-01-01
Wavelet analysis for filtering and system identification has been used to improve the estimation of aeroservoelastic stability margins. The conservatism of the robust stability margins is reduced with parametric and nonparametric time-frequency analysis of flight data in the model validation process. Nonparametric wavelet processing of data is used to reduce the effects of external disturbances and unmodeled dynamics. Parametric estimates of modal stability are also extracted using the wavelet transform. Computation of robust stability margins for stability boundary prediction depends on uncertainty descriptions derived from the data for model validation. The F-18 High Alpha Research Vehicle aeroservoelastic flight test data demonstrates improved robust stability prediction by extension of the stability boundary beyond the flight regime. Guidelines and computation times are presented to show the efficiency and practical aspects of these procedures for on-line implementation. Feasibility of the method is shown for processing flight data from time-varying nonstationary test points.
Vexler, Albert; Tanajian, Hovig; Hutson, Alan D
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K -sample distributions. Recognizing that recent statistical software packages do not sufficiently address K -sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p -values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p -value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p -value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
Carroll, Raymond J; Delaigle, Aurore; Hall, Peter
2011-03-01
In many applications we can expect that, or are interested to know if, a density function or a regression curve satisfies some specific shape constraints. For example, when the explanatory variable, X, represents the value taken by a treatment or dosage, the conditional mean of the response, Y , is often anticipated to be a monotone function of X. Indeed, if this regression mean is not monotone (in the appropriate direction) then the medical or commercial value of the treatment is likely to be significantly curtailed, at least for values of X that lie beyond the point at which monotonicity fails. In the case of a density, common shape constraints include log-concavity and unimodality. If we can correctly guess the shape of a curve, then nonparametric estimators can be improved by taking this information into account. Addressing such problems requires a method for testing the hypothesis that the curve of interest satisfies a shape constraint, and, if the conclusion of the test is positive, a technique for estimating the curve subject to the constraint. Nonparametric methodology for solving these problems already exists, but only in cases where the covariates are observed precisely. However in many problems, data can only be observed with measurement errors, and the methods employed in the error-free case typically do not carry over to this error context. In this paper we develop a novel approach to hypothesis testing and function estimation under shape constraints, which is valid in the context of measurement errors. Our method is based on tilting an estimator of the density or the regression mean until it satisfies the shape constraint, and we take as our test statistic the distance through which it is tilted. Bootstrap methods are used to calibrate the test. The constrained curve estimators that we develop are also based on tilting, and in that context our work has points of contact with methodology in the error-free case.
NASA Astrophysics Data System (ADS)
Cannon, Alex
2017-04-01
Estimating historical trends in short-duration rainfall extremes at regional and local scales is challenging due to low signal-to-noise ratios and the limited availability of homogenized observational data. In addition to being of scientific interest, trends in rainfall extremes are of practical importance, as their presence calls into question the stationarity assumptions that underpin traditional engineering and infrastructure design practice. Even with these fundamental challenges, increasingly complex questions are being asked about time series of extremes. For instance, users may not only want to know whether or not rainfall extremes have changed over time, they may also want information on the modulation of trends by large-scale climate modes or on the nonstationarity of trends (e.g., identifying hiatus periods or periods of accelerating positive trends). Efforts have thus been devoted to the development and application of more robust and powerful statistical estimators for regional and local scale trends. While a standard nonparametric method like the regional Mann-Kendall test, which tests for the presence of monotonic trends (i.e., strictly non-decreasing or non-increasing changes), makes fewer assumptions than parametric methods and pools information from stations within a region, it is not designed to visualize detected trends, include information from covariates, or answer questions about the rate of change in trends. As a remedy, monotone quantile regression (MQR) has been developed as a nonparametric alternative that can be used to estimate a common monotonic trend in extremes at multiple stations. Quantile regression makes efficient use of data by directly estimating conditional quantiles based on information from all rainfall data in a region, i.e., without having to precompute the sample quantiles. The MQR method is also flexible and can be used to visualize and analyze the nonlinearity of the detected trend. 
However, it is fundamentally a univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
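Quantile regression, monotone or not, rests on minimizing the "check" (pinball) loss, under which the τ-quantile is exactly the best constant predictor. A small numpy sketch of that building block (not the MMQR network described above):

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Check loss: minimized over constants at the tau-quantile of y."""
    r = np.asarray(y, dtype=float) - q_pred
    return np.mean(np.maximum(tau * r, (tau - 1.0) * r))

# Best constant predictor under tau = 0.9 sits near the 90th percentile.
y = np.arange(1, 101, dtype=float)
best = min(range(1, 101), key=lambda c: pinball_loss(y, c, 0.9))
```

Replacing the constant with a covariate-dependent function, subject to monotonicity constraints, gives the MQR/MMQR estimators discussed above.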
Kuan, Pei Fen; Chiang, Derek Y
2012-09-01
DNA methylation has emerged as an important hallmark of epigenetics. Numerous platforms, including tiling arrays and next-generation sequencing, and experimental protocols are available for profiling DNA methylation. Like other tiling array data, DNA methylation data exhibit an inherent correlation structure among nearby probes. However, unlike gene expression or protein-DNA binding data, the varying CpG density, which gives rise to the CpG island, shore, and shelf definitions, provides exogenous information for detecting differential methylation. This article introduces a robust testing and probe-ranking procedure based on a nonhomogeneous hidden Markov model that incorporates the above-mentioned features for detecting differential methylation. We revisit the seminal work of Sun and Cai (2009, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71, 393-424) and propose modeling the nonnull distribution using a nonparametric symmetric distribution in two-sided hypothesis testing. We show that this model improves probe ranking and is robust to model misspecification based on extensive simulation studies. We further illustrate that our proposed framework achieves good operating characteristics compared with commonly used methods on real DNA methylation data where the aim is to detect differential methylation sites. © 2012, The International Biometric Society.
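A standard baseline for the multiple-testing side of such probe-level analyses is the Benjamini-Hochberg step-up procedure, which controls the false discovery rate; local-FDR methods like the one above refine this idea. A plain-Python sketch of the step-up rule:

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Step-up BH procedure: boolean rejections controlling FDR at alpha."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by p-value
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:              # step-up threshold
            k = rank                                  # largest passing rank
    reject = [False] * m
    for i in order[:k]:                               # reject k smallest
        reject[i] = True
    return reject
```

Note the step-up character: a hypothesis can be rejected even if its own p-value exceeds its threshold, provided a larger p-value passes at a higher rank.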
Predictive inference for best linear combination of biomarkers subject to limits of detection.
Coolen-Maturi, Tahani
2017-08-15
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine, machine learning and credit scoring. The receiver operating characteristic (ROC) curve is a useful tool to assess the ability of a diagnostic test to discriminate between two classes or groups. In practice, multiple diagnostic tests or biomarkers are combined to improve diagnostic accuracy. Often, biomarker measurements are undetectable either below or above the so-called limits of detection (LoD). In this paper, nonparametric predictive inference (NPI) for best linear combination of two or more biomarkers subject to limits of detection is presented. NPI is a frequentist statistical method that is explicitly aimed at using few modelling assumptions, enabled through the use of lower and upper probabilities to quantify uncertainty. The NPI lower and upper bounds for the ROC curve subject to limits of detection are derived, where the objective function to maximize is the area under the ROC curve. In addition, the paper discusses the effect of restriction on the linear combination's coefficients on the analysis. Examples are provided to illustrate the proposed method. Copyright © 2017 John Wiley & Sons, Ltd.
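The area under the ROC curve being maximized here has a convenient nonparametric form: the empirical AUC equals the Mann-Whitney probability that a randomly chosen score from the positive group exceeds one from the negative group, counting ties as one half. A small illustrative sketch:

```python
def empirical_auc(neg, pos):
    """Empirical AUC = P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
    return sum(pairs) / len(pairs)
```

The half-weight for ties matters in the LoD setting, where censored measurements pile up at the detection limits.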
Harris, L K; Whay, H R; Murrell, J C
2018-04-01
This study investigated the effects of osteoarthritis (OA) on somatosensory processing in dogs using mechanical threshold testing. A pressure algometer was used to measure mechanical thresholds in 27 dogs with presumed hind limb osteoarthritis and 28 healthy dogs. Mechanical thresholds were measured at the stifles, radii and sternum, and were correlated with scores from an owner questionnaire and a clinical checklist, a scoring system that quantified clinical signs of osteoarthritis. The effects of age and bodyweight on mechanical thresholds were also investigated. Multiple regression models indicated that, when bodyweight was taken into account, dogs with presumed osteoarthritis had lower mechanical thresholds at the stifles than control dogs, but not at other sites. Non-parametric correlations showed that clinical checklist scores and questionnaire scores were negatively correlated with mechanical thresholds at the stifles. The results suggest that mechanical threshold testing using a pressure algometer can detect primary, and possibly secondary, hyperalgesia in dogs with presumed osteoarthritis. This suggests that the mechanical threshold testing protocol used in this study might facilitate assessment of somatosensory changes associated with disease progression or response to treatment. Copyright © 2017. Published by Elsevier Ltd.
Li, Jun; Tibshirani, Robert
2015-01-01
We discuss the identification of features that are associated with an outcome in RNA-Sequencing (RNA-Seq) and other sequencing-based comparative genomic experiments. RNA-Seq data takes the form of counts, so models based on the normal distribution are generally unsuitable. The problem is especially challenging because different sequencing experiments may generate quite different total numbers of reads, or ‘sequencing depths’. Existing methods for this problem are based on Poisson or negative binomial models: they are useful but can be heavily influenced by ‘outliers’ in the data. We introduce a simple, nonparametric method with resampling to account for the different sequencing depths. The new method is more robust than parametric methods. It can be applied to data with quantitative, survival, two-class or multiple-class outcomes. We compare our proposed method to Poisson and negative binomial-based methods in simulated and real data sets, and find that our method discovers more consistent patterns than competing methods. PMID:22127579
Dugué, Audrey Emmanuelle; Pulido, Marina; Chabaud, Sylvie; Belin, Lisa; Gal, Jocelyn
2016-12-01
We describe how to estimate progression-free survival while dealing with interval-censored data in the setting of clinical trials in oncology. Three procedures using SAS and R statistical software are described: first, a nonparametric maximum likelihood estimation of the survival curve using the EM-ICM (Expectation and Maximization-Iterative Convex Minorant) algorithm as described by Wellner and Zhan in 1997; second, a sensitivity analysis procedure in which the progression time is assigned (i) at the midpoint, (ii) at the upper limit (reflecting the standard analysis, in which the progression time is assigned to the first radiologic exam showing progressive disease), or (iii) at the lower limit of the censoring interval; and finally, two multiple-imputation approaches, using a uniform or the nonparametric maximum likelihood estimation (NPMLE) distribution. Clin Cancer Res; 22(23); 5629-35. ©2016 American Association for Cancer Research.
Differences between h-index measures from different bibliographic sources and search engines.
Barreto, Mauricio Lima; Aragão, Erika; Sousa, Luis Eugenio Portela Fernandes de; Santana, Táris Maria; Barata, Rita Barradas
2013-04-01
To analyze the use of the h-index as a measure of the bibliometric impact of Brazilian researchers' scientific publications. The scientific production of Brazilian CNPq 1-A researchers in the areas of public health, immunology and medicine was compared. The mean h-index of the group of researchers in each area was estimated, and the nonparametric Kruskal-Wallis test and the Behrens-Fisher multiple-comparison test were used to compare the differences. The h-index means were higher in immunology than in public health and medicine when the Web of Science database was used. However, this difference disappears when the comparison is made using Scopus or Google Scholar. The emergence of Google Scholar brings a new level to discussions on measuring the bibliometric impact of scientific publications. Areas with strong professional components, in which knowledge is produced and must also be published in the native language vis-à-vis its dissemination to the international community, necessarily have a standard of scientific publication and citation different from that of exclusively or predominantly academic areas, and these are best captured by Google Scholar.
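For reference, the h-index compared across databases above is defined as the largest h such that the researcher has at least h papers with at least h citations each; a sketch:

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= rank:          # the rank-th most-cited paper clears the bar
            h = rank
        else:
            break
    return h
```

Because the index depends only on the citation counts a database reports for each paper, differences between Web of Science, Scopus, and Google Scholar coverage translate directly into different h values.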
Nicolas, Renaud; Sibon, Igor; Hiba, Bassem
2015-01-01
The diffusion-weighting-dependent attenuation of the MRI signal E(b) is extremely sensitive to microstructural features. The aim of this study was to determine which mathematical model of the E(b) signal most accurately describes it in the brain. The models compared were the monoexponential model, the stretched exponential model, the truncated cumulant expansion (TCE) model, the biexponential model, and the triexponential model. Acquisition was performed with nine b-values up to 2500 s/mm² in 12 healthy volunteers. The goodness-of-fit was studied with F-tests and with the Akaike information criterion. Tissue contrasts were differentiated with a multiple-comparison-corrected nonparametric analysis of variance. The F-test showed that the TCE model was better than the biexponential model in gray and white matter. The corrected Akaike information criterion showed that the TCE model has the best accuracy and produced the most reliable contrasts in white matter among all models studied. In conclusion, the TCE model was found to be the best model to infer the microstructural properties of brain tissue.
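The model-selection logic used above can be illustrated on synthetic data: when ln E(b) follows a quadratic (kurtosis-like, TCE-style) decay, AIC prefers the quadratic fit over a monoexponential line. A hedged sketch with made-up parameter values, using plain AIC rather than the small-sample-corrected variant:

```python
import numpy as np

def aic(n, rss, k):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2k."""
    return n * np.log(rss / n) + 2 * k

# Hypothetical signal with a quadratic term in ln E(b), plus small noise.
rng = np.random.default_rng(1)
b = np.linspace(0.0, 2500.0, 9)                 # nine b-values, as in the study
D, K = 0.8e-3, 1.2                              # illustrative, not fitted values
log_e = -b * D + (1.0 / 6.0) * (b * D) ** 2 * K
log_e += rng.normal(0.0, 0.01, size=b.size)

fits = {}
for name, deg in [("monoexponential", 1), ("quadratic (TCE-like)", 2)]:
    coef = np.polyfit(b, log_e, deg)
    rss = np.sum((np.polyval(coef, b) - log_e) ** 2)
    fits[name] = aic(b.size, rss, deg + 1)      # k = number of coefficients

best = min(fits, key=fits.get)                  # lowest AIC wins
```

The extra parameter of the quadratic model is penalized by the 2k term, so it is preferred only when the curvature it captures is real.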
Religiousness as a factor of hesitation against doping behavior in college-age athletes.
Zenic, Natasa; Stipic, Marija; Sekulic, Damir
2013-06-01
Religiousness is rarely studied as a protective factor against substance use and misuse in sport, and we found no investigation in which college-age athletes were sampled and studied accordingly. The aim of the present study was to identify gender-specific protective effects of religiousness (measured by the Santa Clara Questionnaire) and other social, educational, and sport variables as potential factors of hesitation against doping behavior in sport-science students from Mostar, Bosnia and Herzegovina (51 women and 111 men; age range, 18-26). Gender differences in the non-parametric variables were established by the Kruskal-Wallis test, while for the parametric variables the t-test for independent samples was used. Multiple regression calculations revealed religiousness as the most significant predictor among the social, health, sport and legal factors of hesitation against doping behavior in both genders. However, a differential influence of the social, educational, sport and religious factors in relation to negative consequences of doping behavior was found for men and women. Such differential influence must be emphasized in tailoring anti-doping policy and interventions.
A Nonparametric Approach to Automated S-Wave Picking
NASA Astrophysics Data System (ADS)
Rawles, C.; Thurber, C. H.
2014-12-01
Although a number of very effective P-wave automatic pickers have been developed over the years, automatic picking of S waves has remained more challenging. Most automatic pickers take a parametric approach, whereby some characteristic function (CF), e.g. polarization or kurtosis, is determined from the data and the pick is estimated from the CF. We have adopted a nonparametric approach, estimating the pick directly from the waveforms. For a particular waveform to be auto-picked, the method uses a combination of similarity to a set of seismograms with known S-wave arrivals and dissimilarity to a set of seismograms that do not contain S-wave arrivals. Significant effort has been made towards dealing with the problem of S-to-P conversions. We have evaluated the effectiveness of our method by testing it on multiple sets of microearthquake seismograms with well-determined S-wave arrivals for several areas around the world, including fault zones and volcanic regions. In general, we find that the results from our auto-picker are consistent with reviewed analyst picks 90% of the time at the 0.2 s level and 80% of the time at the 0.1 s level, or better. For most of the large datasets we have analyzed, our auto-picker also makes far more S-wave picks than were made previously by analysts. We are using these enlarged sets of high-quality S-wave picks to refine tomographic inversions for these areas, resulting in substantial improvement in the quality of the S-wave images. We will show examples from New Zealand, Hawaii, and California.
Wavelet Filtering to Reduce Conservatism in Aeroservoelastic Robust Stability Margins
NASA Technical Reports Server (NTRS)
Brenner, Marty; Lind, Rick
1998-01-01
Wavelet analysis for filtering and system identification was used to improve the estimation of aeroservoelastic stability margins. The conservatism of the robust stability margins was reduced with parametric and nonparametric time-frequency analysis of flight data in the model validation process. Nonparametric wavelet processing of data was used to reduce the effects of external disturbances and unmodeled dynamics. Parametric estimates of modal stability were also extracted using the wavelet transform. Computation of robust stability margins for stability boundary prediction depends on uncertainty descriptions derived from the data for model validation. F-18 High Alpha Research Vehicle aeroservoelastic flight test data demonstrated improved robust stability prediction by extension of the stability boundary beyond the flight regime.
The Importance of Practice in the Development of Statistics.
1983-01-01
NRC Technical Summary Report #2471, "The Importance of Practice in the Development of Statistics." Topics covered include component analysis, bioassay, limits for a ratio, quality control, sampling inspection, non-parametric tests, transformation theory, ARIMA time series models, sequential tests, cumulative sum charts, data analysis plotting techniques, and a resolution of the Bayes-frequentist controversy.
Application of the LSQR algorithm in non-parametric estimation of aerosol size distribution
NASA Astrophysics Data System (ADS)
He, Zhenzong; Qi, Hong; Lew, Zhongyuan; Ruan, Liming; Tan, Heping; Luo, Kun
2016-05-01
Based on the Least Squares QR decomposition (LSQR) algorithm, the aerosol size distribution (ASD) is retrieved in a non-parametric approach. The direct problem is solved using the Anomalous Diffraction Approximation (ADA) and the Lambert-Beer law. An optimal wavelength selection method is developed to improve the retrieval accuracy of the ASD. The optimal wavelength set is selected so that the measurement signals are sensitive to wavelength and the ill-conditioning of the coefficient matrix of the linear system is reduced, which makes the retrieval results more robust to interference. Two common kinds of monomodal and bimodal ASDs, the log-normal (L-N) and Gamma distributions, are estimated, respectively. Numerical tests show that the LSQR algorithm can be successfully applied to retrieve the ASD with high stability in the presence of random noise and low susceptibility to the shape of the distribution. Finally, the ASD measured experimentally over Harbin, China, is recovered reasonably well. All the results confirm that the LSQR algorithm combined with the optimal wavelength selection method is an effective and reliable technique for non-parametric estimation of the ASD.
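SciPy ships an LSQR implementation that can stand in for the retrieval step described above, including the damping term often used to stabilize ill-conditioned kernels. In this sketch A and b are placeholders: in the paper's setting A would hold the discretized extinction kernel at the selected wavelengths and b the measured spectral signal.

```python
import numpy as np
from scipy.sparse.linalg import lsqr

# Placeholder linear system with a known solution standing in for the
# ASD retrieval kernel (the actual kernel matrix is not reproduced here).
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])
x_true = np.array([1.0, 2.0])
b = A @ x_true                        # synthetic "measurements"

x = lsqr(A, b)[0]                     # plain least-squares solution
x_damped = lsqr(A, b, damp=0.1)[0]    # Tikhonov-style damping for noisy data
```

The `damp` parameter minimizes ||Ax - b||² + damp²·||x||², trading a small bias for a solution whose norm, and hence sensitivity to noise, is reduced.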
Multiple time scale analysis of sediment and runoff changes in the Lower Yellow River
NASA Astrophysics Data System (ADS)
Chi, Kaige; Gang, Zhao; Pang, Bo; Huang, Ziqian
2018-06-01
Sediment and runoff changes at seven hydrological stations along the Lower Yellow River (LYR) (Huayuankou, Jiahetan, Gaocun, Sunkou, Aishan, Qikou and Lijin stations) from 1980 to 2003 were analyzed at multiple time scales. The maximum monthly, daily and hourly sediment load and runoff values were also analyzed together with the annual means. The Mann-Kendall non-parametric trend test and the Hurst coefficient method were adopted in the study. The results indicate that (1) the runoff of the seven hydrological stations was significantly reduced over the study period at different time scales; however, the trends in sediment load at these stations were not obvious, and the sediment load at the Huayuankou, Jiahetan and Aishan stations even slightly increased as runoff decreased. (2) The trends in sediment load at different time scales differed at the Luokou and Lijin stations: although the annual and monthly sediment loads were broadly flat, the maximum hourly sediment load showed a decreasing trend. (3) According to the Hurst coefficients, the trends in sediment and runoff will continue unless measures are taken, which demonstrates the necessity of a runoff-sediment regulation scheme.
ERIC Educational Resources Information Center
Zheng, Yinggan; Gierl, Mark J.; Cui, Ying
2010-01-01
This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…
ERIC Educational Resources Information Center
Oliveri, Maria Elena; Olson, Brent F.; Ercikan, Kadriye; Zumbo, Bruno D.
2012-01-01
In this study, the Canadian English and French versions of the Problem-Solving Measure of the Programme for International Student Assessment 2003 were examined to investigate their degree of measurement comparability at the item- and test-levels. Three methods of differential item functioning (DIF) were compared: parametric and nonparametric item…
Logistic Stick-Breaking Process
Ren, Lu; Du, Lan; Carin, Lawrence; Dunson, David B.
2013-01-01
A logistic stick-breaking process (LSBP) is proposed for non-parametric clustering of general spatially- or temporally-dependent data, imposing the belief that proximate data are more likely to be clustered together. The sticks in the LSBP are realized via multiple logistic regression functions, with shrinkage priors employed to favor contiguous and spatially localized segments. The LSBP is also extended for the simultaneous processing of multiple data sets, yielding a hierarchical logistic stick-breaking process (H-LSBP). The model parameters (atoms) within the H-LSBP are shared across the multiple learning tasks. Efficient variational Bayesian inference is derived, and comparisons are made to related techniques in the literature. Experimental analysis is performed for audio waveforms and images, and it is demonstrated that for segmentation applications the LSBP yields generally homogeneous segments with sharp boundaries. PMID:25258593
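The stick-breaking construction at the heart of the LSBP can be sketched in a few lines: each fraction v_k breaks off a share of the stick remaining after the earlier breaks. In the logistic variant the v_k would come from logistic regressions on spatial or temporal covariates; here they are fixed numbers purely for illustration:

```python
import numpy as np

def stick_breaking_weights(v):
    """Map stick-breaking fractions v_k in (0, 1] to mixture weights."""
    v = np.asarray(v, dtype=float)
    # Share of the stick left before each break: 1, (1-v1), (1-v1)(1-v2), ...
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v)[:-1]])
    return v * remaining
```

If the final fraction equals 1, the weights sum exactly to one, i.e., the stick is fully used up.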
How to Evaluate Phase Differences between Trial Groups in Ongoing Electrophysiological Signals
VanRullen, Rufin
2016-01-01
A growing number of studies endeavor to reveal periodicities in sensory and cognitive functions, by comparing the distribution of ongoing (pre-stimulus) oscillatory phases between two (or more) trial groups reflecting distinct experimental outcomes. A systematic relation between the phase of spontaneous electrophysiological signals, before a stimulus is even presented, and the eventual result of sensory or cognitive processing for that stimulus, would be indicative of an intrinsic periodicity in the underlying neural process. Prior studies of phase-dependent perception have used a variety of analytical methods to measure and evaluate phase differences, and there is currently no established standard practice in this field. The present report intends to remediate this need, by systematically comparing the statistical power of various measures of “phase opposition” between two trial groups, in a number of real and simulated experimental situations. Seven measures were evaluated: one parametric test (circular Watson-Williams test), and three distinct measures of phase opposition (phase bifurcation index, phase opposition sum, and phase opposition product) combined with two procedures for non-parametric statistical testing (permutation, or a combination of z-score and permutation). While these are obviously not the only existing or conceivable measures, they have all been used in recent studies. All tested methods performed adequately on a previously published dataset (Busch et al., 2009). On a variety of artificially constructed datasets, no single measure was found to surpass all others, but instead the suitability of each measure was contingent on several experimental factors: the time, frequency, and depth of oscillatory phase modulation; the absolute and relative amplitudes of post-stimulus event-related potentials for the two trial groups; the absolute and relative trial numbers for the two groups; and the number of permutations used for non-parametric testing. 
The concurrent use of two phase opposition measures, the parametric Watson-Williams test and a non-parametric test based on summing inter-trial coherence values for the two trial groups, appears to provide the most satisfactory outcome in all situations tested. Matlab code is provided to automatically compute these phase opposition measures. PMID:27683543
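The recommended non-parametric measure can be sketched as follows. This is a minimal illustration, assuming the common "phase opposition sum" formulation POS = ITC_A + ITC_B − 2·ITC_all (where ITC is inter-trial coherence) with a label-permutation test; the von Mises phase simulation and all parameter values are illustrative, not taken from the paper's Matlab code:

```python
import numpy as np

rng = np.random.default_rng(0)

def itc(phases):
    """Inter-trial coherence: resultant length of the unit phase vectors."""
    return np.abs(np.mean(np.exp(1j * phases)))

def phase_opposition_sum(phases_a, phases_b):
    """POS = ITC_A + ITC_B - 2*ITC_all: large when the two trial groups
    lock to opposite phases, near zero when they share one distribution."""
    all_ph = np.concatenate([phases_a, phases_b])
    return itc(phases_a) + itc(phases_b) - 2 * itc(all_ph)

def pos_permutation_p(phases_a, phases_b, n_perm=2000):
    """One-sided permutation p-value: reshuffle trial labels and count
    how often the shuffled POS reaches the observed one."""
    observed = phase_opposition_sum(phases_a, phases_b)
    pooled = np.concatenate([phases_a, phases_b])
    n_a = len(phases_a)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if phase_opposition_sum(perm[:n_a], perm[n_a:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical pre-stimulus phases: "hits" near phase 0, "misses" near pi
a = rng.vonmises(0.0, 4.0, size=60)
b = rng.vonmises(np.pi, 4.0, size=60)
p_opposed = pos_permutation_p(a, b)

# Control: both groups drawn from the same phase distribution
c = rng.vonmises(0.0, 4.0, size=60)
p_same = pos_permutation_p(a, c)
```

The `(count + 1) / (n_perm + 1)` form keeps the p-value away from zero, which matters because the attainable resolution is limited by the number of permutations, one of the experimental factors the study highlights.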
Wieser, Stefan; Axmann, Markus; Schütz, Gerhard J.
2008-01-01
We propose here an approach for the analysis of single-molecule trajectories which is based on a comprehensive comparison of an experimental data set with multiple Monte Carlo simulations of the diffusion process. It allows quantitative data analysis, particularly whenever analytical treatment of a model is infeasible. Simulations are performed on a discrete parameter space and compared with the experimental results by a nonparametric statistical test. The method provides a matrix of p-values that assess the probability for having observed the experimental data at each setting of the model parameters. We show the testing approach for three typical situations observed in the cellular plasma membrane: (i) free Brownian motion of the tracer, (ii) hop diffusion of the tracer in a periodic meshwork of squares, and (iii) transient binding of the tracer to slowly diffusing structures. By plotting the p-value as a function of the model parameters, one can easily identify the most consistent parameter settings but also recover mutual dependencies and ambiguities which are difficult to determine by standard fitting routines. Finally, we used the test to reanalyze previous data obtained on the diffusion of the glycosylphosphatidylinositol-protein CD59 in the plasma membrane of the human T24 cell line. PMID:18805933
de Lusignan, Simon; Kumarapeli, Pushpa; Chan, Tom; Pflug, Bernhard; van Vlymen, Jeremy; Jones, Beryl; Freeman, George K
2008-09-08
There is a lack of tools to evaluate and compare Electronic patient record (EPR) systems to inform a rational choice or development agenda. To develop a tool kit to measure the impact of different EPR system features on the consultation. We first developed a specification to overcome the limitations of existing methods. We divided this into work packages: (1) developing a method to display multichannel video of the consultation; (2) coding and measuring activities, including computer use and verbal interactions; (3) automating the capture of nonverbal interactions; (4) aggregating multiple observations into a single navigable output; and (5) producing an output interpretable by software developers. We piloted this method by filming live consultations (n = 22) by 4 general practitioners (GPs) using different EPR systems. We compared the time taken and variations during coded data entry, prescribing, and blood pressure (BP) recording. We used nonparametric tests to make statistical comparisons. We contrasted methods of BP recording using Unified Modeling Language (UML) sequence diagrams. We found that 4 channels of video were optimal. We identified an existing application for manual coding of video output. We developed in-house tools for capturing use of keyboard and mouse and to time stamp speech. The transcript is then typed within this time stamp. Although we managed to capture body language using pattern recognition software, we were unable to use these data quantitatively. We loaded these observational outputs into our aggregation tool, which allows simultaneous navigation and viewing of multiple files. This also creates a single exportable file in XML format, which we used to develop UML sequence diagrams. In our pilot, the GP using the EMIS LV (Egton Medical Information Systems Limited, Leeds, UK) system took the longest time to code data (mean 11.5 s, 95% CI 8.7-14.2).
Nonparametric comparison of EMIS LV with the other systems showed a significant difference, with EMIS PCS (Egton Medical Information Systems Limited, Leeds, UK) (P = .007), iSoft Synergy (iSOFT, Banbury, UK) (P = .014), and INPS Vision (INPS, London, UK) (P = .006) facilitating faster coding. In contrast, prescribing was fastest with EMIS LV (mean 23.7 s, 95% CI 20.5-26.8), but nonparametric comparison showed no statistically significant difference. UML sequence diagrams showed that the simplest BP recording interface was not the easiest to use, as users spent longer navigating or looking up previous blood pressures separately. Complex interfaces with free-text boxes left clinicians unsure of what to add. The ALFA method allows the precise observation of the clinical consultation. It enables rigorous comparison of core elements of EPR systems. Pilot data suggests its capacity to demonstrate differences between systems. Its outputs could provide the evidence base for making more objective choices between systems.
Parametric vs. non-parametric daily weather generator: validation and comparison
NASA Astrophysics Data System (ADS)
Dubrovsky, Martin
2016-04-01
As the climate models (GCMs and RCMs) fail to satisfactorily reproduce the real-world surface weather regime, various statistical methods are applied to downscale GCM/RCM outputs into site-specific weather series. The stochastic weather generators are among the most popular downscaling methods, capable of producing realistic (observation-like) meteorological inputs for agronomic, hydrological and other impact models used in assessing the sensitivity of various ecosystems to climate change/variability. Among their advantages, the generators may (i) produce arbitrarily long multi-variate synthetic weather series representing both present and changed climates (in the latter case, the generators are commonly modified by GCM/RCM-based climate change scenarios), (ii) be run in various time steps and for multiple weather variables (the generators reproduce the correlations among variables), (iii) be interpolated (and run also for sites where no weather data are available to calibrate the generator). This contribution will compare two stochastic daily weather generators in terms of their ability to reproduce various features of the daily weather series. M&Rfi is a parametric generator: a Markov chain model is used to model precipitation occurrence, precipitation amount is modelled by the Gamma distribution, and a 1st-order autoregressive model is used to generate non-precipitation surface weather variables. The non-parametric GoMeZ generator is based on the nearest neighbours resampling technique, making no assumption on the distribution of the variables being generated. Various settings of both weather generators will be assumed in the present validation tests. The generators will be validated in terms of (a) extreme temperature and precipitation characteristics (annual and 30-year extremes and maxima of duration of hot/cold/dry/wet spells); (b) selected validation statistics developed within the frame of the VALUE project.
The tests will be based on observational weather series from several European stations available from the ECA&D database.
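The parametric precipitation component described above (Markov-chain occurrence plus Gamma amounts) can be sketched as follows. This is a generic illustration of that model class, not the M&Rfi implementation; the transition probabilities and Gamma parameters are made-up calibration values:

```python
import numpy as np

rng = np.random.default_rng(2)

def generate_precip(n_days, p_wd, p_ww, shape, scale):
    """Daily precipitation from a parametric generator:
    1st-order Markov chain for wet/dry occurrence
    (p_wd = P(wet | dry), p_ww = P(wet | wet)),
    Gamma-distributed amounts on wet days."""
    wet = np.zeros(n_days, dtype=bool)
    amounts = np.zeros(n_days)
    for t in range(1, n_days):
        p_wet = p_ww if wet[t - 1] else p_wd
        wet[t] = rng.random() < p_wet
        if wet[t]:
            amounts[t] = rng.gamma(shape, scale)
    return wet, amounts

wet, amounts = generate_precip(20000, p_wd=0.2, p_ww=0.6, shape=0.8, scale=5.0)

# Stationary wet-day frequency of the chain: p_wd / (1 - p_ww + p_wd) = 1/3
wet_frac = wet.mean()
mean_wet_amount = amounts[wet].mean()   # Gamma mean = shape * scale = 4.0
```

The Markov chain reproduces wet/dry spell persistence (something independent daily draws cannot), which is exactly the kind of spell-duration statistic the validation in this study targets.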
Martinez Manzanera, Octavio; Elting, Jan Willem; van der Hoeven, Johannes H.; Maurits, Natasha M.
2016-01-01
In the clinic, tremor is diagnosed during a time-limited process in which patients are observed and the characteristics of tremor are visually assessed. For some tremor disorders, a more detailed analysis of these characteristics is needed. Accelerometry and electromyography can be used to obtain a better insight into tremor. Typically, routine clinical assessment of accelerometry and electromyography data involves visual inspection by clinicians and occasionally computational analysis to obtain objective characteristics of tremor. However, for some tremor disorders these characteristics may be different during daily activity. This variability in presentation between the clinic and daily life makes a differential diagnosis more difficult. A long-term recording of tremor by accelerometry and/or electromyography in the home environment could help to give a better insight into the tremor disorder. However, an evaluation of such recordings using routine clinical standards would take too much time. We evaluated a range of techniques that automatically detect tremor segments in accelerometer data, as accelerometer data is more easily obtained in the home environment than electromyography data. Time can be saved if clinicians only have to evaluate the tremor characteristics of segments that have been automatically detected in longer daily activity recordings. We tested four non-parametric methods and five parametric methods on clinical accelerometer data from 14 patients with different tremor disorders. The consensus between two clinicians regarding the presence or absence of tremor on 3943 segments of accelerometer data was employed as reference. The nine methods were tested against this reference to identify their optimal parameters. Non-parametric methods generally performed better than parametric methods on our dataset when optimal parameters were used. 
However, one parametric method, employing the high frequency content of the tremor bandwidth under consideration (High Freq) performed similarly to non-parametric methods, but had the highest recall values, suggesting that this method could be employed for automatic tremor detection. PMID:27258018
Principals Judge Teachers by Their Teaching
ERIC Educational Resources Information Center
Nixon, Andy; Packard, Abbot; Dam, Margaret
2013-01-01
This quantitative study investigated the relationship between teacher dispositions, subject content knowledge, pedagogical content knowledge, and reasons that school principals recommend non-renewal of probationary teachers' contracts. Principals in the Southeastern United States completed an e-mailed survey. Two nonparametric tests,…
Randomization Procedures Applied to Analysis of Ballistic Data
1991-06-01
Technical Report BRL-TR-3245, Ballistic Research Laboratory, by Malcolm S. Taylor and Barry A. Bodt, June 1991. Keywords: data analysis; computationally intensive statistics; randomization tests; permutation tests; nonparametric statistics. Abstract fragment: with a significance level of 0.13, any reasonable statistical procedure would fail to support the notion of improvement of dynamic over standard indexing based on this data.
ERIC Educational Resources Information Center
Touron, Javier; Lizasoain, Luis; Joaristi, Luis
2012-01-01
The aim of this work is to analyze the dimensional structure of the Spanish version of the School and College Ability Test, employed in the process for the identification of students with high intellectual abilities. This test measures verbal and mathematical (or quantitative) abilities at three levels of difficulty: elementary (3rd, 4th, and 5th…
Statistical Package User’s Guide.
1980-08-01
Contents include program STACH (Nonparametric Descriptive Statistics) and program CHIRA (Coefficient of Concordance). Test data: the programs were tested using data from John Neter and William Wasserman, Applied Linear Statistical Models. Inputs include the length of the data file and a new file name (not the same as the raw data file); printout is produced as optioned. Comments: ranked data are used for program CHIRA.
The Rasch Model and Missing Data, with an Emphasis on Tailoring Test Items.
ERIC Educational Resources Information Center
de Gruijter, Dato N. M.
Many applications of educational testing have a missing data aspect (MDA). This MDA is perhaps most pronounced in item banking, where each examinee responds to a different subtest of items from a large item pool and where both person and item parameter estimates are needed. The Rasch model is emphasized, and its non-parametric counterpart (the…
Cost-Aware Design of a Discrimination Strategy for Unexploded Ordnance Cleanup
2011-02-25
Acronyms include ANN (Artificial Neural Network), AUC (Area Under the Curve), BRAC (Base Realignment And Closure), DLRT (Distance Likelihood Ratio Test), and EER. A methods table fragment lists the DLRT as discriminative, aggregate, and nonparametric [25], and the ANN as discriminative, aggregate, and parametric [33], followed by the Results and Discussion for Task #1.
Nonparametric Bayesian Dictionary Learning for Analysis of Noisy and Incomplete Images
Zhou, Mingyuan; Chen, Haojun; Paisley, John; Ren, Lu; Li, Lingbo; Xing, Zhengming; Dunson, David; Sapiro, Guillermo; Carin, Lawrence
2013-01-01
Nonparametric Bayesian methods are considered for recovery of imagery based upon compressive, incomplete, and/or noisy measurements. A truncated beta-Bernoulli process is employed to infer an appropriate dictionary for the data under test and also for image recovery. In the context of compressive sensing, significant improvements in image recovery are manifested using learned dictionaries, relative to using standard orthonormal image expansions. The compressive-measurement projections are also optimized for the learned dictionary. Additionally, we consider simpler (incomplete) measurements, defined by measuring a subset of image pixels, uniformly selected at random. Spatial interrelationships within imagery are exploited through use of the Dirichlet and probit stick-breaking processes. Several example results are presented, with comparisons to other methods in the literature. PMID:21693421
Baila-Rueda, Lucía; Lamiquiz-Moneo, Itziar; Jarauta, Estíbaliz; Mateo-Gallego, Rocío; Perez-Calahorra, Sofía; Marco-Benedí, Victoria; Bea, Ana M; Cenarro, Ana; Civeira, Fernando
2018-01-15
Familial hypercholesterolemia (FH) is a genetic disorder that results in abnormally high low-density lipoprotein cholesterol levels, a markedly increased risk of coronary heart disease (CHD), and tendon xanthomas (TX). However, the clinical expression is highly variable. TX are present in other metabolic diseases associated with increased sterol concentrations. Whether non-cholesterol sterols are involved in the development of TX in FH has not been analyzed. Clinical and biochemical characteristics, non-cholesterol sterol concentrations, and Achilles tendon thickness were determined in subjects with genetic FH with (n = 63) and without (n = 40) TX. Student's t test or the Mann-Whitney test was used as appropriate. Categorical variables were compared using a Chi square test. ANOVA and Kruskal-Wallis tests were performed for comparisons of multiple independent groups. Post hoc adjusted comparisons were performed with Bonferroni correction when applicable. Correlations of parameters in selected groups were calculated applying the non-parametric Spearman correlation procedure. To identify variables associated with Achilles tendon thickness changes, multiple linear regression was applied. Patients with TX presented higher concentrations of non-cholesterol sterols in plasma than patients without xanthomas (P = 0.006 and 0.034, respectively). Furthermore, there was a significant association between 5α-cholestanol, β-sitosterol, desmosterol, 24S-hydroxycholesterol and 27-hydroxycholesterol concentrations and Achilles tendon thickness (p = 0.002, 0.012, 0.020, 0.045 and 0.040, respectively). Our results indicate that non-cholesterol sterol concentrations are associated with the presence of TX. Since cholesterol and non-cholesterol sterols are present in the same lipoproteins, further studies would be needed to elucidate their potential role in the development of TX.
Feng, Dai; Cortese, Giuliana; Baumgartner, Richard
2017-12-01
The receiver operating characteristic (ROC) curve is frequently used as a measure of accuracy of continuous markers in diagnostic tests. The area under the ROC curve (AUC) is arguably the most widely used summary index for the ROC curve. Although the small sample size scenario is common in medical tests, a comprehensive study of small sample size properties of various methods for the construction of the confidence/credible interval (CI) for the AUC has been by and large missing in the literature. In this paper, we describe and compare 29 non-parametric and parametric methods for the construction of the CI for the AUC when the number of available observations is small. The methods considered include not only those that have been widely adopted, but also those that have been less frequently mentioned or, to our knowledge, never applied to the AUC context. To compare different methods, we carried out a simulation study with data generated from binormal models with equal and unequal variances and from exponential models with various parameters and with equal and unequal small sample sizes. We found that the larger the true AUC value and the smaller the sample size, the larger the discrepancy among the results of different approaches. When the model is correctly specified, the parametric approaches tend to outperform the non-parametric ones. Moreover, in the non-parametric domain, we found that a method based on the Mann-Whitney statistic is in general superior to the others. We further elucidate potential issues and provide possible solutions, along with general guidance on CI construction for the AUC when the sample size is small. Finally, we illustrate the utility of different methods through real life examples.
Apixaban decreases brain thrombin activity in a male mouse model of acute ischemic stroke.
Bushi, Doron; Chapman, Joab; Wohl, Anton; Stein, Efrat Shavit; Feingold, Ekaterina; Tanne, David
2018-05-14
Factor Xa (FXa) plays a critical role in the coagulation cascade by generation of thrombin. During focal ischemia thrombin levels increase in the brain tissue and cause neural damage. This study examined the hypothesis that administration of the FXa inhibitor, apixaban, following focal ischemic stroke may have therapeutic potential by decreasing brain thrombin activity and infarct volume. Male mice were divided into treated groups that received different doses of apixaban (2, 20, 100 mg/kg administered I.P.) or saline (controls) immediately after blocking the middle cerebral artery (MCA). Thrombin activity was measured by a fluorescence assay on fresh coronal slices taken from the mice brains 24 hr following the MCA occlusion. Infarct volume was assessed using triphenyltetrazolium chloride staining. A high dose of apixaban (100 mg/kg) significantly decreased thrombin activity levels in the ipsilateral hemisphere compared to the control group (Slice#5, p = .016; Slice#6, p = .016; Slice#7, p = .016; Slice#8, p = .036; by the nonparametric Mann-Whitney test). In addition, treatment with apixaban doses of both 100 mg/kg (32 ± 8% vs. 76 ± 7% in the treatment vs. control groups respectively; p = .005 by the nonparametric Mann-Whitney test) and 20 mg/kg (43 ± 7% vs. 76 ± 7% in the treatment vs. control groups respectively; p = .019 by the nonparametric Mann-Whitney test) decreased infarct volumes in areas surrounding the ischemic core (Slices #3 and #8). No brain hemorrhages were observed either in the treated or control groups. In summary, I.P. administration of a high dose of apixaban immediately after MCA occlusion decreases brain thrombin activity and reduces infarct size. © 2018 Wiley Periodicals, Inc.
Nonparametric test of consistency between cosmological models and multiband CMB measurements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aghamousa, Amir; Shafieloo, Arman, E-mail: amir@apctp.org, E-mail: shafieloo@kasi.re.kr
2015-06-01
We present a novel approach to test the consistency of the cosmological models with multiband CMB data using a nonparametric approach. In our analysis we calibrate the REACT (Risk Estimation and Adaptation after Coordinate Transformation) confidence levels associated with distances in function space (confidence distances) based on the Monte Carlo simulations in order to test the consistency of an assumed cosmological model with observation. To show the applicability of our algorithm, we confront Planck 2013 temperature data with the concordance model of cosmology considering two different Planck spectra combinations. In order to have an accurate quantitative statistical measure to compare between the data and the theoretical expectations, we calibrate REACT confidence distances and perform a bias control using many realizations of the data. Our results in this work using Planck 2013 temperature data put the best fit ΛCDM model at 95% (∼ 2σ) confidence distance from the center of the nonparametric confidence set, while repeating the analysis excluding the Planck 217 × 217 GHz spectrum data, the best fit ΛCDM model shifts to 70% (∼ 1σ) confidence distance. The most prominent features in the data deviating from the best fit ΛCDM model seem to be at low multipoles 18 < ℓ < 26 at greater than 2σ, ℓ ∼ 750 at ∼1 to 2σ and ℓ ∼ 1800 at greater than 2σ level. Excluding the 217×217 GHz spectrum, the feature at ℓ ∼ 1800 becomes substantially less significant at ∼1 to 2σ confidence level. Results of our analysis based on the new approach we propose in this work are in agreement with other analyses done using alternative methods.
Complete genomic screen in Parkinson disease: evidence for multiple genes.
Scott, W K; Nance, M A; Watts, R L; Hubble, J P; Koller, W C; Lyons, K; Pahwa, R; Stern, M B; Colcher, A; Hiner, B C; Jankovic, J; Ondo, W G; Allen, F H; Goetz, C G; Small, G W; Masterman, D; Mastaglia, F; Laing, N G; Stajich, J M; Slotterbeck, B; Booze, M W; Ribble, R C; Rampersaud, E; West, S G; Gibson, R A; Middleton, L T; Roses, A D; Haines, J L; Scott, B L; Vance, J M; Pericak-Vance, M A
2001-11-14
The relative contribution of genes vs environment in idiopathic Parkinson disease (PD) is controversial. Although genetic studies have identified 2 genes in which mutations cause rare single-gene variants of PD and observational studies have suggested a genetic component, twin studies have suggested that little genetic contribution exists in the common forms of PD. To identify genetic risk factors for idiopathic PD. Genetic linkage study conducted 1995-2000 in which a complete genomic screen (n = 344 markers) was performed in 174 families with multiple individuals diagnosed as having idiopathic PD, identified through probands in 13 clinic populations in the continental United States and Australia. A total of 870 family members were studied: 378 diagnosed as having PD, 379 unaffected by PD, and 113 with unclear status. Logarithm of odds (lod) scores generated from parametric and nonparametric genetic linkage analysis. Two-point maximum parametric lod score (MLOD) and multipoint nonparametric lod score (LOD) linkage analysis detected significant evidence for linkage to 5 distinct chromosomal regions: chromosome 6 in the parkin gene (MLOD = 5.07; LOD = 5.47) in families with at least 1 individual with PD onset at younger than 40 years, chromosomes 17q (MLOD = 2.28; LOD = 2.62), 8p (MLOD = 2.01; LOD = 2.22), and 5q (MLOD = 2.39; LOD = 1.50) overall and in families with late-onset PD, and chromosome 9q (MLOD = 1.52; LOD = 2.59) in families with both levodopa-responsive and levodopa-nonresponsive patients. Our data suggest that the parkin gene is important in early-onset PD and that multiple genetic factors may be important in the development of idiopathic late-onset PD.
Dutta, Sandipan; Datta, Somnath
2016-06-01
The Wilcoxon rank-sum test is a popular nonparametric test for comparing two independent populations (groups). In recent years, there have been renewed attempts in extending the Wilcoxon rank sum test for clustered data, one of which (Datta and Satten, 2005, Journal of the American Statistical Association 100, 908-915) addresses the issue of informative cluster size, i.e., when the outcomes and the cluster size are correlated. We are faced with a situation where the group specific marginal distribution in a cluster depends on the number of observations in that group (i.e., the intra-cluster group size). We develop a novel extension of the rank-sum test for handling this situation. We compare the performance of our test with the Datta-Satten test, as well as the naive Wilcoxon rank sum test. Using a naturally occurring simulation model of informative intra-cluster group size, we show that only our test maintains the correct size. We also compare our test with a classical signed rank test based on averages of the outcome values in each group paired by the cluster membership. While this test maintains the size, it has lower power than our test. Extensions to multiple group comparisons and the case of clusters not having samples from all groups are also discussed. We apply our test to determine whether there are differences in the attachment loss between the upper and lower teeth and between mesial and buccal sites of periodontal patients. © 2015, The International Biometric Society.
Type I Error Rates and Power Estimates of Selected Parametric and Nonparametric Tests of Scale.
ERIC Educational Resources Information Center
Olejnik, Stephen F.; Algina, James
1987-01-01
Estimated Type I Error rates and power are reported for the Brown-Forsythe, O'Brien, Klotz, and Siegal-Tukey procedures. The effect of aligning the data using deviations from group means or group medians is investigated. (RB)
The Distribution of the Sum of Signed Ranks
ERIC Educational Resources Information Center
Albright, Brian
2012-01-01
We describe the calculation of the distribution of the sum of signed ranks and develop an exact recursive algorithm for the distribution as well as an approximation of the distribution using the normal. The results have applications to the non-parametric Wilcoxon signed-rank test.
NASA Astrophysics Data System (ADS)
Houssein, Hend A. A.; Jaafar, M. S.; Ramli, R. M.; Ismail, N. E.; Ahmad, A. L.; Bermakai, M. Y.
2010-07-01
In this study, the subpopulations of human blood parameters including lymphocytes, the mid-cell fractions (eosinophils, basophils, and monocytes), and granulocytes were determined by electronic sizing in the Health Centre of Universiti Sains Malaysia. These parameters have been correlated with human blood characteristics such as age, gender, ethnicity, and blood types; before and after irradiation with a 0.95 mW He-Ne laser (λ = 632.8 nm). The correlations were obtained by finding patterns, paired non-parametric tests, and independent non-parametric tests using SPSS version 11.5, centroid and peak positions, and flux variations. The findings show that the centroid and peak positions, flux peak and total flux, were very much correlated and can become a significant indicator for blood analyses. Furthermore, the encircled flux analysis demonstrated a good future prospect in blood research, thus leading the way as a vibrant diagnosis tool to clarify diseases associated with blood.
Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses
Callahan, Ben J.; Sankaran, Kris; Fukuyama, Julia A.; McMurdie, Paul J.; Holmes, Susan P.
2016-01-01
High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or OTU composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, whether parametric or nonparametric. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests and nonparametric testing using community networks and the ggnetwork package. PMID:27508062
Rank-based permutation approaches for non-parametric factorial designs.
Umlauft, Maria; Konietschke, Frank; Pauly, Markus
2017-11-01
Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample size, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies foster these theoretical findings. A real data set exemplifies its applicability. © 2017 The British Psychological Society.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
The linear transformation model with frailties for the analysis of item response times.
Wang, Chun; Chang, Hua-Hua; Douglas, Jeffrey A
2013-02-01
The item response times (RTs) collected from computerized testing represent an underutilized source of information about items and examinees. In addition to knowing the examinees' responses to each item, we can investigate the amount of time examinees spend on each item. In this paper, we propose a semi-parametric model for RTs, the linear transformation model with a latent speed covariate, which combines the flexibility of non-parametric modelling and the brevity as well as interpretability of parametric modelling. In this new model, the RTs, after some non-parametric monotone transformation, become a linear model with latent speed as covariate plus an error term. The distribution of the error term implicitly defines the relationship between the RT and examinees' latent speeds; whereas the non-parametric transformation is able to describe various shapes of RT distributions. The linear transformation model represents a rich family of models that includes the Cox proportional hazards model, the Box-Cox normal model, and many other models as special cases. This new model is embedded in a hierarchical framework so that both RTs and responses are modelled simultaneously. A two-stage estimation method is proposed. In the first stage, the Markov chain Monte Carlo method is employed to estimate the parametric part of the model. In the second stage, an estimating equation method with a recursive algorithm is adopted to estimate the non-parametric transformation. Applicability of the new model is demonstrated with a simulation study and a real data application. Finally, methods to evaluate the model fit are suggested. © 2012 The British Psychological Society.
Evaluation of non-stationarity of floods in the Northeastern and Upper Midwest United States
NASA Astrophysics Data System (ADS)
Dhakal, N.; Palmer, R. N.
2017-12-01
Climate change is likely to impact precipitation as well as snow accumulation and melt in the Northeastern and Upper Midwest United States, ultimately affecting the quantity and seasonal distribution of streamflow. Such information is crucial for flood protection policies, for example for regional flood frequency analysis. The objective of this study is to analyze the seasonality and magnitude of long-term daily annual maximum streamflow (AMF) records, and their changes, for 158 sites in the Northeastern and Upper Midwest United States. Temporal trends were analyzed based on two 30-year blocks (1951-1980 and 1981-2010) of AMF. Seasonality was assessed with a nonparametric directional/circular statistical method that allows an adaptive estimation of seasonal density. The results for temporal change in seasonality showed a mixed pattern across the stations. While for the majority of the stations the distribution of AMF timing is strongly unimodal (concentrated around the spring season) in the earlier period, the strength of the modes has weakened during the recent period for a number of stations along the coastal states, indicating the emergence of multiple modes and a corresponding change in seasonality. Assessment of the temporal change in the magnitude of AMF based on the Mann-Kendall nonparametric test shows that the majority of the stations do not exhibit a significant increasing or decreasing trend in either period. It is also observed that comparatively more stations show increasing trends in magnitude based on AMF from the earlier period, and most of these stations are coastal sites concentrated in the southeastern part of the region. Our study, focused on both the seasonality and the magnitude of AMF, has important implications for flood management and mitigation.
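The Mann-Kendall test used here for the magnitude trends has a compact closed form (the S statistic and its normal approximation). A minimal sketch on synthetic series, without the tie or autocorrelation corrections a production analysis would need:

```python
import numpy as np
from scipy.stats import norm

def mann_kendall(x):
    """Mann-Kendall trend test (basic form, no tie/autocorrelation correction)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # S statistic: sum of signs of all pairwise forward differences
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / np.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / np.sqrt(var_s)
    else:
        z = 0.0
    p = 2 * (1 - norm.cdf(abs(z)))       # two-sided p-value
    return s, z, p

rng = np.random.default_rng(1)
trend = np.arange(30) * 0.5 + rng.normal(0, 1, 30)   # clear upward trend
flat = rng.normal(0, 1, 30)                          # no trend
s1, z1, p1 = mann_kendall(trend)
s2, z2, p2 = mann_kendall(flat)
```

The trending series yields a large positive S and a small p-value; the trendless series yields an S near zero.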
Ryberg, Karen R.
2007-01-01
The Oakes Test Area is operated and maintained by the Garrison Diversion Conservancy District, under a cooperative agreement with the Bureau of Reclamation, to evaluate the effectiveness and environmental consequences of irrigation. As part of the evaluation, the Bureau of Reclamation collected water-quality samples from seven sites on the James River and the Oakes Test Area. The data were summarized and examined for trends in concentration. A nonparametric statistical test was used to test whether each concentration was increasing or decreasing with time for selected physical properties and constituents, and a trend slope was estimated for each constituent at each site. Trends were examined for two time periods, 1988-2004 and 1994-2004. Results varied by site and by constituent. All sites and all constituents tested had at least one statistically significant trend in the period 1988-2004. Sulfate, total dissolved solids, nitrate, and orthophosphate have significant positive trends at multiple sites with no significant negative trend at any site. Alkalinity and arsenic have single significant positive trends. Hardness, calcium, magnesium, sodium, sodium-adsorption ratio, potassium, and chloride have both significant positive and negative trends. Ammonia has a single significant negative trend. Fewer significant trends were identified in 1994-2004, and all but one were positive. The contribution to the James River from Oakes Test Area drainage appears to have little effect on water quality in the James River.
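The trend slope estimated for each constituent is commonly the robust Theil-Sen (Kendall) slope, paired with a rank-based significance test. A small scipy sketch on hypothetical annual concentration data (the actual constituent values are not reproduced here):

```python
import numpy as np
from scipy.stats import theilslopes, kendalltau

rng = np.random.default_rng(2)
years = np.arange(1988, 2005)                  # annual samples, 1988-2004
# Hypothetical sulfate-like series with a genuine upward trend plus noise
conc = 50 + 1.2 * (years - 1988) + rng.normal(0, 3, len(years))

slope, intercept, lo, hi = theilslopes(conc, years)  # robust trend slope + CI
tau, p = kendalltau(years, conc)                     # monotonic-trend p-value
```

The Theil-Sen slope is the median of all pairwise slopes, so a few outlying samples do not drag the trend estimate the way they would in ordinary least squares.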
NASA Astrophysics Data System (ADS)
Qarib, Hossein; Adeli, Hojjat
2015-12-01
In this paper, the authors introduce a new adaptive signal processing technique for feature extraction and parameter estimation in noisy exponentially damped signals. The iterative 3-stage method is based on the adroit integration of the strengths of parametric and nonparametric methods such as the multiple signal classification (MUSIC), matrix pencil, and empirical mode decomposition algorithms. The first stage is a new adaptive filtration or noise removal scheme. The second stage is a hybrid parametric-nonparametric signal parameter estimation technique based on an output-only system identification technique. The third stage is optimization of the estimated parameters using a combination of the primal-dual path-following interior point algorithm and a genetic algorithm. The methodology is evaluated using a synthetic signal and a signal obtained experimentally from transverse vibrations of a steel cantilever beam. The method is successful in estimating the frequencies accurately, and it further estimates the damping exponents. The proposed adaptive filtration method does not include any frequency-domain manipulation; consequently, the time-domain signal is not distorted by frequency-domain and inverse transformations.
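The matrix pencil step can be illustrated in a few lines of NumPy: the poles of a noiseless damped sinusoid are recovered as eigenvalues of a shifted Hankel pencil. This is a bare-bones sketch of the classical method, not the authors' three-stage pipeline:

```python
import numpy as np

def matrix_pencil(y, n_modes, L=None):
    """Estimate poles z_k of y[n] = sum_k a_k * z_k**n by the matrix pencil method."""
    y = np.asarray(y, dtype=complex)
    N = len(y)
    L = L or N // 2                                            # pencil parameter
    Y0 = np.array([y[i:i + L] for i in range(N - L)])          # Hankel matrix
    Y1 = np.array([y[i + 1:i + L + 1] for i in range(N - L)])  # one-sample shift
    ev = np.linalg.eigvals(np.linalg.pinv(Y0) @ Y1)            # pencil eigenvalues
    return ev[np.argsort(-np.abs(ev))][:n_modes]               # keep dominant modes

n = np.arange(60)
d, w = 0.05, 0.8                          # damping exponent, frequency (rad/sample)
sig = np.exp(-d * n) * np.cos(w * n)      # one damped sinusoid = two complex poles
poles = matrix_pencil(sig, n_modes=2)
damping = -np.log(np.abs(poles))          # recovers d for both conjugate poles
freqs = np.abs(np.angle(poles))           # recovers w
```

In the noiseless case the recovery is exact to numerical precision; with noise, the pencil parameter L and a rank truncation of the Hankel matrices control the robustness.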
A Non-parametric Cutout Index for Robust Evaluation of Identified Proteins*
Serang, Oliver; Paulo, Joao; Steen, Hanno; Steen, Judith A.
2013-01-01
This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems. PMID:23292186
Multi-object segmentation using coupled nonparametric shape and relative pose priors
NASA Astrophysics Data System (ADS)
Uzunbas, Mustafa Gökhan; Soldea, Octavian; Çetin, Müjdat; Ünal, Gözde; Erçil, Aytül; Unay, Devrim; Ekin, Ahmet; Firat, Zeynep
2009-02-01
We present a new method for multi-object segmentation in a maximum a posteriori estimation framework. Our method is motivated by the observation that neighboring or coupling objects in images generate configurations and co-dependencies which could potentially aid in segmentation if properly exploited. Our approach employs coupled shape and inter-shape pose priors that are computed using training images in a nonparametric multivariate kernel density estimation framework. The coupled shape prior is obtained by estimating the joint shape distribution of multiple objects, and the inter-shape pose priors are modeled via standard moments. Based on such statistical models, we formulate an optimization problem for segmentation, which we solve by an algorithm based on active contours. Our technique provides significant improvements in the segmentation of weakly contrasted objects in a number of applications. In particular, for medical image analysis, we use our method to extract brain Basal Ganglia structures, which are members of a complex multi-object system posing a challenging segmentation problem. We also apply our technique to the problem of handwritten character segmentation. Finally, we use our method to segment cars in urban scenes.
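The kernel density estimation at the heart of such shape/pose priors can be sketched with scipy's `gaussian_kde`. The 2-D "shape descriptors" below are hypothetical stand-ins for the paper's higher-dimensional shape and pose features:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical training descriptors for a coupled-object configuration
rng = np.random.default_rng(3)
train = rng.multivariate_normal([1.0, -0.5], [[0.2, 0.1], [0.1, 0.3]], size=200)

prior = gaussian_kde(train.T)        # nonparametric multivariate density estimate

typical = prior(np.array([[1.0], [-0.5]]))[0]    # density near the training mode
atypical = prior(np.array([[5.0], [5.0]]))[0]    # density far from training data
```

In a MAP segmentation, such a prior density would penalize candidate shape/pose configurations that lie far from everything seen in training.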
Variability in clubhead presentation characteristics and ball impact location for golfers' drives.
Betzler, Nils F; Monk, Stuart A; Wallace, Eric S; Otto, Steve R
2012-01-01
The purpose of the present study was to analyse the variability in clubhead presentation to the ball and the resulting ball impact location on the club face for a range of golfers of different ability. A total of 285 male and female participants hit multiple shots using one of four proprietary drivers. Self-reported handicap was used to quantify a participant's golfing ability. A bespoke motion capture system and user-written algorithms were used to track the clubhead just before and at impact, measuring clubhead speed, clubhead orientation, and impact location. A Doppler radar was used to measure golf ball speed. Generally, golfers of higher skill (lower handicap) generated higher clubhead speed and greater efficiency (ratio of ball speed to clubhead speed). Non-parametric statistical tests showed that low-handicap golfers exhibit significantly lower variability from shot to shot in clubhead speed, efficiency, impact location, attack angle, club path, and face angle compared with high-handicap golfers.
Teaching Communication Skills to Medical and Pharmacy Students Through a Blended Learning Course
Hagemeier, Nicholas E.; Blackwelder, Reid; Rose, Daniel; Ansari, Nasar; Branham, Tandy
2016-01-01
Objective. To evaluate the impact of an interprofessional blended learning course on medical and pharmacy students’ patient-centered interpersonal communication skills and to compare precourse and postcourse communication skills across first-year medical and second-year pharmacy student cohorts. Methods. Students completed ten 1-hour online modules and participated in five 3-hour group sessions over one semester. Objective structured clinical examinations (OSCEs) were administered before and after the course and were evaluated using the validated Common Ground Instrument. Nonparametric statistical tests were used to examine pre/postcourse domain scores within and across professions. Results. Performance in all communication skill domains increased significantly for all students. No additional significant pre/postcourse differences were noted across disciplines. Conclusion. Students’ patient-centered interpersonal communication skills improved across multiple domains using a blended learning educational platform. Interviewing abilities were similar between medical and pharmacy students postcourse, suggesting both groups respond well to this form of instruction. PMID:27293231
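The pre/post comparison within a cohort is a textbook use of the Wilcoxon signed-rank test. A sketch on hypothetical OSCE domain scores (the study's actual scores are not reproduced):

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(4)
pre = rng.normal(60, 8, 40)            # hypothetical precourse OSCE domain scores
post = pre + rng.normal(6, 4, 40)      # postcourse scores with a genuine gain
stat, p = wilcoxon(pre, post)          # paired, nonparametric comparison
```

Because the test works on the ranks of the paired differences, it needs no normality assumption about the score distributions.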
Enabling a Comprehensive Teaching Strategy: Video Lectures
ERIC Educational Resources Information Center
Brecht, H. David; Ogilby, Suzanne M.
2008-01-01
This study empirically tests the feasibility and effectiveness of video lectures as a form of video instruction that enables a comprehensive teaching strategy used throughout a traditional classroom course. It examines student use patterns and the videos' effects on student learning, using qualitative and nonparametric statistical analyses of…
Treatment of Selective Mutism: A Best-Evidence Synthesis.
ERIC Educational Resources Information Center
Stone, Beth Pionek; Kratochwill, Thomas R.; Sladezcek, Ingrid; Serlin, Ronald C.
2002-01-01
Presents systematic analysis of the major treatment approaches used for selective mutism. Based on nonparametric statistical tests of effect sizes, major findings include the following: treatment of selective mutism is more effective than no treatment; behaviorally oriented treatment approaches are more effective than no treatment; and no…
NASA Astrophysics Data System (ADS)
Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.
2015-12-01
Land use change (LUC) models used for modelling urban growth differ in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets, whereas global models perform modelling using all the available data. Non-parametric models are data driven and usually do not have a fixed model structure, or the model structure is unknown before the modelling process; parametric models, on the other hand, have a fixed structure set before the modelling process and are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model, multivariate adaptive regression splines (MARS), and a global parametric model, an artificial neural network (ANN), to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used the receiver operating characteristic (ROC) curve to compare the power of the two models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to the central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban areas in Mumbai, India.
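The AUC comparison can be computed without plotting the curve at all, via the rank (Mann-Whitney) identity. A sketch on hypothetical urban/non-urban suitability scores:

```python
import numpy as np
from scipy.stats import rankdata

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney rank identity."""
    scores, labels = np.asarray(scores, float), np.asarray(labels, bool)
    n_pos, n_neg = labels.sum(), (~labels).sum()
    ranks = rankdata(scores)                       # midranks handle ties
    return (ranks[labels].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(5)
labels = rng.random(1000) < 0.3                    # 1 = urbanized cell (hypothetical)
scores = labels * 1.5 + rng.normal(0, 1, 1000)     # informative but noisy model output
a = auc(scores, labels)
```

The AUC equals the probability that a randomly chosen urbanized cell receives a higher score than a randomly chosen non-urbanized cell, which is exactly the scale on which the 94.77% vs 95.36% figures above are compared.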
Nonparametric Estimation of the Probability of Ruin.
1985-02-01
Nonparametric Estimation of the Probability of Ruin. Edward W. Frees, Mathematics Research Center, University of Wisconsin-Madison, 610 Walnut... MRC Technical Summary Report, February 1985.
Marginally specified priors for non-parametric Bayesian estimation
Kessler, David C.; Hoff, Peter D.; Dunson, David B.
2014-01-01
Summary: Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
Health Insurance: The Trade-Off Between Risk Pooling and Moral Hazard.
1989-12-01
bias comes about because we suppress the intercept term in estimating V. For the power, the test is against 1, -1. With this transform, the risk... dealing with the same utility function. As one test of whether families behave in the way economic theory suggests, we have also fitted a probit model of... nonparametric alternative to test our results' sensitivity to the assumption of a normal error in both the theoretical and empirical models of the
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas, including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests, and the area under the ROC curve (AUC) is often used as a measure of the overall performance of a diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results, taking the dependence structure into account via a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations; it uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling dependence between random variables: it is a joint distribution function whose marginals are all uniformly distributed, and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely maximum likelihood estimation (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
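For the Gaussian copula in particular, the parametric fit can be sketched by correlating the normal scores of rank-transformed data. This rank-based estimator is a common approximation to the MLE when the margins are handled nonparametrically; the two "diagnostic tests" below are simulated:

```python
import numpy as np
from scipy.stats import norm, rankdata

def gaussian_copula_rho(u, v):
    """Rank-transform to pseudo-observations, map to normal scores, correlate."""
    n = len(u)
    zu = norm.ppf(rankdata(u) / (n + 1))   # pseudo-observations in (0,1) -> normal scores
    zv = norm.ppf(rankdata(v) / (n + 1))
    return np.corrcoef(zu, zv)[0, 1]

# Two correlated "test results" with deliberately non-normal margins
rng = np.random.default_rng(6)
z = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=2000)
t1 = np.exp(z[:, 0])          # lognormal margin
t2 = z[:, 1] ** 3             # skewed margin, same underlying dependence
rho = gaussian_copula_rho(t1, t2)
```

Because the estimator works on ranks, the monotone distortions of the margins do not affect it: it recovers the underlying dependence parameter (0.7 here) regardless of the marginal shapes.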
Evaluation of the flipped classroom approach in a veterinary professional skills course
Moffett, Jenny; Mill, Aileen C
2014-01-01
Background The flipped classroom is an educational approach that has had much recent coverage in the literature. Relatively few studies, however, use objective assessment of student performance to measure the impact of the flipped classroom on learning. The purpose of this study was to evaluate the use of a flipped classroom approach within a medical education setting against the first two levels of Kirkpatrick and Kirkpatrick’s effectiveness-of-training framework. Methods This study examined the use of a flipped classroom approach within a professional skills course offered to postgraduate veterinary students. A questionnaire was administered to two cohorts of students: those who had completed a traditional, lecture-based version of the course (Introduction to Veterinary Medicine [IVM]) and those who had completed a flipped classroom version (Veterinary Professional Foundations I [VPF I]). The academic performance of students within both cohorts was assessed using a set of multiple-choice items (n=24) nested within a written examination. Data obtained from the questionnaire were analyzed using Cronbach’s alpha, Kruskal–Wallis tests, and factor analysis. Data obtained from student performance in the written examination were analyzed using the nonparametric Wilcoxon rank sum test. Results A total of 133 IVM students and 64 VPF I students (n=197) agreed to take part in the study. Overall, study participants favored the flipped classroom approach over the traditional classroom approach. With respect to student academic performance, the traditional classroom students outperformed the flipped classroom students on a series of multiple-choice items (IVM mean =21.4±1.48 standard deviation; VPF I mean =20.25±2.20 standard deviation; Wilcoxon test, w=7,578; P<0.001). Conclusion This study demonstrates that learners seem to prefer a flipped classroom approach. The flipped classroom was rated more positively than the traditional classroom on many different characteristics.
This preference, however, did not translate into improved student performance, as assessed by a series of multiple-choice items delivered during a written examination. PMID:25419164
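The cohort comparison described above is a standard two-sample Wilcoxon rank-sum (Mann-Whitney) setting. A sketch on simulated scores drawn to resemble the reported means and SDs (not the actual student data):

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(7)
# Hypothetical MCQ totals (max 24) shaped like the reported cohort statistics
ivm = np.clip(rng.normal(21.4, 1.48, 133), 0, 24)    # traditional cohort
vpf = np.clip(rng.normal(20.25, 2.20, 64), 0, 24)    # flipped cohort
u, p = mannwhitneyu(ivm, vpf, alternative='two-sided')
```

Unlike the paired Wilcoxon signed-rank test, this version compares two independent groups of (possibly unequal) size, which matches the 133-vs-64 design.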
Testing for nonlinear dependence in financial markets.
Dore, Mohammed; Matilla-Garcia, Mariano; Marin, Manuel Ruiz
2011-07-01
This article addresses the question of improving the detection of nonlinear dependence by means of recently developed nonparametric tests. To this end, a generalized version of the BDS test and a new test based on symbolic dynamics are applied to realizations from a well-known artificial market for which the dynamic equation governing the market is known. Comparisons with other tests for detecting nonlinearity are also provided. We show that the test based on symbolic dynamics outperforms the other tests, with the advantage that it depends on only one free parameter, namely the embedding dimension. This does not hold for the other tests for nonlinearity.
Stennett, Andrea; De Souza, Lorraine; Norris, Meriel
2018-07-01
Exercise and physical activity have been found to be beneficial in managing disabilities caused by multiple sclerosis. Despite the known benefits, many people with multiple sclerosis are inactive. This study aimed to identify the prioritised exercise and physical activity practices of people with multiple sclerosis living in the community and the reasons why they engage in these activities. A four-round Delphi questionnaire scoped and determined consensus on the priorities for the top 10 exercise and physical activities and the reasons why people with multiple sclerosis (n = 101) engage in these activities. Data were analysed using content analysis, descriptive statistics, and non-parametric tests. The top 10 exercise and physical activity practices and the top 10 reasons why people with multiple sclerosis (n = 70) engaged in these activities were identified and prioritised. Consensus was achieved for the exercise and physical activities (W = 0.744, p < .0001) and for the reasons they engaged in exercise and physical activity (W = 0.723, p < .0001). The exercise and physical activity practices, and the reasons people with multiple sclerosis engaged in them, were diverse. These self-selected activities and reasons highlight that people with multiple sclerosis may conceptualise exercise and physical activity in ways that are not fully appreciated or understood by health professionals. Consideration of the views of people with multiple sclerosis may be essential if the goal of increasing physical activity in this population is to be achieved. Implications for Rehabilitation Health professionals should work collaboratively with people with multiple sclerosis to understand how they prioritise activities, the underlying reasons for their prioritisations, and embed these into rehabilitation programmes.
Health professionals should utilise activities prioritised by people with multiple sclerosis in the community as a way to support, promote, and sustain exercise and physical activity in this population. Rehabilitation interventions should include both the activities people with multiple sclerosis prioritise and the reasons why they engage in exercise and physical activity as another option for increasing physical activity levels and reducing sedentary behaviours.
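The W values reported above are Kendall's coefficient of concordance, which can be computed directly from a raters-by-items rank matrix. A minimal sketch, without the tie correction, on synthetic rating data:

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(ratings):
    """Kendall's coefficient of concordance W for an (m raters x n items) array."""
    ratings = np.asarray(ratings, dtype=float)
    m, n = ratings.shape
    ranks = np.apply_along_axis(rankdata, 1, ratings)  # each rater ranks the items
    r_sum = ranks.sum(axis=0)                          # column (item) rank totals
    s = ((r_sum - r_sum.mean()) ** 2).sum()
    return 12 * s / (m ** 2 * (n ** 3 - n))            # W in [0, 1], no tie correction

perfect = np.tile(np.arange(1, 11), (5, 1))            # 5 raters, identical rankings
noisy = np.random.default_rng(8).random((5, 10))       # unrelated ratings
```

W = 1 means all respondents rank the items identically; values like the study's 0.744 and 0.723 indicate strong but not perfect agreement.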
Power Performance Verification of a Wind Farm Using the Friedman's Test.
Hernandez, Wilmar; López-Presa, José Luis; Maldonado-Correa, Jorge L
2016-06-03
In this paper, a method of verification of the power performance of a wind farm is presented. This method is based on Friedman's test, a nonparametric statistical inference technique, and it uses the information collected by the SCADA system from the sensors embedded in the wind turbines in order to carry out the power performance verification of a wind farm. Here, the guaranteed power curve of the wind turbines is used as one more wind turbine of the wind farm under assessment, and a multiple comparison method is used to investigate differences between pairs of wind turbines with respect to their power performance. The proposed method indicates whether the power performance of the specific wind farm under assessment differs significantly from what would be expected, and it also allows wind farm owners to know whether their wind farm has either a perfect power performance or an acceptable power performance. Finally, the power performance verification of an actual wind farm is carried out. The results of the application of the proposed method showed that the power performance of the specific wind farm under assessment was acceptable.
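Friedman's test itself is available in scipy. A sketch with three hypothetical turbines observed over matched wind-speed bins, one of them deliberately underperforming (not the paper's SCADA data):

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(9)
# Hypothetical power outputs (kW) of 3 turbines across 12 matched wind-speed bins
base = np.linspace(200, 1800, 12) + rng.normal(0, 20, 12)
t1 = base + rng.normal(0, 15, 12)
t2 = base + rng.normal(0, 15, 12)
t3 = base - 120 + rng.normal(0, 15, 12)       # consistently underperforming turbine
stat, p = friedmanchisquare(t1, t2, t3)
```

The test ranks the turbines within each wind-speed bin (the "block"), so the huge between-bin differences in absolute power do not swamp the comparison; a significant result would then be followed by the pairwise multiple comparisons described above.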
Wang, Qian; Xiao, Yong-fu; Vuitton, Dominique A; Schantz, Peter M; Raoul, Francis; Budke, Christine; Campos-Ponce, Maiza; Craig, Philip S; Giraudoux, Patrick
2007-02-05
Overgrazing was assumed to increase the population density of the small mammals that are the intermediate hosts of Echinococcus multilocularis, the pathogen of alveolar echinococcosis, in the Qinghai-Tibet Plateau. This research tested the hypothesis that overgrazing might promote Echinococcus multilocularis transmission by increasing populations of small mammal intermediate hosts in Tibetan pastoral communities. Grazing practices, small mammal indices, and dog Echinococcus multilocularis infection data were collected to analyze the relation between overgrazing and Echinococcus multilocularis transmission using nonparametric tests and multiple stepwise logistic regression. In the investigated area, raising livestock was a key industry; communal pastures existed, and the available forage was insufficient for grazing. Open (common) pastures were overgrazed and had a higher burrow density of small mammals than neighboring fenced (private) pastures, indicating that the high grazing pressure on open pastures led to a higher burrow density of small mammals there. The median burrow density of small mammals in open pastures was independently associated with nearby canine Echinococcus multilocularis infection (P = 0.003, OR = 1.048). Overgrazing may therefore promote the transmission of Echinococcus multilocularis by increasing the population density of small mammals.
The association of malocclusion and trumpet performance.
Kula, Katherine; Cilingir, H Zeynep; Eckert, George; Dagg, Jack; Ghoneima, Ahmed
2016-01-01
To determine whether trumpet performance skills are associated with malocclusion. Following institutional review board approval, 70 university trumpet students (54 male, 16 female; aged 20-38.9 years) gave consent. After completing a survey, the students were evaluated while playing a scripted performance skills test (flexibility, articulation, range, and endurance exercises) on their instrument in a soundproof music practice room. One investigator (a trumpet teacher) used a computerized metronome and a decibel meter during the evaluation. A three-dimensional (3D) cone-beam computed tomography (CBCT) scan was taken of each student on the same day as the skills test. Following reliability studies, multiple dental parameters were measured on the 3D CBCT scans. Nonparametric (Spearman) correlations, accepting P < .05 as significant, were used to determine whether there were significant associations between dental parameters and the performance skills. Intrarater reliability was excellent (intraclass correlations; all r values > .94). Although the associations were weak to moderate, significant negative associations (r ≤ -.32) were found between Little's irregularity index, interincisal inclination, maxillary central incisor rotation, and various flexibility and articulation performance skills, whereas significant positive associations (r ≤ .49) were found between arch widths and various skills. Specific malocclusions are associated with the trumpet performance of experienced young musicians.
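The Spearman correlations used above can be reproduced with scipy on any pair of measured variables. The data below are hypothetical, sized like the study (n = 70), with a built-in negative association:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(10)
# Hypothetical: incisor irregularity (mm) vs a flexibility skill score
irregularity = rng.uniform(0, 8, 70)
skill = 80 - 2.5 * irregularity + rng.normal(0, 8, 70)   # noisy negative relationship
rho, p = spearmanr(irregularity, skill)
```

Spearman's rho correlates the ranks rather than the raw values, so it captures any monotone association without assuming linearity or normal scores.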
Using GIS to analyze animal movements in the marine environment
Hooge, Philip N.; Eichenlaub, William M.; Solomon, Elizabeth K.; Kruse, Gordon H.; Bez, Nicolas; Booth, Anthony; Dorn, Martin W.; Hills, Susan; Lipcius, Romuald N.; Pelletier, Dominique; Roy, Claude; Smith, Stephen J.; Witherell, David B.
2001-01-01
Advanced methods for analyzing animal movements have been little used in the aquatic research environment compared to the terrestrial. In addition, despite obvious advantages of integrating geographic information systems (GIS) with spatial studies of animal movement behavior, movement analysis tools have not been integrated into GIS for either aquatic or terrestrial environments. We therefore developed software that integrates one of the most commonly used GIS programs (ArcView®) with a large collection of animal movement analysis tools. This application, the Animal Movement Analyst Extension (AMAE), can be loaded as an extension to ArcView® under multiple operating system platforms (PC, Unix, and Mac OS). It contains more than 50 functions, including parametric and nonparametric home range analyses, random walk models, habitat analyses, point and circular statistics, tests of complete spatial randomness, tests for autocorrelation and sample size, point and line manipulation tools, and animation tools. This paper describes the use of these functions in analyzing animal location data; some limited examples are drawn from a sonic-tracking study of Pacific halibut (Hippoglossus stenolepis) in Glacier Bay, Alaska. The extension is available on the Internet at www.absc.usgs.gov/glba/gistools/index.htm.
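One of the listed analyses, the minimum convex polygon home range, is easy to sketch with scipy's ConvexHull. This is an illustrative reimplementation on simulated telemetry fixes, not the AMAE code; the 95% trimming rule used here (dropping the fixes farthest from the centroid) is one common convention:

```python
import numpy as np
from scipy.spatial import ConvexHull

def mcp_home_range(points, frac=0.95):
    """Minimum convex polygon home range: hull area of the frac of fixes
    closest to the centroid (trims outlying locations)."""
    pts = np.asarray(points, dtype=float)
    d = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    keep = pts[np.argsort(d)[:int(frac * len(pts))]]
    hull = ConvexHull(keep)
    return hull.volume                    # for 2-D input, .volume is the area

rng = np.random.default_rng(11)
fixes = rng.normal(0, 1, (500, 2))        # simulated telemetry fixes (arbitrary units)
area95 = mcp_home_range(fixes, 0.95)
area50 = mcp_home_range(fixes, 0.50)      # "core area" from the innermost half
```

Nesting the trimming fractions (50% within 95%) gives the familiar core-versus-total home range comparison.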
Gan, Yandong; Wang, Lihong; Yang, Guiqiang; Dai, Jiulan; Wang, Renqing; Wang, Wenxing
2017-10-01
A field survey was conducted to investigate the concentrations of chromium (Cr), nickel (Ni), copper (Cu), zinc (Zn), cadmium (Cd) and lead (Pb) in vegetables, corresponding cultivated soils and irrigation waters from 36 open sites in a high natural background area of Wuzhou, South China. Redundancy analysis, Spearman's rho correlation analysis and multiple regression analysis were adopted to evaluate the contributions of impacting factors to metal contents in the edible parts of vegetables. This study concluded that, according to nonparametric tests, leafy and root vegetables had relatively higher metal concentrations and adjusted transfer factor values than fruiting vegetables. Plant species, total soil metal content and soil pH value were confirmed as the three critical factors with the highest contribution rates among all the influencing factors. The bivariate curve equation models for heavy metals in the edible vegetable tissues were well fitted to predict the metal concentrations in vegetables. The results from this case study also suggest that breeding vegetable varieties with low heavy metal accumulation and enlarging the planting scale of these varieties could be an efficient strategy for clean agricultural production and food safety in high natural background areas. Copyright © 2017 Elsevier Ltd. All rights reserved.
Dextromethorphan/Quinidine in Migraine Prophylaxis: An Open-label Observational Clinical Study.
Berkovich, Regina R; Sokolov, Alexey Y; Togasaki, Daniel M; Yakupova, Aida A; Cesar, Paul-Henry; Sahai-Srivastava, Soma
This study aimed to assess the potential efficacy and safety of dextromethorphan/quinidine (DMQ) in the prophylactic treatment of migraine in patients with multiple sclerosis (MS) with superimposed pseudobulbar affect (PBA). Multiple sclerosis patients with superimposed PBA and comorbid migraine were enrolled into this open-label observational study at the University of Southern California Comprehensive MS Center. The baseline characteristics included, among other data, frequency and severity of acute migraine attacks and use of migraine relievers. DMQ was used exclusively per its primary indication, PBA symptom control: 20/10 mg orally, twice a day, for a mean of 4.5 months (the shortest exposure registered was 3 months and the longest, 6 months). To determine whether treatment affected migraine frequency and severity, the baseline and posttreatment values were compared using the nonparametric sign test. Thirty-three MS subjects with PBA, who also suffered from migraines, were identified. Twenty-nine subjects had improvement in headache frequency, 4 had no change, and none had worsening (P < 0.001 as compared with the baseline). Twenty-eight subjects had improvement in headache severity, 5 had no change, and none had worsening (P < 0.001). Our pilot study results provide evidence that DMQ shows promise as a candidate for larger clinical studies evaluating its efficacy for the prevention of migraine headaches.
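The baseline-versus-posttreatment comparison uses the sign test. A minimal sketch of the exact two-sided sign test (ties dropped beforehand, as is standard), applied to the headache-frequency counts above as a worked example:

```python
from math import comb

def sign_test(n_pos, n_neg):
    """Exact two-sided sign test on the non-tied pairs.
    Under H0, the count of positive differences is Binomial(n, 1/2)."""
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With 29 of 29 non-tied pairs improving, the exact two-sided p-value is 2/2^29 ≈ 3.7e-9, comfortably below the reported P < 0.001.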
ERIC Educational Resources Information Center
Morgan, Daryle Whitney
To assess the status of welding in various manufacturing industries and to ascertain the occupational preparation needed for welding tradesmen, technicians, and technologists, completed questionnaires were obtained from 138 selected industrial specialists. The hypotheses were tested by Friedman's two-way nonparametric analysis of variance and by…
A Nonparametric Geostatistical Method For Estimating Species Importance
Andrew J. Lister; Rachel Riemann; Michael Hoppus
2001-01-01
Parametric statistical methods are not always appropriate for conducting spatial analyses of forest inventory data. Parametric geostatistical methods such as variography and kriging are essentially averaging procedures, and thus can be affected by extreme values. Furthermore, non-normal distributions violate the assumptions of analyses in which test statistics are...
Do College Faculty Embrace Web 2.0 Technology?
ERIC Educational Resources Information Center
Siha, Samia M.; Bell, Reginald Lamar; Roebuck, Deborah
2016-01-01
The authors sought to determine if Rogers's Innovation Decision Process model could analyze Web 2.0 usage within the collegiate environment. The key independent variables studied in relationship to this model were gender, faculty rank, course content delivery method, and age. Chi-square nonparametric tests on the independent variables across…
A nonparametric clustering technique which estimates the number of clusters
NASA Technical Reports Server (NTRS)
Ramey, D. B.
1983-01-01
In applications of cluster analysis, one usually needs to determine the number of clusters, K, and the assignment of observations to each cluster. A clustering technique based on recursive application of a multivariate test of bimodality which automatically estimates both K and the cluster assignments is presented.
Least squares regression methods for clustered ROC data with discrete covariates.
Tang, Liansheng Larry; Zhang, Wei; Li, Qizhai; Ye, Xuan; Chan, Leighton
2016-07-01
The receiver operating characteristic (ROC) curve is a popular tool for evaluating and comparing the accuracy of diagnostic tests in distinguishing the diseased group from the nondiseased group when test results are continuous or ordinal. A complicated data setting occurs when multiple tests are measured on abnormal and normal locations from the same subject and the measurements are clustered within the subject. Although least squares regression methods can be used to estimate the ROC curve from correlated data, least squares methods for estimating the ROC curve from clustered data have not been studied, and their statistical properties under the clustering setting are unknown. In this article, we develop least squares ROC methods that allow the baseline and link functions to differ and, more importantly, accommodate clustered data with discrete covariates. The methods generate smooth ROC curves that satisfy the inherent continuity of the true underlying curve. The least squares methods are shown to be more efficient than existing nonparametric ROC methods under appropriate model assumptions in simulation studies. We apply the methods to a real example in the detection of glaucomatous deterioration. We also derive the asymptotic properties of the proposed methods. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
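The paper's least squares estimators are not reproduced here, but the nonparametric baseline they are compared against is easy to state: the area under the empirical ROC curve equals the Mann-Whitney probability P(D > H) + 0.5 P(D = H), where D and H are diseased and healthy test results. A minimal sketch:

```python
def empirical_auc(diseased, healthy):
    """Empirical ROC area via the Mann-Whitney statistic:
    the fraction of (diseased, healthy) pairs correctly ordered,
    counting ties as one half."""
    pairs = [(d > h) + 0.5 * (d == h) for d in diseased for h in healthy]
    return sum(pairs) / len(pairs)
```

An AUC of 0.5 corresponds to a useless test; 1.0 to perfect separation of the two groups.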
NASA Astrophysics Data System (ADS)
Meresescu, Alina G.; Kowalski, Matthieu; Schmidt, Frédéric; Landais, François
2018-06-01
The Water Residence Time distribution is the equivalent of the impulse response of a linear system describing the propagation of water through a medium, e.g. the propagation of rain water from the top of a mountain towards the aquifers. We consider the output aquifer levels as the convolution between the input rain levels and the Water Residence Time, starting with an initial aquifer base level. The estimation of Water Residence Time is important for a better understanding of hydro-bio-geochemical processes and mixing properties of wetlands used as filters in ecological applications, as well as for protecting fresh water sources for wells from pollutants. Common methods of estimating the Water Residence Time focus on cross-correlation, parameter fitting and non-parametric deconvolution. Here we propose a 1D full-deconvolution, regularized, non-parametric inverse problem algorithm that enforces smoothness and uses constraints of causality and positivity to estimate the Water Residence Time curve. Compared to Bayesian non-parametric deconvolution approaches, it has a fast runtime per test case; compared to the popular and fast cross-correlation method, it produces a more precise Water Residence Time curve even in the case of noisy measurements. The algorithm needs only one regularization parameter to balance between smoothness of the Water Residence Time and accuracy of the reconstruction. We propose an approach for automatically finding a suitable value of the regularization parameter from the input data alone. Tests on real data illustrate the potential of this method to analyze hydrological datasets.
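A minimal sketch of the kind of regularized, nonnegative deconvolution described above (illustrative only; the discretization and all names are assumptions, not the authors' implementation). Causality enters through the lower-triangular convolution matrix, positivity through a projected gradient step, and smoothness through a second-difference penalty:

```python
import numpy as np

def estimate_wrt(rain, levels, m, lam=1e-6, iters=5000):
    """Nonnegative, smoothness-regularized deconvolution:
    minimize ||A h - levels||^2 + lam ||D h||^2  subject to  h >= 0,
    where A is the causal (lower-triangular Toeplitz) convolution
    matrix built from the rain series and D is the second-difference
    operator. Solved by projected gradient descent."""
    n = len(levels)
    A = np.zeros((n, m))
    for j in range(m):                      # causal Toeplitz columns
        A[j:, j] = rain[:n - j]
    D = np.diff(np.eye(m), 2, axis=0)       # smoothness penalty
    M = A.T @ A + lam * (D.T @ D)
    b = A.T @ np.asarray(levels)
    step = 1.0 / np.linalg.norm(M, 2)       # safe step from spectral norm
    h = np.zeros(m)
    for _ in range(iters):
        h = np.maximum(0.0, h - step * (M @ h - b))  # project onto h >= 0
    return h
```

The single parameter lam plays the role of the paper's one regularization parameter, trading smoothness of the recovered curve against fidelity to the measurements.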
Researches of fruit quality prediction model based on near infrared spectrum
NASA Astrophysics Data System (ADS)
Shen, Yulin; Li, Lian
2018-04-01
With rising standards for food quality and safety, people pay more attention to the internal quality of fruit, so its measurement is increasingly important. Nondestructive analysis of soluble solid content (SSC) and total acid content (TAC) is a vital and effective quality measure in global fresh produce markets, so in this paper we aim to establish a novel fruit internal quality prediction model based on SSC and TAC for near-infrared spectra. First, prediction models based on PCA + BP neural network, PCA + GRNN network, PCA + BP adaboost strong classifier, PCA + ELM and PCA + LS_SVM classifier are designed and implemented; second, in the NSCT domain, the median filter and the Savitzky-Golay filter are used to preprocess the spectral signal, and the Kennard-Stone algorithm is used to select training and test samples automatically; third, we obtain the optimal models by comparing 15 prediction models under a multi-classifier competition mechanism. Specifically, nonparametric estimation is introduced to measure the effectiveness of each model: the reliability and variance of the nonparametric evaluation of each prediction model are used to assess its predictions, with the estimated value and confidence interval as a reference. The experimental results demonstrate that this approach achieves a better evaluation of the internal quality of fruit. Finally, we employ cat swarm optimization to optimize the two best models obtained from the nonparametric estimation; empirical testing indicates that the proposed method provides more accurate and effective results than other forecasting methods.
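All of the candidate models above share a PCA front end. A minimal sketch of that dimension-reduction step via SVD (illustrative only; the paper's spectral preprocessing and downstream classifiers are not reproduced):

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components.
    Columns are centered first; directions come from the SVD of the
    centered data matrix."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T        # scores in the k-dimensional subspace
```

The scores returned here would then feed whichever regressor or classifier follows the PCA step in a pipeline of this kind.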
Rodríguez-Álvarez, María Xosé; Roca-Pardiñas, Javier; Cadarso-Suárez, Carmen; Tahoces, Pablo G
2018-03-01
Prior to using a diagnostic test in a routine clinical setting, the rigorous evaluation of its diagnostic accuracy is essential. The receiver-operating characteristic curve is the measure of accuracy most widely used for continuous diagnostic tests. However, the possible impact of extra information about the patient (or even the environment) on diagnostic accuracy also needs to be assessed. In this paper, we focus on an estimator for the covariate-specific receiver-operating characteristic curve based on direct regression modelling and nonparametric smoothing techniques. This approach defines the class of generalised additive models for the receiver-operating characteristic curve. The main aim of the paper is to offer new inferential procedures for testing the effect of covariates on the conditional receiver-operating characteristic curve within the above-mentioned class. Specifically, two different bootstrap-based tests are suggested to check (a) the possible effect of continuous covariates on the receiver-operating characteristic curve and (b) the presence of factor-by-curve interaction terms. The validity of the proposed bootstrap-based procedures is supported by simulations. To facilitate the application of these new procedures in practice, an R-package, known as npROCRegression, is provided and briefly described. Finally, data derived from a computer-aided diagnostic system for the automatic detection of tumour masses in breast cancer is analysed.
2008-01-01
Background There is a lack of tools to evaluate and compare Electronic patient record (EPR) systems to inform a rational choice or development agenda. Objective To develop a tool kit to measure the impact of different EPR system features on the consultation. Methods We first developed a specification to overcome the limitations of existing methods. We divided this into work packages: (1) developing a method to display multichannel video of the consultation; (2) code and measure activities, including computer use and verbal interactions; (3) automate the capture of nonverbal interactions; (4) aggregate multiple observations into a single navigable output; and (5) produce an output interpretable by software developers. We piloted this method by filming live consultations (n = 22) by 4 general practitioners (GPs) using different EPR systems. We compared the time taken and variations during coded data entry, prescribing, and blood pressure (BP) recording. We used nonparametric tests to make statistical comparisons. We contrasted methods of BP recording using Unified Modeling Language (UML) sequence diagrams. Results We found that 4 channels of video were optimal. We identified an existing application for manual coding of video output. We developed in-house tools for capturing use of keyboard and mouse and to time stamp speech. The transcript is then typed within this time stamp. Although we managed to capture body language using pattern recognition software, we were unable to use this data quantitatively. We loaded these observational outputs into our aggregation tool, which allows simultaneous navigation and viewing of multiple files. This also creates a single exportable file in XML format, which we used to develop UML sequence diagrams. In our pilot, the GP using the EMIS LV (Egton Medical Information Systems Limited, Leeds, UK) system took the longest time to code data (mean 11.5 s, 95% CI 8.7-14.2). 
Nonparametric comparison of EMIS LV with the other systems showed a significant difference, with EMIS PCS (Egton Medical Information Systems Limited, Leeds, UK) (P = .007), iSoft Synergy (iSOFT, Banbury, UK) (P = .014), and INPS Vision (INPS, London, UK) (P = .006) facilitating faster coding. In contrast, prescribing was fastest with EMIS LV (mean 23.7 s, 95% CI 20.5-26.8), but nonparametric comparison showed no statistically significant difference. UML sequence diagrams showed that the simplest BP recording interface was not the easiest to use, as users spent longer navigating or looking up previous blood pressures separately. Complex interfaces with free-text boxes left clinicians unsure of what to add. Conclusions The ALFA method allows the precise observation of the clinical consultation. It enables rigorous comparison of core elements of EPR systems. Pilot data suggests its capacity to demonstrate differences between systems. Its outputs could provide the evidence base for making more objective choices between systems. PMID:18812313
A simple randomisation procedure for validating discriminant analysis: a methodological note.
Wastell, D G
1987-04-01
Because the goal of discriminant analysis (DA) is to optimise classification, it by design exaggerates between-group differences. This bias complicates validation of DA. Jack-knifing has been used for validation but is inappropriate when stepwise selection (SWDA) is employed. A simple randomisation test is presented and shown to give correct decisions for SWDA. The general superiority of randomisation tests over orthodox significance tests is discussed. Current work on non-parametric methods for estimating the error rates of prediction rules is briefly reviewed.
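The randomisation logic can be sketched as follows, with a deliberately crude single-feature classifier standing in for stepwise discriminant analysis (all names are illustrative). The essential point is that feature selection is re-run inside every label permutation, so the selection bias the note describes is built into the null distribution:

```python
import random

def fit_and_score(X, y):
    """Stand-in for SWDA: pick the single feature whose class-mean
    midpoint threshold gives the best training accuracy."""
    best = 0.0
    for j in range(len(X[0])):
        m0 = sum(x[j] for x, t in zip(X, y) if t == 0) / y.count(0)
        m1 = sum(x[j] for x, t in zip(X, y) if t == 1) / y.count(1)
        thr = (m0 + m1) / 2
        pred = [int((x[j] > thr) == (m1 > m0)) for x in X]
        best = max(best, sum(p == t for p, t in zip(pred, y)) / len(y))
    return best

def randomization_p(X, y, n_perm=200, seed=0):
    """Permutation p-value for the observed (selection-inflated)
    accuracy: the whole select-and-fit pipeline is repeated on each
    shuffled label vector."""
    rng = random.Random(seed)
    obs = fit_and_score(X, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if fit_and_score(X, y_perm) >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the permuted runs also get to cherry-pick their best feature, the resulting p-value is honest about the optimism of the selection step.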
Yelnik, A P; Tasseel Ponche, S; Andriantsifanetra, C; Provost, C; Calvalido, A; Rougier, P
2015-12-01
The Romberg test, with the subject standing and with eyes closed, gives diagnostic arguments for a proprioceptive disorder. Closing the eyes is also used in balance rehabilitation as a main way to stimulate neural plasticity with proprioceptive, vestibular and even cerebellar disorders. Nevertheless, standing and walking with eyes closed or with eyes open in the dark are certainly 2 different tasks. We aimed to compare walking with eyes open, eyes closed, and eyes open wearing black or white goggles in healthy subjects. A total of 50 healthy participants were randomly divided into 2 protocols and asked to walk on a 5-m pressure-sensitive mat under 3 conditions: (1) eyes open (EO), eyes closed (EC) and eyes open with black goggles (BG); and (2) EO, EO with black goggles and EO with white goggles (WG). Gait was described by velocity (m·s(-1)), double support (% gait cycle), gait variability index (GVI/100) and exit from the mat (%). Analysis involved repeated-measures ANOVA, the Holm-Sidak multiple comparisons test for parametric parameters (GVI) and Dunn's multiple comparisons test for non-parametric parameters. As compared with walking with EC, walking with BG produced lower median velocity, by 6% (EO 1.26; BG 1.01 vs EC 1.07 m·s(-1), P=0.0328), and lower mean GVI, by 8% (EO 91.8; BG 66.8 vs EC 72.24, P=0.009). Parameters did not differ between the BG and WG conditions. The goggle task is more difficult than the Romberg task, so it can be proposed to increase the difficulty of walking with visual deprivation gradually (from eyes closed to eyes open in black goggles). Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Rediscovery of Good-Turing estimators via Bayesian nonparametrics.
Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye
2016-03-01
The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, design of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two-parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
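The frequentist side of the comparison is concrete enough to sketch: the classical Good-Turing estimator of the discovery probability (the chance that the next observation is a previously unseen species) is m1/n, where m1 counts the species observed exactly once in a sample of size n:

```python
from collections import Counter

def good_turing_new(sample):
    """Good-Turing estimate of the probability that the next draw
    is a previously unseen species: (number of singletons) / n."""
    counts = Counter(sample)
    m1 = sum(1 for c in counts.values() if c == 1)
    return m1 / len(sample)
```

The Bayesian nonparametric estimators discussed in the article are shown to agree asymptotically with smoothed versions of this quantity; the smoothing itself is not reproduced here.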
Sharma, Andy
2017-06-01
The purpose of this study was to showcase an advanced methodological approach to modelling disability and institutional entry. Both of these are important areas to investigate given the on-going aging of the United States population. By 2020, approximately 15% of the population will be 65 years and older. Many of these older adults will experience disability and require formal care. A probit analysis was employed to determine which disabilities were associated with admission into an institution (i.e. long-term care). Since this framework imposes strong distributional assumptions, misspecification leads to inconsistent estimators. To overcome such a shortcoming, this analysis extended the probit framework by employing an advanced semi-nonparametric maximum likelihood estimation utilizing Hermite polynomial expansions. Specification tests show that semi-nonparametric estimation is preferred over probit. In terms of the estimates, semi-nonparametric ratios equal 42 for cognitive difficulty, 64 for independent living, and 111 for self-care disability, while probit yields much smaller estimates of 19, 30, and 44, respectively. Public health professionals can use these results to better understand why certain interventions have not shown promise. Equally important, healthcare workers can use this research to evaluate which types of treatment plans may delay institutionalization and improve the quality of life for older adults. Implications for rehabilitation: With on-going global aging, understanding the association between disability and institutional entry is important in devising successful rehabilitation interventions. Semi-nonparametric estimation is preferred to probit and shows that ambulatory and cognitive impairments present a high risk for institutional entry (long-term care). Informal caregiving and home-based care require further examination as forms of rehabilitation/therapy for certain types of disabilities.
Tree-Structured Infinite Sparse Factor Model
Zhang, XianXing; Dunson, David B.; Carin, Lawrence
2013-01-01
A tree-structured multiplicative gamma process (TMGP) is developed, for inferring the depth of a tree-based factor-analysis model. This new model is coupled with the nested Chinese restaurant process, to nonparametrically infer the depth and width (structure) of the tree. In addition to developing the model, theoretical properties of the TMGP are addressed, and a novel MCMC sampler is developed. The structure of the inferred tree is used to learn relationships between high-dimensional data, and the model is also applied to compressive sensing and interpolation of incomplete images. PMID:25279389
Multiple robustness in factorized likelihood models.
Molina, J; Rotnitzky, A; Sued, M; Robins, J M
2017-09-01
We consider inference under a nonparametric or semiparametric model with likelihood that factorizes as the product of two or more variation-independent factors. We are interested in a finite-dimensional parameter that depends on only one of the likelihood factors and whose estimation requires the auxiliary estimation of one or several nuisance functions. We investigate general structures conducive to the construction of so-called multiply robust estimating functions, whose computation requires postulating several dimension-reducing models but which have mean zero at the true parameter value provided one of these models is correct.
NASA Astrophysics Data System (ADS)
Fernández-Llamazares, Álvaro; Belmonte, Jordina; Delgado, Rosario; De Linares, Concepción
2014-04-01
Airborne pollen records are a suitable indicator for the study of climate change. The present work focuses on the role of annual pollen indices in the detection of bioclimatic trends through the analysis of the aerobiological spectra of 11 taxa of great biogeographical relevance in Catalonia over an 18-year period (1994-2011), by means of different parametric and non-parametric statistical methods. Among others, two non-parametric rank-based statistical tests were performed for detecting monotonic trends in the time series of the selected airborne pollen types, and we observed that they have similar power in detecting trends. Except for those cases in which the pollen data can be well modelled by a normal distribution, it is better to apply non-parametric statistical methods in aerobiological studies. Our results provide a reliable representation of the pollen trends in the region and suggest that greater pollen quantities have been released into the atmosphere in recent years, especially by Mediterranean taxa such as Pinus, Total Quercus and Evergreen Quercus, although the trends may differ geographically. Longer aerobiological monitoring periods are required to corroborate these results and survey the increasing levels of certain pollen types that could have an impact on public health.
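The rank-based trend tests referred to above (Mann-Kendall being the most common choice for such annual series) can be sketched as follows; this minimal version uses the no-ties variance formula and the normal approximation, whereas a full implementation would also correct the variance for ties:

```python
from math import erf, sqrt

def mann_kendall(x):
    """Mann-Kendall test for a monotonic trend.
    Returns (S, two-sided p) using the continuity-corrected
    normal approximation and the no-ties variance."""
    n = len(x)
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18
    z = 0.0 if s == 0 else (s - (1 if s > 0 else -1)) / sqrt(var)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return s, p
```

A strictly increasing series of length n yields the maximal S = n(n-1)/2 and a very small p-value; a short, disordered series yields S near zero and a large p-value.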
Agarwal, Rahul; Chen, Zhe; Kloosterman, Fabian; Wilson, Matthew A; Sarma, Sridevi V
2016-07-01
Pyramidal neurons recorded from the rat hippocampus and entorhinal cortex, such as place and grid cells, have diverse receptive fields, which are either unimodal or multimodal. Spiking activity from these cells encodes information about the spatial position of a freely foraging rat. At fine timescales, a neuron's spike activity also depends significantly on its own spike history. However, due to limitations of current parametric modeling approaches, it remains a challenge to estimate complex, multimodal neuronal receptive fields while incorporating spike history dependence. Furthermore, efforts to decode the rat's trajectory in one- or two-dimensional space from hippocampal ensemble spiking activity have mainly focused on spike history-independent neuronal encoding models. In this letter, we address these two important issues by extending a recently introduced nonparametric neural encoding framework that allows modeling both complex spatial receptive fields and spike history dependencies. Using this extended nonparametric approach, we develop novel algorithms for decoding a rat's trajectory based on recordings of hippocampal place cells and entorhinal grid cells. Results show that both encoding and decoding models derived from our new method performed significantly better than state-of-the-art encoding and decoding models on 6 minutes of test data. In addition, our model's performance remains invariant to the apparent modality of the neuron's receptive field.
Vilella, Karina Duarte; Fraiz, Fabian Calixto; Benelli, Elaine Machado; Assunção, Luciana Reichert da Silva
This study evaluated the effect of oral health literacy (OHL) on the retention of health information in pregnant women. A total of 175 pregnant women were randomly assigned to standard oral (spoken), written and control intervention groups. With the exception of the control group, the interventions addressed eating habits and oral hygiene among children under 2 years of age. The participants' answers before the interventions (pre-test), 15 min after the interventions (post-test) and 4 weeks after the interventions (follow-up test) were used to estimate the knowledge score (KS). Information acquisition was determined by comparing pre-test and post-test results, while retention of information was based on comparing pre-test and follow-up test results. OHL was analysed by BREALD-30. The data were assessed by nonparametric tests and Poisson regression models with robust variance (α = 0.05). By the end of the follow-up period, 162 pregnant women had been assessed. The mean BREALD-30 score was 22.3 (SD = 4.80). Regardless of the type of intervention, pregnant women with low OHL had lower knowledge scores in the three assessments. Participants with low OHL showed higher acquisition and retention of information in the standard oral health intervention. Multiple regression models demonstrated that OHL was independently associated with KS, age, socioeconomic status and type of intervention. The results suggest a negative effect of low OHL on retention of information. Only the standard, spoken oral health intervention could address the differences in literacy levels.
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the left- and right-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE), which is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using a childhood cancer dataset.
Measuring firm size distribution with semi-nonparametric densities
NASA Astrophysics Data System (ADS)
Cortés, Lina M.; Mora-Valencia, Andrés; Perote, Javier
2017-11-01
In this article, we propose a new methodology based on a (log) semi-nonparametric (log-SNP) distribution that nests the lognormal and enables better fits in the upper tail of the distribution through the introduction of new parameters. We test the performance of the lognormal and log-SNP distributions in capturing firm size, measured through a sample of US firms in 2004-2015. Taking different levels of aggregation by type of economic activity, our study shows that the log-SNP provides a better fit of the firm size distribution. We also formally introduce the multivariate log-SNP distribution, which encompasses the multivariate lognormal, to analyze the estimation of the joint distribution of the value of the firm's assets and sales. The results suggest that sales are a better firm size measure, as indicated by other studies in the literature.
Bayesian isotonic density regression
Wang, Lianming; Dunson, David B.
2011-01-01
Density regression models allow the conditional distribution of the response given predictors to change flexibly over the predictor space. Such models are much more flexible than nonparametric mean regression models with nonparametric residual distributions, and are well supported in many applications. A rich variety of Bayesian methods have been proposed for density regression, but it is not clear whether such priors have full support so that any true data-generating model can be accurately approximated. This article develops a new class of density regression models that incorporate stochastic-ordering constraints which are natural when a response tends to increase or decrease monotonely with a predictor. Theory is developed showing large support. Methods are developed for hypothesis testing, with posterior computation relying on a simple Gibbs sampler. Frequentist properties are illustrated in a simulation study, and an epidemiology application is considered. PMID:22822259
Machine Learning Based Evaluation of Reading and Writing Difficulties.
Iwabuchi, Mamoru; Hirabayashi, Rumi; Nakamura, Kenryu; Dim, Nem Khan
2017-01-01
The possibility of automatic evaluation of reading and writing difficulties was investigated by applying a non-parametric machine learning (ML) regression technique to URAWSS (Understanding Reading and Writing Skills of Schoolchildren) [1] test data from 168 children in grades 1-9. The results showed that the ML approach predicted better than the ordinary rule-based decision.
The Effects of Scenario Planning on Participant Decision-Making Style
ERIC Educational Resources Information Center
Chermack, Thomas J.; Nimon, Kim
2008-01-01
This research examines changes in decision-making styles as a result of participation in scenario planning. A quasi-experimental pretest-posttest design and several nonparametric tests were used to analyze data gathered from research participants in a technology firm in the Northeastern United States. Results show that participants tend to…
Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science
ERIC Educational Resources Information Center
Ju, Boryung; Jin, Tao
2013-01-01
Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…
Learning Patterns as Criterion for Forming Work Groups in 3D Simulation Learning Environments
ERIC Educational Resources Information Center
Maria Cela-Ranilla, Jose; Molías, Luis Marqués; Cervera, Mercè Gisbert
2016-01-01
This study analyzes the relationship between the use of learning patterns as a grouping criterion to develop learning activities in the 3D simulation environment at University. Participants included 72 Spanish students from the Education and Marketing disciplines. Descriptive statistics and non-parametric tests were conducted. The process was…
Does race matter in landowners' participation in conservation incentive programs?
Jianbang Gan; Okwuldili O. Onianwa; John Schelhas; Gerald C. Wheelock; Mark R. Dubois
2005-01-01
This study investigated and compared the participation behavior of white and minority small landowners in Alabama in eight conservation incentive programs. Using nonparametric tests and logit modeling, we found both similarities and differences in participation behavior between these two landowner groups. Both white and minority landowners tended not to participate in...
A Nonparametric Approach to Estimate Classification Accuracy and Consistency
ERIC Educational Resources Information Center
Lathrop, Quinn N.; Cheng, Ying
2014-01-01
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA…
ERIC Educational Resources Information Center
Lee, Young-Sun; Wollack, James A.; Douglas, Jeffrey
2009-01-01
The purpose of this study was to assess the model fit of a 2PL through comparison with nonparametric item characteristic curve (ICC) estimation procedures. Results indicate that the three nonparametric procedures implemented produced ICCs similar to those of the 2PL for items simulated to fit the 2PL. However, for misfitting items,…
Detection of coupling delay: A problem not yet solved
NASA Astrophysics Data System (ADS)
Coufal, David; Jakubík, Jozef; Jajcay, Nikola; Hlinka, Jaroslav; Krakovská, Anna; Paluš, Milan
2017-08-01
Nonparametric detection of coupling delay in unidirectionally and bidirectionally coupled nonlinear dynamical systems is examined. Both continuous- and discrete-time systems are considered. Two methods of detection are assessed: the method based on conditional mutual information, the CMI method (also known as the transfer entropy method), and the method of convergent cross mapping, the CCM method. Computer simulations show that neither method is generally reliable in the detection of coupling delays. For continuous-time chaotic systems, the CMI method appears to be more sensitive and applicable in a broader range of coupling parameters than the CCM method. For the tested discrete-time dynamical systems, the CCM method was found to be more sensitive, while the CMI method required much stronger coupling to yield correct results. However, when the studied systems contain a strong oscillatory component in their dynamics, the results of both methods become ambiguous. The present study suggests that results of the tested algorithms should be interpreted with utmost care and that the nonparametric detection of coupling delay, in general, is a problem not yet solved.
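The CMI and CCM algorithms themselves are involved; as a loudly simplified illustration of the shared idea (scan candidate delays with an information measure and pick the maximum), the sketch below uses plain binned lagged mutual information on a toy unidirectionally coupled pair. All signals and parameters are hypothetical, and this proxy omits the conditioning that distinguishes CMI/transfer entropy from plain mutual information.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Plug-in mutual information estimate (in nats) from a 2-D histogram."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy /= pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    mask = pxy > 0
    return float((pxy[mask] * np.log(pxy[mask] / (px @ py)[mask])).sum())

def detect_delay(x, y, max_lag=10):
    """Return the lag tau (1..max_lag) maximizing MI(x_t ; y_{t+tau})."""
    scores = [mutual_information(x[:-tau], y[tau:])
              for tau in range(1, max_lag + 1)]
    return 1 + int(np.argmax(scores)), scores

# Toy system: y is driven by x with a delay of 5 samples.
rng = np.random.default_rng(0)
n, true_delay = 4000, 5
x = rng.normal(size=n)
y = np.zeros(n)
y[true_delay:] = 0.95 * x[:-true_delay] + 0.1 * rng.normal(size=n - true_delay)

best_tau, scores = detect_delay(x, y)
```

With strong coupling and weak noise the lagged-MI profile peaks sharply at the true delay; the abstract's point is precisely that real systems (weak coupling, oscillatory components) are far less forgiving.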
Ye, Xin; Pendyala, Ram M.; Zou, Yajie
2017-01-01
A semi-nonparametric generalized multinomial logit model, formulated using orthonormal Legendre polynomials to extend the standard Gumbel distribution, is presented in this paper. The resulting semi-nonparametric function can represent a probability density function for a large family of multimodal distributions. The model has a closed-form log-likelihood function that facilitates model estimation. The proposed method is applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using travel behavior data from Aargau, Switzerland. Comparisons between the multinomial logit model and the proposed semi-nonparametric model show that violations of the standard Gumbel distribution assumption lead to considerable inconsistency in parameter estimates and model inferences. PMID:29073152
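The construction described above can be sketched numerically. The snippet below is an illustrative guess at the general recipe, not the authors' exact parameterization: square an orthonormal Legendre expansion evaluated at the probability-integral transform of a standard Gumbel, then normalize, so the extended density stays nonnegative and integrates to one. The coefficients `theta` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

def gumbel_pdf(x):
    """Standard Gumbel (EV type I) density."""
    return np.exp(-(x + np.exp(-x)))

def gumbel_cdf(x):
    return np.exp(-np.exp(-x))

def snp_density(x, theta):
    """Semi-nonparametric extension of the standard Gumbel density:
    f(x) = 2 g(x) (sum_k theta_k Lbar_k(2 G(x) - 1))^2 / sum_k theta_k^2,
    where Lbar_k are Legendre polynomials orthonormal on [-1, 1]."""
    u = 2.0 * gumbel_cdf(x) - 1.0
    # Orthonormalize: Lbar_k = sqrt((2k + 1) / 2) * P_k.
    coef = np.asarray(theta) * np.sqrt((2 * np.arange(len(theta)) + 1) / 2.0)
    poly = legendre.legval(u, coef)
    return 2.0 * gumbel_pdf(x) * poly**2 / np.sum(np.asarray(theta) ** 2)

theta = [1.0, 0.4, -0.25]              # hypothetical shape parameters
xs = np.linspace(-12.0, 25.0, 200001)
f = snp_density(xs, theta)
# Trapezoid rule: by orthonormality the density integrates to one.
total = float(((f[1:] + f[:-1]) * np.diff(xs)).sum() / 2.0)
```

Setting all but the first coefficient to zero recovers the plain Gumbel; nonzero higher-order coefficients bend the density toward multimodal shapes, which is what relaxes the standard Gumbel assumption in the logit model.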
Mathematical models for nonparametric inferences from line transect data
Burnham, K.P.; Anderson, D.R.
1976-01-01
A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right-angle or sighting distances. The probability of observing a point given its right-angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown that there are nonparametric approaches to density estimation using the observed right-angle distances. The model is then generalized to include sighting distances (r). Let f(y|r) be the conditional distribution of right-angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires knowledge of the transformation of r given by f(0|r).
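The general theory is not reproducible in a few lines, but the basic estimation target is: with g(0) = 1, density per unit area is D = n f(0) / (2L), so the problem reduces to estimating f(0) from the right-angle distances. The sketch below uses a generic reflection-kernel estimate with a rule-of-thumb bandwidth; the half-normal detection function, line length, and bandwidth rule are hypothetical illustrative choices, not the paper's estimators.

```python
import numpy as np

def f0_reflection_kernel(dists, h=None):
    """Estimate f(0) for perpendicular distances y >= 0 with a Gaussian
    kernel reflected at the transect line (y = 0)."""
    y = np.asarray(dists, dtype=float)
    n = y.size
    if h is None:                          # simple rule-of-thumb bandwidth
        h = 1.06 * y.std(ddof=1) * n ** (-1 / 5)
    # Reflection folds the kernel mass below zero back onto [0, inf).
    return 2.0 * np.exp(-0.5 * (y / h) ** 2).sum() / (n * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(1)
sigma, n, L = 10.0, 5000, 100.0            # half-normal detection, line length L
y = np.abs(rng.normal(0.0, sigma, size=n))

f0_hat = f0_reflection_kernel(y)
f0_true = np.sqrt(2.0 / np.pi) / sigma     # half-normal density at zero
density_hat = n * f0_hat / (2.0 * L)       # objects per unit area (sketch)
```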
Kharroubi, Samer A; Brazier, John E; McGhee, Sarah
2014-06-01
There is interest in the extent to which valuations of health may differ between countries and cultures, but few studies have compared preference values of health states obtained in different countries. The present study applies a nonparametric model to estimate and compare Hong Kong (HK) and UK standard gamble values for the six-dimensional health state short form (SF-6D, derived from the short-form 36 health survey) using Bayesian methods. The data set comprises the HK and UK SF-6D valuation studies, in which samples of 197 and 249 states defined by the SF-6D were valued by representative samples of the HK and UK general populations, respectively, both using the standard gamble technique. We estimated a function applicable across both countries that explicitly accounts for the differences between them and is estimated using the data from both countries. The results suggest that differences in SF-6D health state valuations between the UK and HK general populations are potentially important. In particular, the valuations of Hong Kong were meaningfully higher than those of the United Kingdom for most of the selected SF-6D health states. The magnitude of these country-specific differences in health state valuation depended, however, in a complex way on the levels of individual dimensions. The new Bayesian nonparametric method is a powerful approach for analyzing data from multiple nationalities or ethnic groups to understand the differences between them and potentially to estimate the underlying utility functions more efficiently. Copyright © 2014 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Yu, Wenbao; Park, Taesung
2014-01-01
It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach for high-dimensional data. We propose an AUC-based approach using penalized regression (AucPR), a parametric method for obtaining a linear combination that maximizes the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, which is used in a low-dimensional context, into a regression framework and thus apply the penalized regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach can avoid some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and the need for a prudent choice of the smoothing parameter. We apply the proposed AucPR for gene selection and classification using four real microarray datasets and synthetic data. Through numerical studies, AucPR is shown to perform better than penalized logistic regression and the nonparametric AUC-based method, in the sense of AUC and sensitivity for a given specificity, particularly when there are many correlated genes. We propose a powerful parametric and easily implementable linear classifier, AucPR, for gene selection and disease prediction with high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.
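The objective AucPR maximizes is the empirical Mann-Whitney AUC of a linear score. A minimal pure-Python sketch of that objective follows (the penalized fitting itself is omitted; the two-marker data and coefficients are hypothetical):

```python
def empirical_auc(score_cases, score_controls):
    """P(score_case > score_control), with ties counted 1/2 (Mann-Whitney AUC)."""
    wins = 0.0
    for s in score_cases:
        for t in score_controls:
            wins += 1.0 if s > t else 0.5 if s == t else 0.0
    return wins / (len(score_cases) * len(score_controls))

def linear_score(X, beta):
    """Score each sample (a list of feature lists) with coefficients beta."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]

# Hypothetical two-marker data: evaluate the AUC of a combined linear score.
cases = [[2.0, 1.0], [3.0, 2.5], [1.5, 3.0]]
controls = [[1.0, 0.5], [2.0, 1.0], [0.5, 2.0]]
beta = [1.0, 1.0]
auc = empirical_auc(linear_score(cases, beta), linear_score(controls, beta))
```

A maximizer searches over `beta` (with a lasso or elastic-net penalty, per the abstract) to push this quantity up; the step function inside `empirical_auc` is exactly why smooth approximations are usually substituted.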
Maity, Arnab; Carroll, Raymond J; Mammen, Enno; Chatterjee, Nilanjan
2009-01-01
Motivated by the problem of testing for genetic effects on complex traits in the presence of gene-environment interaction, we develop score tests in general semiparametric regression problems that involve a Tukey-style 1-degree-of-freedom form of interaction between parametrically and non-parametrically modelled covariates. We find that the score test in this type of model, as recently developed by Chatterjee and co-workers in the fully parametric setting, is biased and requires undersmoothing to be valid in the presence of non-parametric components. Moreover, in the presence of repeated outcomes, the asymptotic distribution of the score test depends on the estimation of functions which are defined as solutions of integral equations, making implementation difficult and computationally taxing. We develop profiled score statistics which are unbiased and asymptotically efficient and can be performed by using standard bandwidth selection methods. In addition, to overcome the difficulty of solving functional equations, we give easy interpretations of the target functions, which in turn allow us to develop estimation procedures that can be easily implemented by using standard computational methods. We present simulation studies to evaluate type I error and power of the method proposed compared with a naive test that does not consider interaction. Finally, we illustrate our methodology by analysing data from a case-control study of colorectal adenoma that was designed to investigate the association between colorectal adenoma and the candidate gene NAT2 in relation to smoking history.
A nonparametric test for Markovianity in the illness-death model.
Rodríguez-Girondo, Mar; de Uña-Álvarez, Jacobo
2012-12-30
Multistate models are useful tools for modeling disease progression when survival is the main outcome, but several intermediate events of interest are observed during the follow-up time. The illness-death model is a special multistate model with important applications in the biomedical literature. It provides a suitable representation of the individual's history when a unique intermediate event can be experienced before the main event of interest. Nonparametric estimation of transition probabilities in this and other multistate models is usually performed through the Aalen-Johansen estimator under a Markov assumption. The Markov assumption claims that given the present state, the future evolution of the illness is independent of the states previously visited and the transition times among them. However, this assumption fails in some applications, leading to inconsistent estimates. In this paper, we provide a new approach for testing Markovianity in the illness-death model. The new method is based on measuring the future-past association along time. This results in a detailed inspection of the process, which often reveals a non-Markovian behavior with different trends in the association measure. A test of significance for zero future-past association at each time point is introduced, and a significance trace is proposed accordingly. In addition, we propose a global test for Markovianity based on a supremum-type test statistic. The finite sample performance of the test is investigated through simulations. We illustrate the new method through the analysis of two biomedical data sets. Copyright © 2012 John Wiley & Sons, Ltd.
Lal, Devyani; Keim, Paul; Delisle, Josie; Barker, Bridget; Rank, Matthew A; Chia, Nicholas; Schupp, James M; Gillece, John D; Cope, Emily K
2017-06-01
The role of microbiota in sinonasal inflammation can be further understood by targeted sampling of healthy and diseased subjects. We compared the microbiota of the middle meatus (MM) and inferior meatus (IM) in healthy, allergic rhinitis (AR), and chronic rhinosinusitis (CRS) subjects to characterize intrasubject, intersubject, and intergroup differences. Subjects were recruited in the office, and characterized into healthy, AR, and CRS groups. Endoscopically-guided swab samples were obtained from the MM and IM bilaterally. Bacterial microbiota were characterized by sequencing the V3-V4 region of the 16S ribosomal RNA (rRNA) gene. Intersubject microbiome analyses were conducted in 65 subjects: 8 healthy, 11 AR, and 46 CRS (25 CRS with nasal polyps [CRSwNP]; 21 CRS without nasal polyps [CRSsNP]). Intrasubject analyses were conducted for 48 individuals (4 controls, 11 AR, 8 CRSwNP, and 15 CRSsNP). There was considerable intersubject microbiota variability, but intrasubject profiles were similar (p = 0.001, nonparametric t test). Intrasubject bacterial diversity was significantly reduced in MM of CRSsNP subjects compared to IM samples (p = 0.022, nonparametric t test). CRSsNP MM samples were enriched in Streptococcus, Haemophilus, and Fusobacterium spp. but exhibited loss of diversity compared to healthy, CRSwNP, and AR subject samples (p < 0.05; nonparametric t test). CRSwNP patients were enriched in Staphylococcus, Alloiococcus, and Corynebacterium spp. This study presents the sinonasal microbiome profile in one of the larger populations of non-CRS and CRS subjects, and is the first office-based cohort in the literature. In contrast to healthy, AR, and CRSwNP subjects, CRSsNP MM samples exhibited decreased microbiome diversity and anaerobic enrichment. CRSsNP MM samples had reduced diversity compared to same-subject IM samples, a novel finding. © 2017 ARS-AAOA, LLC.
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Validation of two (parametric vs non-parametric) daily weather generators
NASA Astrophysics Data System (ADS)
Dubrovsky, M.; Skalak, P.
2015-12-01
As climate models (GCMs and RCMs) fail to satisfactorily reproduce the real-world surface weather regime, various statistical methods are applied to downscale GCM/RCM outputs into site-specific weather series. Stochastic weather generators are among the most popular downscaling methods, capable of producing realistic (observed-like) meteorological inputs for agrological, hydrological and other impact models used in assessing the sensitivity of various ecosystems to climate change/variability. Among their advantages, the generators may (i) produce arbitrarily long multi-variate synthetic weather series representing both present and changed climates (in the latter case, the generators are commonly modified by GCM/RCM-based climate change scenarios), (ii) be run at various time steps and for multiple weather variables (the generators reproduce the correlations among variables), (iii) be interpolated (and run also for sites where no weather data are available to calibrate the generator). This contribution will compare two stochastic daily weather generators in terms of their ability to reproduce various features of daily weather series. M&Rfi is a parametric generator: a Markov chain model is used to model precipitation occurrence, precipitation amount is modelled by the Gamma distribution, and a 1st-order autoregressive model is used to generate non-precipitation surface weather variables. The non-parametric GoMeZ generator is based on the nearest-neighbours resampling technique, making no assumption on the distribution of the variables being generated. Various settings of both weather generators will be assumed in the present validation tests. The generators will be validated in terms of (a) extreme temperature and precipitation characteristics (annual and 30-year extremes and maxima of duration of hot/cold/dry/wet spells); (b) selected validation statistics developed within the frame of the VALUE project.
The tests will be based on observational weather series from several European stations available from the ECA&D database. Acknowledgements: The weather generator is developed and validated within the frame of projects WG4VALUE (sponsored by the Ministry of Education, Youth and Sports of CR), and VALUE (COST ES 1102 action).
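A minimal sketch of the parametric (M&Rfi-style) components named above: a first-order Markov chain for precipitation occurrence, Gamma-distributed wet-day amounts, and an AR(1) process for a non-precipitation variable (here temperature). All parameter values are hypothetical, not calibrated to any station.

```python
import numpy as np

def generate_daily_weather(n_days, p_wd=0.3, p_ww=0.6, shape=0.8, scale=5.0,
                           t_mean=10.0, phi=0.7, t_sd=3.0, seed=0):
    """Parametric daily weather generator sketch: Markov-chain occurrence,
    Gamma wet-day amounts, AR(1) temperature. Parameters are hypothetical."""
    rng = np.random.default_rng(seed)
    wet = np.zeros(n_days, dtype=bool)
    precip = np.zeros(n_days)
    temp = np.zeros(n_days)
    temp[0] = t_mean
    for t in range(1, n_days):
        p = p_ww if wet[t - 1] else p_wd        # P(wet today | yesterday)
        wet[t] = rng.random() < p
        if wet[t]:
            precip[t] = rng.gamma(shape, scale)
        # AR(1) with innovations scaled so the stationary s.d. equals t_sd.
        temp[t] = t_mean + phi * (temp[t - 1] - t_mean) \
            + t_sd * np.sqrt(1 - phi**2) * rng.normal()
    return wet, precip, temp

wet, precip, temp = generate_daily_weather(20000)
wet_frac = wet.mean()
stationary = 0.3 / (1 + 0.3 - 0.6)   # p_wd / (1 + p_wd - p_ww), approx. 0.43
```

The long-run wet-day fraction converges to the Markov chain's stationary probability, which is one of the occurrence statistics such validation exercises check against observations.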
Chen, Jyun-Hong; Ou, Huang-Tz; Lin, Tzu-Chieh; Lai, Edward Chia-Cheng; Kao, Yea-Huei Yang
2016-02-01
Care of the elderly with diabetes is more complicated than that for other age groups. The elderly and/or those with multiple comorbidities are often excluded from randomized controlled trials of treatments for diabetes. The heterogeneity of health status of the elderly also increases the difficulty of diabetes care; therefore, diabetes care for the elderly should be individualized. Motivated patients educated about diabetes benefit the most from collaborating with a multidisciplinary patient-care team. A pharmacist is an important team member, serving as an educator, coach, healthcare manager, and pharmaceutical care provider. To evaluate the effects of pharmaceutical care on glycemic control of ambulatory elderly patients with type 2 diabetes. A 421-bed district hospital in Nantou City, Taiwan. We conducted a randomized controlled clinical trial involving 100 patients with type 2 diabetes with poor glycemic control (HbA1c levels of ≥9.0 %) aged ≥65 years over 6 months. Participants were randomly assigned to a standard-care (control, n = 50) or pharmaceutical-care (intervention, n = 50) group. Pharmaceutical care was provided by a certified diabetes-educator pharmacist who identified and resolved drug-related problems and established a procedure for consultations pertaining to medication. The Mann–Whitney test was used to evaluate nonparametric quantitative data. Statistical significance was defined as P < 0.05. The main outcome measure was the change in mean HbA1c level from baseline within 6 months after recruitment. The Mann–Whitney test showed that the mean HbA1c level decreased significantly (by 0.83 %) after 6 months for the intervention group, compared with an increase of 0.43 % for the control group (P ≤ 0.001). Medical expenses between groups did not significantly differ (−624.06 vs. −418.7, P = 0.767). There was no significant difference in hospitalization rates between groups.
The pharmacist intervention program provided pharmaceutical services that improved long-term, safe control of blood sugar levels for ambulatory elderly patients with diabetes and did not increase medical expenses.
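A bare-bones sketch of the rank machinery behind the Mann-Whitney test used in the trial, with the large-sample normal approximation and no tie correction; the HbA1c-change numbers are hypothetical, not the study's data.

```python
import math

def mann_whitney_u(a, b):
    """U = number of (a_i, b_j) pairs with a_i > b_j; ties counted 1/2."""
    u = 0.0
    for x in a:
        for y in b:
            u += 1.0 if x > y else 0.5 if x == y else 0.0
    return u

def mann_whitney_z(a, b):
    """Large-sample z statistic for U (no tie correction)."""
    m, n = len(a), len(b)
    u = mann_whitney_u(a, b)
    mu = m * n / 2.0
    sigma = math.sqrt(m * n * (m + n + 1) / 12.0)
    return (u - mu) / sigma

# Hypothetical HbA1c changes (%): intervention vs. control group.
intervention = [-1.2, -0.9, -0.8, -0.6, -0.5]
control = [0.1, 0.2, 0.4, 0.5, 0.7]
z = mann_whitney_z(intervention, control)
```

With every intervention value below every control value, U is 0 and z falls well beyond the 5% critical value even at this tiny sample size.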
Nonparametric analysis of Minnesota spruce and aspen tree data and LANDSAT data
NASA Technical Reports Server (NTRS)
Scott, D. W.; Jee, R.
1984-01-01
The application of nonparametric methods in data-intensive problems faced by NASA is described. The theoretical development of efficient multivariate density estimators and the novel use of color graphics workstations are reviewed. The use of nonparametric density estimates for data representation and for Bayesian classification are described and illustrated. Progress in building a data analysis system in a workstation environment is reviewed and preliminary runs presented.
ERIC Educational Resources Information Center
Sueiro, Manuel J.; Abad, Francisco J.
2011-01-01
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Pattin, Kristine A.; White, Bill C.; Barney, Nate; Gui, Jiang; Nelson, Heather H.; Kelsey, Karl R.; Andrew, Angeline S.; Karagas, Margaret R.; Moore, Jason H.
2008-01-01
Multifactor dimensionality reduction (MDR) was developed as a nonparametric and model-free data mining method for detecting, characterizing, and interpreting epistasis in the absence of significant main effects in genetic and epidemiologic studies of complex traits such as disease susceptibility. The goal of MDR is to change the representation of the data using a constructive induction algorithm to make nonadditive interactions easier to detect using any classification method such as naïve Bayes or logistic regression. Traditionally, MDR constructed variables have been evaluated with a naïve Bayes classifier that is combined with 10-fold cross validation to obtain an estimate of predictive accuracy or generalizability of epistasis models. Traditionally, we have used permutation testing to statistically evaluate the significance of models obtained through MDR. The advantage of permutation testing is that it controls for false positives due to multiple testing. The disadvantage is that permutation testing is computationally expensive. This is an important issue that arises in the context of detecting epistasis on a genome-wide scale. The goal of the present study was to develop and evaluate several alternatives to large-scale permutation testing for assessing the statistical significance of MDR models. Using data simulated from 70 different epistasis models, we compared the power and type I error rate of MDR using a 1000-fold permutation test with hypothesis testing using an extreme value distribution (EVD). We find that this new hypothesis testing method provides a reasonable alternative to the computationally expensive 1000-fold permutation test and is 50 times faster. We then demonstrate this new method by applying it to a genetic epidemiology study of bladder cancer susceptibility that was previously analyzed using MDR and assessed using a 1000-fold permutation test. PMID:18671250
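A minimal sketch of the permutation-testing step whose cost motivates the EVD alternative: permute the labels, recompute a test statistic (here a simple mean difference, not the MDR/naïve Bayes accuracy), and take the exceedance fraction as the p-value. Data are hypothetical.

```python
import random

def mean_diff(values, labels):
    """Absolute difference of group means for binary labels."""
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))

def permutation_pvalue(values, labels, n_perm=1000, seed=42):
    """Permutation p-value: fraction of shuffled label assignments whose
    statistic is at least as extreme as the observed one."""
    rng = random.Random(seed)
    observed = mean_diff(values, labels)
    hits = 0
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        if mean_diff(values, perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)   # add-one correction avoids p = 0

# Hypothetical case/control trait values with a strong group effect.
values = [0.1, 0.2, 0.3, 0.2, 0.1, 1.1, 1.2, 1.0, 1.3, 1.2]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p = permutation_pvalue(values, labels)
```

Each of the `n_perm` iterations repeats the full model evaluation, which is exactly why a 1000-fold permutation test per candidate interaction becomes prohibitive genome-wide.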
NASA Astrophysics Data System (ADS)
Henebry, G. M.; Valle De Carvalho E Oliveira, P.; Zheng, B.; de Beurs, K.; Owsley, B.
2015-12-01
In our current era of intensive earth observation the time is ripe to shift away from studies relying on single sensors or single products to the synergistic use of multiple sensors and products at complementary spatial, temporal, and spectral scales. The use of multiple time series can not only reveal hotspots of change in land surface dynamics, but can indicate plausible proximate causes of the changes and suggest their possible consequences. Here we explore recent trends in the land surface dynamics of exemplary semi-arid grasslands in the western hemisphere, including the shortgrass prairie of eastern Colorado and New Mexico, the sandhills prairie of Nebraska, the "savana gramineo-lenhosa" variety of cerrado in central Brazil, and the pampas of Argentina. Observational datasets include (1) NBAR-based vegetation indices, land surface temperature, and evapotranspiration from MODIS, (2) air temperature, water vapor, and vegetation optical depth from AMSR-E and AMSR2, (3) surface air temperature, water vapor, and relative humidity from AIRS, and (4) surface shortwave, longwave, and total net flux from CERES. The spatial resolutions of these nested data include 500 m, 1000 m, 0.05 degree, 25 km, and 1 degree. We apply the nonparametric Seasonal Kendall trend test to each time series independently to identify areas of significant change. We then examine polygons of co-occurrence of significant change in two or more types of products using the surface radiation and energy budgets as guides to interpret the multiple changes. Changes occurring across broad areas are more likely to be of climatic origin; whereas, changes that are abrupt in space and time and of limited area are more likely anthropogenic. Results illustrate the utility of considering multiple remote sensing products as complementary views of land surface dynamics.
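The per-pixel trend step described above can be sketched directly: compute the Mann-Kendall S statistic within each season, sum S and its variance across seasons, and form the usual Z with a continuity correction (tie and serial-correlation corrections omitted). The toy "greening pixel" series below is hypothetical.

```python
import math

def mann_kendall_s(x):
    """Mann-Kendall S: number of increasing minus decreasing pairs."""
    s = 0
    for i in range(len(x) - 1):
        for j in range(i + 1, len(x)):
            s += (x[j] > x[i]) - (x[j] < x[i])
    return s

def seasonal_kendall(series_by_season):
    """Sum S and its variance over seasons; return (S, Z) using the
    continuity-corrected normal approximation, no tie correction."""
    s_total, var_total = 0, 0.0
    for x in series_by_season:
        n = len(x)
        s_total += mann_kendall_s(x)
        var_total += n * (n - 1) * (2 * n + 5) / 18.0
    if s_total > 0:
        z = (s_total - 1) / math.sqrt(var_total)
    elif s_total < 0:
        z = (s_total + 1) / math.sqrt(var_total)
    else:
        z = 0.0
    return s_total, z

# Hypothetical 10-year record, 4 seasons, monotonically increasing index.
seasons = [[0.30 + 0.01 * year + 0.02 * s for year in range(10)]
           for s in range(4)]
s_stat, z = seasonal_kendall(seasons)
```

Because the test only uses the signs of pairwise differences within each season, it tolerates seasonality and non-Gaussian noise, which is why it suits heterogeneous satellite time series.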
One-shot estimate of MRMC variance: AUC.
Gallas, Brandon D
2006-03-01
One popular study design for estimating the area under the receiver operating characteristic curve (AUC) is the one in which a set of readers reads a set of cases: a fully crossed design in which every reader reads every case. The variability of the subsequent reader-averaged AUC has two sources: the multiple readers and the multiple cases (MRMC). In this article, we present a nonparametric estimate for the variance of the reader-averaged AUC that is unbiased and does not use resampling tools. The one-shot estimate is based on the MRMC variance derived by the mechanistic approach of Barrett et al. (2005), as well as the nonparametric variance of a single-reader AUC derived in the literature on U statistics. We investigate the bias and variance properties of the one-shot estimate through a set of Monte Carlo simulations with simulated model observers and images. The different simulation configurations vary numbers of readers and cases, amounts of image noise and internal noise, as well as how the readers are constructed. We compare the one-shot estimate to a method that uses the jackknife resampling technique with an analysis of variance model at its foundation (Dorfman et al. 1992). The name one-shot highlights that resampling is not used. The one-shot and jackknife estimators behave similarly, with the one-shot being marginally more efficient when the number of cases is small. We have derived a one-shot estimate of the MRMC variance of AUC that is based on a probabilistic foundation with limited assumptions, is unbiased, and compares favorably to an established estimate.
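The single-reader building block of the one-shot estimate is the U-statistic variance of AUC (the DeLong-style structural components); the full MRMC estimator adds reader and reader-case variance terms not shown here. A minimal sketch with toy scores:

```python
def auc_and_variance(cases, controls):
    """Empirical AUC and its nonparametric U-statistic (DeLong-style)
    variance estimate for a single reader."""
    m, n = len(cases), len(controls)

    def psi(x, y):  # success kernel: 1 if case outscores control, 1/2 on ties
        return 1.0 if x > y else 0.5 if x == y else 0.0

    v10 = [sum(psi(x, y) for y in controls) / n for x in cases]
    v01 = [sum(psi(x, y) for x in cases) / m for y in controls]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)   # case component
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)   # control component
    return auc, s10 / m + s01 / n

auc, var = auc_and_variance([2.0, 3.0], [1.0, 2.0])
```

No resampling is involved, which is the spirit of the one-shot approach: the variance falls out of closed-form moments of the success kernel.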
A Machine Learning Framework for Plan Payment Risk Adjustment.
Rose, Sherri
2016-12-01
To introduce cross-validation and a nonparametric machine learning framework for plan payment risk adjustment and then assess whether they have the potential to improve risk adjustment. 2011-2012 Truven MarketScan database. We compare the performance of multiple statistical approaches within a broad machine learning framework for estimation of risk adjustment formulas. Total annual expenditure was predicted using age, sex, geography, inpatient diagnoses, and hierarchical condition category variables. The methods included regression, penalized regression, decision trees, neural networks, and an ensemble super learner, all in concert with screening algorithms that reduce the set of variables considered. The performance of these methods was compared based on cross-validated R². Our results indicate that a simplified risk adjustment formula selected via this nonparametric framework maintains much of the efficiency of a traditional larger formula. The ensemble approach also outperformed classical regression and all other algorithms studied. The implementation of cross-validated machine learning techniques provides novel insight into risk adjustment estimation, possibly allowing for a simplified formula, thereby reducing incentives for increased coding intensity as well as the ability of insurers to "game" the system with aggressive diagnostic upcoding. © Health Research and Educational Trust.
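The cross-validation step at the core of this framework can be sketched by comparing a mean-only formula against a one-predictor OLS formula on cross-validated R²; the super learner then weights many such candidate learners, which is omitted here. Data are simulated and hypothetical.

```python
import random

def cv_r2(x, y, fit, predict, k=5, seed=0):
    """k-fold cross-validated R^2 of a model given fit/predict callables."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    sse = 0.0
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = fit([x[i] for i in train], [y[i] for i in train])
        sse += sum((y[i] - predict(model, x[i])) ** 2 for i in fold)
    ybar = sum(y) / len(y)
    sst = sum((v - ybar) ** 2 for v in y)
    return 1.0 - sse / sst

def fit_mean(xs, ys):                # candidate 1: predict the training mean
    return sum(ys) / len(ys)

def predict_mean(model, x):
    return model

def fit_ols(xs, ys):                 # candidate 2: simple linear regression
    xb, yb = sum(xs) / len(xs), sum(ys) / len(ys)
    b = sum((a - xb) * (c - yb) for a, c in zip(xs, ys)) \
        / sum((a - xb) ** 2 for a in xs)
    return b, yb - b * xb

def predict_ols(model, x):
    b, a = model
    return a + b * x

rng = random.Random(1)
x = [rng.uniform(0, 10) for _ in range(200)]
y = [2.0 * v + rng.gauss(0, 1) for v in x]
r2_mean = cv_r2(x, y, fit_mean, predict_mean)
r2_ols = cv_r2(x, y, fit_ols, predict_ols)
```

Ranking candidate formulas by held-out R² rather than in-sample fit is what lets a simplified formula be defended against a traditional larger one.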
Robust Machine Learning Variable Importance Analyses of Medical Conditions for Health Care Spending.
Rose, Sherri
2018-03-11
To propose nonparametric double robust machine learning in variable importance analyses of medical conditions for health spending. 2011-2012 Truven MarketScan database. I evaluate how much more, on average, commercially insured enrollees with each of 26 of the most prevalent medical conditions cost per year after controlling for demographics and other medical conditions. This is accomplished within the nonparametric targeted learning framework, which incorporates ensemble machine learning. Previous literature studying the impact of medical conditions on health care spending has almost exclusively focused on parametric risk adjustment; thus, I compare my approach to parametric regression. My results demonstrate that multiple sclerosis, congestive heart failure, severe cancers, major depression and bipolar disorders, and chronic hepatitis are the most costly medical conditions on average per individual. These findings differed from those obtained using parametric regression. The literature may be underestimating the spending contributions of several medical conditions, which is a potentially critical oversight. If current methods are not capturing the true incremental effect of medical conditions, undesirable incentives related to care may remain. Further work is needed to directly study these issues in the context of federal formulas. © Health Research and Educational Trust.
Zhang, Hanze; Huang, Yangxin; Wang, Wei; Chen, Henian; Langland-Orban, Barbara
2017-01-01
In longitudinal AIDS studies, it is of interest to investigate the relationship between HIV viral load and CD4 cell counts, as well as the complicated time effect. Most common models for analyzing such complex longitudinal data are based on mean regression, which fails to provide efficient estimates due to outliers and/or heavy tails. Quantile regression-based partially linear mixed-effects models, a special case of semiparametric models enjoying the benefits of both parametric and nonparametric models, have the flexibility to monitor the viral dynamics nonparametrically and detect the varying CD4 effects parametrically at different quantiles of viral load. Meanwhile, it is critical to consider various data features of repeated measurements, including left-censoring due to a limit of detection, covariate measurement error, and asymmetric distribution. In this research, we first establish Bayesian joint models that account for all these data features simultaneously in the framework of quantile regression-based partially linear mixed-effects models. The proposed models are applied to analyze the Multicenter AIDS Cohort Study (MACS) data. Simulation studies are also conducted to assess the performance of the proposed methods under different scenarios.
Divine, George; Norton, H James; Hunt, Ronald; Dienemann, Jacqueline
2013-09-01
When a study uses an ordinal outcome measure with unknown differences in the anchors and a small range such as 4 or 7, use of the Wilcoxon rank sum test or the Wilcoxon signed rank test may be most appropriate. However, because nonparametric methods are at best indirect functions of standard measures of location such as means or medians, the choice of the most appropriate summary measure can be difficult. The issues underlying use of these tests are discussed. The Wilcoxon-Mann-Whitney odds directly reflects the quantity that the rank sum procedure actually tests, and thus it can be a superior summary measure. Unlike the means and medians, its value will have a one-to-one correspondence with the Wilcoxon rank sum test result. The companion article appearing in this issue of Anesthesia & Analgesia ("Aromatherapy as Treatment for Postoperative Nausea: A Randomized Trial") illustrates these issues and provides an example of a situation for which the medians imply no difference between 2 groups, even though the groups are, in fact, quite different. The trial cited also provides an example of a single sample that has a median of zero, yet there is a substantial shift for much of the nonzero data, and the Wilcoxon signed rank test is quite significant. These examples highlight the potential discordance between medians and Wilcoxon test results. Along with the issues surrounding the choice of a summary measure, there are considerations for the computation of sample size and power, confidence intervals, and multiple comparison adjustment. In addition, despite the increased robustness of the Wilcoxon procedures relative to parametric tests, some circumstances in which the Wilcoxon tests may perform poorly are noted, along with alternative versions of the procedures that correct for such limitations.
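The Wilcoxon-Mann-Whitney odds discussed above, P(X > Y)/P(X < Y) with ties split evenly, is easy to compute directly. A minimal sketch on hypothetical ordinal scores (not the trial's data), alongside the rank-sum test from scipy:

```python
import numpy as np
from scipy import stats

def wmw_odds(x, y):
    """Wilcoxon-Mann-Whitney odds: P(X > Y) / P(X < Y), ties split evenly."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    diff = x[:, None] - y[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    losses = x.size * y.size - wins
    return wins / losses

# Hypothetical ordinal 0-3 scores: both medians equal 1, yet the groups differ.
a = [0, 1, 1, 1, 1, 2, 3, 3, 3]
b = [0, 0, 1, 1, 1, 1, 1, 2, 2]
u, p = stats.mannwhitneyu(a, b, alternative='two-sided')
print(round(wmw_odds(a, b), 2), round(p, 3))
```

Here the odds of 2.0 say a value from the first group is twice as likely to exceed one from the second as the reverse, a difference the equal medians conceal; the odds are in one-to-one correspondence with the U statistic for given group sizes.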
NASA Astrophysics Data System (ADS)
Gentry, D.; Amador, E. S.; Cable, M. L.; Cantrell, T.; Chaudry, N.; Duca, Z. A.; Jacobsen, M. B.; Kirby, J.; McCaig, H. C.; Murukesan, G.; Rader, E.; Cullen, T.; Rennie, V.; Schwieterman, E. W.; Stevens, A. H.; Sutton, S. A.; Tan, G.; Yin, C.; Cullen, D.; Geppert, W.; Stockton, A. M.
2017-12-01
Studies in planetary analogue sites correlating remote imagery, mineralogy, and biomarker assay results help predict biomarker distribution and preservation. The FELDSPAR team has conducted five expeditions (2012-2017) to Icelandic Mars analogue sites with an increasingly refined battery of physicochemical measurements and biomarker assays. Two additional expeditions are planned; here we report intermediate results. The biomarker assays performed represent a diversity of potential biomarker types: ATP, cell counts, qPCR with domain-level primers, and DNA content. Mineralogical, chemical, and physical measurements and observations include temperature, pH, moisture content, and Raman, near-IR reflectance, and X-ray fluorescence spectra. Sites are geologically recent basaltic lava flows (Fimmvörðuháls, Eldfell, Holuhraun) and barren basaltic sand plains (Mælifellssandur, Dyngjusandur). All samples were 'homogeneous' at the 1 m to 1 km scale in apparent color, morphology, and grain size [1]. Sample locations were arranged in hierarchically nested grids at 10 cm, 1 m, 10 m, 100 m, and >1 km scales. Several measures of spatial distribution and variability were derived: unbiased sample variance, F- and pairwise t-tests with Bonferroni correction, and the non-parametric H- and u-tests. All assay results, including preliminary mineralogical information in the form of notable spectral bands, were then tested for correlation using the non-parametric Spearman's rank test [2]. For Fimmvörðuháls, four years of data were also examined for temporal trends. Biomarker quantification (other than cell count) was generally well correlated, although all assays showed notable variability even at the smallest examined spatial scale. Pairwise comparisons proved to be the most intuitive measure of variability; non-parametric characterization indicated trends at the >100 m scale, but required more replicates than were feasible at smaller scales.
Future work will integrate additional mineralogical measurements and more specialized modeling. [1] Amador, E. S. et al. (2015) Planet. Space Sci., 106 1-10. [2] Gentry, D. M. et al. (2017) Astrobio., in press.
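The correlation step described above can be sketched with scipy's Spearman rank test and a Bonferroni-adjusted threshold; the assay values below are simulated stand-ins, not FELDSPAR data:

```python
import numpy as np
from itertools import combinations
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical per-sample assay results standing in for ATP, qPCR, and DNA assays.
assays = {'ATP': rng.lognormal(0, 1, 30)}
assays['qPCR'] = assays['ATP'] * rng.lognormal(0, 0.3, 30)   # correlated with ATP
assays['DNA'] = rng.lognormal(0, 1, 30)                      # independent

pairs = list(combinations(assays, 2))
alpha = 0.05 / len(pairs)   # Bonferroni correction over all pairwise tests
for a, b in pairs:
    rho, p = stats.spearmanr(assays[a], assays[b])
    print(f'{a}-{b}: rho={rho:+.2f}, significant={p < alpha}')
```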
Benchmark dose analysis via nonparametric regression modeling
Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen
2013-01-01
Estimation of benchmark doses (BMDs) in quantitative risk assessment traditionally is based upon parametric dose-response modeling. It is a well-known concern, however, that if the chosen parametric model is uncertain and/or misspecified, inaccurate and possibly unsafe low-dose inferences can result. We describe a nonparametric approach for estimating BMDs with quantal-response data based on an isotonic regression method, and also study use of corresponding, nonparametric, bootstrap-based confidence limits for the BMD. We explore the confidence limits’ small-sample properties via a simulation study, and illustrate the calculations with an example from cancer risk assessment. It is seen that this nonparametric approach can provide a useful alternative for BMD estimation when faced with the problem of parametric model uncertainty. PMID:23683057
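A minimal sketch of the isotonic-regression route to a BMD, assuming a pool-adjacent-violators fit and linear interpolation to the benchmark response (the paper's actual estimator and bootstrap confidence limits are more involved); the bioassay counts are hypothetical:

```python
import numpy as np

def pava(y, w):
    """Pool-adjacent-violators algorithm: weighted nondecreasing fit."""
    vals, wts, cnts = [], [], []
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); cnts.append(1)
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wt = wts[-2] + wts[-1]
            v = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / wt
            c = cnts[-2] + cnts[-1]
            del vals[-1], wts[-1], cnts[-1]
            vals[-1], wts[-1], cnts[-1] = v, wt, c
    return np.repeat(vals, cnts)

# Hypothetical quantal bioassay: responders out of 50 animals per dose group.
doses = np.array([0., 10., 50., 150., 400.])
n = np.full(5, 50.)
resp = np.array([1., 4., 3., 20., 45.])
p_hat = pava(resp / n, n)                  # monotone fitted response probabilities
extra_risk = (p_hat - p_hat[0]) / (1 - p_hat[0])
bmd = np.interp(0.10, extra_risk, doses)   # dose at a 10% benchmark response
print(round(bmd, 1))
```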
Mathematical models for non-parametric inferences from line transect data
Burnham, K.P.; Anderson, D.R.
1976-01-01
A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right angle or sighting distances. The probability of observing a point given its right angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown that there are nonparametric approaches to density estimation using the observed right angle distances. The model is then generalized to include sighting distances (r). Let f(y | r) be the conditional distribution of right angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires that we know the transformation of r given by f(0 | r).
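In this framework the classical estimator takes the form D = n·f(0)/(2L), where f is the density of right-angle distances and L is transect length. A sketch using a kernel estimate of f(0), with reflection about zero (one possible nonparametric choice, applied to simulated half-normal sightings):

```python
import numpy as np

def density_estimate(y, L):
    """Line-transect density D = n * f(0) / (2L), with f(0) from a Gaussian
    kernel estimate reflected about zero to correct boundary bias at y = 0."""
    y = np.asarray(y, float)
    n = y.size
    h = 1.06 * y.std() * n ** -0.2              # Silverman-style bandwidth
    data = np.concatenate([y, -y])              # reflect sample about the line
    f0 = 2 * np.exp(-(data / h) ** 2 / 2).sum() / (2 * n * h * np.sqrt(2 * np.pi))
    return n * f0 / (2 * L)

rng = np.random.default_rng(1)
# Simulated right-angle distances (m) under a half-normal detection function.
y = np.abs(rng.normal(0, 20.0, 200))
d = density_estimate(y, L=5000.0)               # animals per square metre
print(d)
```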
Zhang, Qingyang
2018-05-16
Differential co-expression analysis, as a complement of differential expression analysis, offers significant insights into the changes in molecular mechanism of different phenotypes. A prevailing approach to detecting differentially co-expressed genes is to compare Pearson's correlation coefficients in two phenotypes. However, due to the limitations of Pearson's correlation measure, this approach lacks the power to detect nonlinear changes in gene co-expression which is common in gene regulatory networks. In this work, a new nonparametric procedure is proposed to search differentially co-expressed gene pairs in different phenotypes from large-scale data. Our computational pipeline consisted of two main steps, a screening step and a testing step. The screening step is to reduce the search space by filtering out all the independent gene pairs using distance correlation measure. In the testing step, we compare the gene co-expression patterns in different phenotypes by a recently developed edge-count test. Both steps are distribution-free and targeting nonlinear relations. We illustrate the promise of the new approach by analyzing the Cancer Genome Atlas data and the METABRIC data for breast cancer subtypes. Compared with some existing methods, the new method is more powerful in detecting nonlinear type of differential co-expressions. The distance correlation screening can greatly improve computational efficiency, facilitating its application to large data sets.
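The distance-correlation screening statistic can be computed from double-centered distance matrices; a self-contained sketch (synthetic data, not the TCGA/METABRIC analysis) showing it detects a nonlinear dependence that Pearson's r misses:

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation (Szekely et al.): near zero only when
    x and y are independent, so it is sensitive to nonlinear dependence."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each pairwise distance matrix.
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(2)
x = rng.normal(size=500)
y = x ** 2 + 0.1 * rng.normal(size=500)   # nonlinear relation, Pearson r near 0
r = np.corrcoef(x, y)[0, 1]
d = distance_correlation(x, y)
print(round(r, 2), round(d, 2))
```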
Observed changes in relative humidity and dew point temperature in coastal regions of Iran
NASA Astrophysics Data System (ADS)
Hosseinzadeh Talaee, P.; Sabziparvar, A. A.; Tabari, Hossein
2012-12-01
The analysis of trends in hydroclimatic parameters and assessment of their statistical significance have recently received great attention in clarifying whether or not there is an obvious climate change. In the current study, the parametric linear regression and nonparametric Mann-Kendall tests were applied for detecting annual and seasonal trends in the relative humidity (RH) and dew point temperature (Tdew) time series at ten coastal weather stations in Iran during 1966-2005. The serial structure of the data was considered, and the significant serial correlations were eliminated using the trend-free pre-whitening method. The results showed that annual RH increased by 1.03 and 0.28 %/decade at the northern and southern coastal regions of the country, respectively, while annual Tdew increased by 0.29 and 0.15°C per decade at the northern and southern regions, respectively. The significant trends were frequent in the Tdew series, but were observed in only 2 of the 50 RH series. The results showed that the difference between the results of the parametric and nonparametric tests was small, although the parametric test detected larger significant trends in the RH and Tdew time series. Furthermore, the differences between the results of the trend tests were not related to the normality of the statistical distribution.
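The Mann-Kendall statistic used here is the sum of signs of all pairwise differences, with a normal approximation for significance; a minimal sketch (no tie correction, simulated series loosely mimicking the reported RH trend):

```python
import numpy as np
from scipy import stats

def mann_kendall(x):
    """Mann-Kendall trend test (no tie correction): returns S and two-sided p."""
    x = np.asarray(x, float)
    n = x.size
    # S = sum over i < j of sign(x_j - x_i)
    s = np.sign(x[None, :] - x[:, None])[np.triu_indices(n, 1)].sum()
    var_s = n * (n - 1) * (2 * n + 5) / 18
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    return s, 2 * stats.norm.sf(abs(z))

rng = np.random.default_rng(3)
years = np.arange(1966, 2006)
# Simulated annual RH with a 1.03 %/decade rise plus noise (illustrative only).
rh = 70 + 0.103 * (years - 1966) + rng.normal(0, 1.0, years.size)
s, p = mann_kendall(rh)
print(s, round(p, 4))
```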
Ren, Wen-Long; Wen, Yang-Jun; Dunwell, Jim M; Zhang, Yuan-Ming
2018-03-01
Although nonparametric methods in genome-wide association studies (GWAS) are robust in quantitative trait nucleotide (QTN) detection, the absence of polygenic background control in single-marker association in genome-wide scans results in a high false positive rate. To overcome this issue, we proposed an integrated nonparametric method for multi-locus GWAS. First, a new model transformation was used to whiten the covariance matrix of the polygenic matrix K and environmental noise. Using the transformed model, the Kruskal-Wallis test along with least angle regression was then used to select all the markers potentially associated with the trait. Finally, all the selected markers were placed into a multi-locus model, their effects were estimated by empirical Bayes, and all the nonzero effects were further identified by a likelihood ratio test for true QTN detection. This method, named pKWmEB, was validated by a series of Monte Carlo simulation studies. As a result, pKWmEB effectively controlled the false positive rate, even though a less stringent significance criterion was adopted. More importantly, pKWmEB retained the high power of the Kruskal-Wallis test, and provided QTN effect estimates. To further validate pKWmEB, we re-analyzed four flowering time related traits in Arabidopsis thaliana, and detected some previously reported genes that were not identified by the other methods.
Serum and Plasma Metabolomic Biomarkers for Lung Cancer.
Kumar, Nishith; Shahjaman, Md; Mollah, Md Nurul Haque; Islam, S M Shahinul; Hoque, Md Aminul
2017-01-01
Metabolomic biomarker detection is very important for drug discovery and early disease prediction in lung cancer. The mortality rate can be decreased if cancer is predicted at an earlier stage. Current diagnostic techniques for lung cancer are not prognostic. However, if we know which metabolites' intensity levels change considerably between cancer and control subjects, then early diagnosis of the disease, as well as drug discovery, becomes easier. Therefore, in this paper we identified the influential plasma and serum blood sample metabolites for lung cancer, and the biomarkers that will be helpful for early disease prediction as well as for drug discovery. To identify the influential metabolites, we considered a parametric and a nonparametric test, namely Student's t-test as the parametric and the Kruskal-Wallis test as the nonparametric test. We also categorized the up-regulated and down-regulated metabolites by the heatmap plot and identified the biomarkers by support vector machine (SVM) classifier and pathway analysis. From our analysis, we obtained 27 influential (p-value<0.05) metabolites from the plasma sample and 13 influential (p-value<0.05) metabolites from the serum sample. According to the importance plot through the SVM classifier, pathway analysis and correlation network analysis, we declared 4 metabolites (taurine, aspartic acid, glutamine and pyruvic acid) as plasma biomarkers and 3 metabolites (aspartic acid, taurine and inosine) as serum biomarkers.
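The two screening tests can be run side by side with scipy; the metabolite intensities below are simulated stand-ins with assumed fold changes, not the study's samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical intensities for a few metabolites in cancer vs. control samples;
# the fold changes are illustrative assumptions, not measured values.
metabolites = {'taurine': 1.8, 'glutamine': 1.5, 'inert': 1.0}
results = {}
for name, fold in metabolites.items():
    control = rng.lognormal(0, 0.4, 40)
    cancer = rng.lognormal(np.log(fold), 0.4, 40)
    t_p = stats.ttest_ind(cancer, control).pvalue      # parametric screen
    kw_p = stats.kruskal(cancer, control).pvalue       # nonparametric screen
    results[name] = (t_p < 0.05, kw_p < 0.05)
print(results)
```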
NASA Astrophysics Data System (ADS)
Herrera-Grimaldi, Pascual; García-Marín, Amanda; Ayuso-Muñoz, José Luís; Flamini, Alessia; Morbidelli, Renato; Ayuso-Ruíz, José Luís
2018-02-01
The increase of air surface temperature at the global scale is a fact, with values of around 0.85 °C since the late nineteenth century. Nevertheless, the increase is not equally distributed over the world, varying from one region to another. Thus, it becomes interesting to study the evolution of temperature indices for a certain area in order to analyse the existence of a climatic trend in it. In this work, monthly temperature time series from two Mediterranean areas are used: the Umbria region in Italy, and the Guadalquivir Valley in southern Spain. For the available stations, six temperature indices (three annual and three monthly) of mean, average maximum and average minimum temperature have been obtained, and the existence of trends has been studied by applying the non-parametric Mann-Kendall test. Both regions show a general increase in all temperature indices, with the pattern of the trends clearer in Spain than in Italy. The Italian area is the only one in which some negative trends are detected. The presence of break points in the temperature series, most of which may be due to natural phenomena, has also been studied by using the non-parametric Pettitt test and the parametric standard normal homogeneity test (SNHT).
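The Pettitt test locates the most probable break point from cumulative rank-sign statistics; a minimal sketch on a simulated series with one mean shift (the parametric SNHT counterpart is omitted):

```python
import numpy as np

def pettitt(x):
    """Pettitt change-point test: returns the most probable break index and
    the approximate two-sided significance probability."""
    x = np.asarray(x, float)
    n = x.size
    sgn = np.sign(x[None, :] - x[:, None])          # sgn[i, j] = sign(x_j - x_i)
    # U_t sums sign(x_j - x_i) over i <= t < j
    u = np.array([sgn[:t + 1, t + 1:].sum() for t in range(n - 1)])
    k = np.abs(u).max()
    t_break = int(np.abs(u).argmax())
    p = 2 * np.exp(-6 * k ** 2 / (n ** 3 + n ** 2))  # approximate significance
    return t_break, min(p, 1.0)

rng = np.random.default_rng(5)
# Simulated annual temperature index with an upward mean shift at mid-series.
series = np.concatenate([rng.normal(15.0, 0.5, 25),
                         rng.normal(16.0, 0.5, 25)])
t_break, p = pettitt(series)
print(t_break, round(p, 5))
```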
Charm: Cosmic history agnostic reconstruction method
NASA Astrophysics Data System (ADS)
Porqueres, Natalia; Ensslin, Torsten A.
2017-03-01
Charm (cosmic history agnostic reconstruction method) reconstructs the cosmic expansion history in the framework of Information Field Theory. The reconstruction is performed via the iterative Wiener filter from an agnostic or from an informative prior. The charm code allows one to test the compatibility of several different data sets with the LambdaCDM model in a non-parametric way.
The effect of monetary incentives on absenteeism: a case study
Charles H. Wolf
1974-01-01
An attendance bonus paid by a wood processing firm was studied to determine its effectiveness in reducing absenteeism. Employees were divided into permanent and short-term groups, and their response to the bonus was studied, using non-parametric tests. The evidence suggested that the incentive favorably influenced the work attendance of only the permanent group....
ERIC Educational Resources Information Center
Dimitrov, Dimiter M.
2007-01-01
The validation of cognitive attributes required for correct answers on binary test items or tasks has been addressed in previous research through the integration of cognitive psychology and psychometric models using parametric or nonparametric item response theory, latent class modeling, and Bayesian modeling. All previous models, each with their…
ERIC Educational Resources Information Center
Samejima, Fumiko
This paper is the final report of a multi-year project sponsored by the Office of Naval Research (ONR) in 1987 through 1990. The main objectives of the research summarized were to: investigate the non-parametric approach to the estimation of the operating characteristics of discrete item responses; revise and strengthen the package computer…
Stark, J.R.; Busch, J.P.; Deters, M.H.
1991-01-01
The Kruskal-Wallis test, a nonparametric statistical technique, indicated that for 12 of the 21 constituents sampled in groups common to all land-use types in the unconfined-drift aquifer, a relation between the concentration of these constituents and land use was found to be statistically significant.
Consequences of ignoring geologic variation in evaluating grazing impacts
Jonathan W. Long; Alvin L. Medina
2006-01-01
The geologic diversity of landforms in the Southwest complicates efforts to evaluate impacts of land uses such as livestock grazing. We examined a research study that evaluated relationships between trout biomass and stream habitat in the White Mountains of east-central Arizona. That study interpreted results of stepwise regressions and a nonparametric test of “grazed...
Hong, Quan Nha; Coutu, Marie-France; Berbiche, Djamal
2017-01-01
The Work Role Functioning Questionnaire (WRFQ) was developed to assess workers' perceived ability to perform job demands and is used to monitor presenteeism. Still, few studies on its validity can be found in the literature. The purpose of this study was to assess the items and factorial composition of the Canadian French version of the WRFQ (WRFQ-CF). Two measurement approaches were used to test the WRFQ-CF: Classical Test Theory (CTT) and non-parametric Item Response Theory (IRT). A total of 352 completed questionnaires were analyzed. Four-factor and three-factor models were tested and showed good fit with 14 items (Root Mean Square Error of Approximation (RMSEA) = 0.06, Standardized Root Mean Square Residual (SRMR) = 0.04, Bentler Comparative Fit Index (CFI) = 0.98) and with 17 items (RMSEA = 0.059, SRMR = 0.048, CFI = 0.98), respectively. Using IRT, 13 problematic items were identified, of which 9 were common with CTT. This study tested different models, with fewer problematic items found in a three-factor model. Using non-parametric IRT and CTT for item purification gave complementary results. IRT is still scarcely used and can be an interesting alternative method to enhance the quality of a measurement instrument. More studies are needed on the WRFQ-CF to refine its items and factorial composition.
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome adverse effects of violations of distributional assumptions on the error terms in random utility functions.
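The proposed test reduces to a standard likelihood ratio comparison because the semi-nonparametric model nests the MNL/MDCEV under test; the mechanics, with purely hypothetical log-likelihood values and degrees of freedom:

```python
from scipy import stats

def lr_test(ll_restricted, ll_full, df):
    """Likelihood ratio test for nested models: 2*(ll_full - ll_restricted)
    is asymptotically chi-square with df extra parameters under the null."""
    lam = 2.0 * (ll_full - ll_restricted)
    return lam, stats.chi2.sf(lam, df)

# Hypothetical fit: an MNL (restricted) vs. a semi-nonparametric expansion
# adding 4 distributional parameters; rejecting favours the flexible form.
lam, p = lr_test(ll_restricted=-2145.3, ll_full=-2130.8, df=4)
print(round(lam, 1), round(p, 5))
```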
Ruiz, Jennifer; Labas, Michele P; Triche, Elizabeth W; Lo, Albert C
2013-12-01
The majority of persons with multiple sclerosis (MS) experience problems with gait, which they characterize as highly disabling impairments that adversely impact their quality of life. Thus, it is crucial to develop effective therapies to improve mobility for these individuals. The purpose of this study was to determine whether combination gait training, using robot-assisted treadmill training followed by conventional body-weight-supported treadmill training within the same session, improved gait and balance in individuals with MS. This study tested combination gait training in 7 persons with MS. The participants were randomized into the immediate therapy group (IT group) or the delayed therapy group (DT group). In phase I of the trial, the IT group received treatment while the DT group served as a concurrent comparison group. In phase II of the trial, the DT group received treatment identical to the treatment received by the IT group in phase I. Outcome measures included the 6-Minute Walk Test (6MWT), the Timed 25-Foot Walk Test, velocity, cadence, and the Functional Reach Test (FRT). Nonparametric statistical techniques were used for analysis. Combination gait training resulted in significantly greater improvements in the 6MWT for the IT group (median change = +59 m) compared with the phase I DT group (median change = -8 m) (P = 0.08) and in the FRT (median change = +3.3 cm in the IT group vs -0.8 cm in the phase I DT group; P = 0.03). Significant overall pre-post improvements following combination gait training were found in the 6MWT (+32 m; P = 0.02) and FRT (+3.3 cm; P = 0.06) for the IT and phase II DT groups combined. Combination gait training, robot-assisted followed by body-weight-supported treadmill training, is feasible and improved 6MWT and FRT distances in persons with MS. Video Abstract available (see Video, Supplemental Digital Content 1, http://links.lww.com/JNPT/A62) for more insights from the authors.
Jonsdottir, Johanna; Bertoni, Rita; Lawo, Michael; Montesano, Angelo; Bowman, Thomas; Gabrielli, Silvia
2018-01-01
The feasibility and preliminary evidence for efficacy of a serious games platform compared to an exergame using the Wii for arm rehabilitation in persons with multiple sclerosis (MS) were investigated. A pilot single-blind randomized (2:1) controlled in-clinic trial was carried out. Sixteen persons with MS participated (mean age 56.8 years (SD 12.3), years since MS onset 19.4 (SD 12.3), EDSS 6.5). Ten participants used a serious games platform (Rehab@Home) while 6 participants played with the commercial Wii platform for four weeks (40 min, 12 sessions/4 weeks). Feasibility and user experience measures were collected. Primary outcomes were the 9 Hole Peg Test (9HPT) and the Box and Block test (BBT). Secondary outcomes were the EQ-5D visual analogue scale (EQ-VAS) and the SF-12. Nonparametric analysis was used to verify changes from pre to post rehabilitation within groups, and treatment effect was verified with the Mann-Whitney U test. The P value was set at 0.10 and clinical improvement was set at 20% improvement from baseline. Serious games were perceived positively in terms of user experience and motivation. There were clinically significant improvements in arm function in the serious games group as measured by the 9HPT (38 to 29.5 s, P = 0.046, > 20%) and BBT (32 to 42 cubes, P = 0.19, > 20%) following the 12 gaming sessions, while the exergame group did not improve on either test (9HPT 34.5 to 41.5 s, P = 0.34; BBT 38.5 to 42 cubes, P = 0.34). Only the exergame group perceived themselves as having improved their health. There was a significant between-groups treatment effect only in perception of health (EQ-VAS) (Z = 1.93, P = 0.06), favouring the exergame group. Virtual reality in a serious gaming approach was feasible and beneficial to arm function of persons with MS, but motivational aspects of the approach may need further attention. Copyright © 2017 Elsevier B.V. All rights reserved.
A Hybrid Index for Characterizing Drought Based on a Nonparametric Kernel Estimator
DOE Office of Scientific and Technical Information (OSTI.GOV)
Huang, Shengzhi; Huang, Qiang; Leng, Guoyong
This study develops a nonparametric multivariate drought index, namely, the Nonparametric Multivariate Standardized Drought Index (NMSDI), by considering the variations of both precipitation and streamflow. Building upon previous efforts in constructing nonparametric multivariate drought indices, we use the nonparametric kernel estimator to derive the joint distribution of precipitation and streamflow, thus providing additional insights in drought index development. The proposed NMSDI is applied in the Wei River Basin (WRB), based on which the drought evolution characteristics are investigated. Results indicate: (1) generally, NMSDI captures the drought onset similarly to the Standardized Precipitation Index (SPI) and drought termination and persistence similarly to the Standardized Streamflow Index (SSFI). The drought events identified by NMSDI match well with historical drought records in the WRB. The performances are also consistent with those of an existing Multivariate Standardized Drought Index (MSDI) at various timescales, confirming the validity of the newly constructed NMSDI in drought detection; (2) an increasing risk of drought has been detected for the past decades, and will persist to a certain extent in the future in most areas of the WRB; (3) the identified change points of annual NMSDI are mainly concentrated in the early 1970s and middle 1990s, coincident with extensive water use and soil reservation practices. This study highlights the nonparametric multivariate drought index, which can be used for drought detection and prediction efficiently and comprehensively.
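The kernel route to a joint drought index can be sketched as follows; the product-Gaussian kernel CDF and the synthetic precipitation-streamflow series are illustrative assumptions, not the NMSDI implementation:

```python
import numpy as np
from scipy import stats

def joint_drought_index(precip, flow):
    """Sketch of a nonparametric multivariate drought index: estimate the
    joint probability P(P <= p, Q <= q) with a Gaussian product-kernel
    smoothed empirical CDF, then map it to a standard normal deviate."""
    n = precip.size
    hp = 1.06 * precip.std() * n ** -0.2
    hq = 1.06 * flow.std() * n ** -0.2
    cdf = np.array([
        (stats.norm.cdf((p - precip) / hp) * stats.norm.cdf((q - flow) / hq)).mean()
        for p, q in zip(precip, flow)
    ])
    return stats.norm.ppf(cdf)   # strongly negative values flag joint drought

rng = np.random.default_rng(6)
precip = rng.gamma(4.0, 20.0, 240)               # monthly precipitation (mm)
flow = 0.5 * precip + rng.gamma(2.0, 10.0, 240)  # correlated streamflow
index = joint_drought_index(precip, flow)
print(round(index.min(), 2), round(index.max(), 2))
```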
Why preferring parametric forecasting to nonparametric methods?
Jabot, Franck
2015-05-07
A recent series of papers by Charles T. Perretti and collaborators have shown that nonparametric forecasting methods can outperform parametric methods in noisy nonlinear systems. Such a situation can arise for two main reasons: the instability of parametric inference procedures in chaotic systems, which can lead to biased parameter estimates, and the discrepancy between the real system dynamics and the modeled one, a problem that Perretti and collaborators call "the true model myth". Should ecologists go on using the demanding parametric machinery when trying to forecast the dynamics of complex ecosystems? Or should they rely on the elegant nonparametric approach that appears so promising? It is argued here that ecological forecasting based on parametric models presents two key comparative advantages over nonparametric approaches. First, the likelihood of parametric forecasting failure can be diagnosed thanks to simple Bayesian model checking procedures. Second, when parametric forecasting is diagnosed to be reliable, forecasting uncertainty can be estimated on virtual data generated with the parametric model fitted to the data. In contrast, nonparametric techniques provide forecasts with unknown reliability. This argumentation is illustrated with the simple theta-logistic model that was previously used by Perretti and collaborators to make their point. It should convince ecologists to stick to standard parametric approaches, until methods have been developed to assess the reliability of nonparametric forecasting. Copyright © 2015 Elsevier Ltd. All rights reserved.
Statistical Methods for Magnetic Resonance Image Analysis with Applications to Multiple Sclerosis
NASA Astrophysics Data System (ADS)
Pomann, Gina-Maria
Multiple sclerosis (MS) is an immune-mediated neurological disease that causes disability and morbidity. In patients with MS, the accumulation of lesions in the white matter of the brain is associated with disease progression and worse clinical outcomes. In the first part of the dissertation, we present methodology to compare the brain anatomy of patients with MS and controls. A nonparametric testing procedure is proposed for testing the null hypothesis that two samples of curves observed at discrete grids and with noise have the same underlying distribution. We propose to decompose the curves using functional principal component analysis of an appropriate mixture process, which we refer to as marginal functional principal component analysis. This approach reduces the dimension of the testing problem in a way that enables the use of traditional nonparametric univariate testing procedures. The procedure is computationally efficient and accommodates different sampling designs. Numerical studies are presented to validate the size and power properties of the test in many realistic scenarios. In these cases, the proposed test is more powerful than its primary competitor. The proposed methodology is illustrated on a state-of-the-art diffusion tensor imaging study, where the objective is to compare white matter tract profiles in healthy individuals and MS patients. In the second part of the thesis, we present methods to study the behavior of MS in the white matter of the brain. Breakdown of the blood-brain barrier in newer lesions is indicative of more active disease-related processes and is a primary outcome considered in clinical trials of treatments for MS. Such abnormalities in active MS lesions are evaluated in vivo using contrast-enhanced structural magnetic resonance imaging (MRI), during which patients receive an intravenous infusion of a costly magnetic contrast agent. In some instances, the contrast agents can have toxic effects.
Recently, local image regression techniques have been shown to have modest performance for assessing the integrity of the blood-brain barrier based on imaging without contrast agents. These models have centered on the problem of cross-sectional classification in which patients are imaged at a single study visit and pre-contrast images are used to predict post-contrast imaging. In this paper, we extend these methods to incorporate historical imaging information, and we find the proposed model to exhibit improved performance. We further develop scan-stratified case-control sampling techniques that reduce the computational burden of local image regression models while respecting the low proportion of the brain that exhibits abnormal vascular permeability. In the third part of this thesis, we present methods to evaluate tissue damage in patients with MS. We propose a lag functional linear model to predict a functional response using multiple functional predictors observed at discrete grids with noise. Two procedures are proposed to estimate the regression parameter functions; 1) a semi-local smoothing approach using generalized cross-validation; and 2) a global smoothing approach using a restricted maximum likelihood framework. Numerical studies are presented to analyze predictive accuracy in many realistic scenarios. We find that the global smoothing approach results in higher predictive accuracy than the semi-local approach. The methods are employed to estimate a measure of tissue damage in patients with MS. In patients with MS, the myelin sheaths around the axons of the neurons in the brain and spinal cord are damaged. The model facilitates the use of commonly acquired imaging modalities to estimate a measure of tissue damage within lesions. The proposed model outperforms the cross-sectional models that do not account for temporal patterns of lesional development and repair.
A Kolmogorov-Smirnov test for the molecular clock based on Bayesian ensembles of phylogenies
Antoneli, Fernando; Passos, Fernando M.; Lopes, Luciano R.
2018-01-01
Divergence date estimates are central to understanding evolutionary processes and depend, in the case of molecular phylogenies, on tests of molecular clocks. Here we propose two non-parametric tests of strict and relaxed molecular clocks built upon a framework that uses the empirical cumulative distribution (ECD) of branch lengths obtained from an ensemble of Bayesian trees and the well-known non-parametric (one-sample and two-sample) Kolmogorov-Smirnov (KS) goodness-of-fit tests. In the strict clock case, the method consists of using the one-sample KS test to directly test if the phylogeny is clock-like, in other words, if it follows a Poisson law. The ECD is computed from the discretized branch lengths and the parameter λ of the expected Poisson distribution is calculated as the average branch length over the ensemble of trees. To compensate for the auto-correlation in the ensemble of trees and pseudo-replication, we take advantage of thinning and effective sample size, two features provided by Bayesian inference MCMC samplers. Finally, it is observed that tree topologies with very long or very short branches lead to Poisson mixtures, and in this case we propose the use of the two-sample KS test with samples from two continuous branch length distributions, one obtained from an ensemble of clock-constrained trees and the other from an ensemble of unconstrained trees. Moreover, in this second form the test can also be applied to test for relaxed clock models. The use of a statistically equivalent ensemble of phylogenies to obtain the branch lengths ECD, instead of one consensus tree, yields considerable reduction of the effects of small sample size and provides a gain of power. PMID:29300759
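The strict-clock statistic described above can be sketched in a few lines: discretize the branch lengths, estimate λ as their average, and take the sup distance between the empirical CDF and the Poisson CDF. This is a minimal illustration of the test statistic only (no MCMC thinning, no effective-sample-size correction, no critical values); the function name and the unit bin width are assumptions.

```python
import numpy as np
from scipy import stats

def strict_clock_ks_statistic(branch_lengths, bin_width=1.0):
    # Discretize branch lengths into integer counts
    counts = np.floor(np.asarray(branch_lengths, dtype=float) / bin_width).astype(int)
    lam = counts.mean()  # expected Poisson parameter = average (discretized) branch length
    support = np.arange(counts.max() + 1)
    # Empirical cumulative distribution (ECD) evaluated on the integer support
    ecd = np.searchsorted(np.sort(counts), support, side="right") / counts.size
    # KS distance between the ECD and the fitted Poisson CDF
    return np.max(np.abs(ecd - stats.poisson.cdf(support, lam)))
```

Clock-like (Poisson) data give a small distance, while an overdispersed alternative with the same mean does not.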
Nonparametric regression applied to quantitative structure-activity relationships
Constans; Hirst
2000-03-01
Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models--a computationally more expedient approach, better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
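The simplest regressor mentioned above, the Nadaraya-Watson estimator, is a locally weighted average: m(x) = Σᵢ K((x − xᵢ)/h) yᵢ / Σᵢ K((x − xᵢ)/h). A minimal one-dimensional sketch with a Gaussian kernel follows; the kernel choice and bandwidth selection used in the paper are not specified here, so both are illustrative assumptions.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    # Gaussian-kernel weights between every query point and every training point
    d = (np.asarray(x_query, float)[:, None]
         - np.asarray(x_train, float)[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)
    # Locally weighted average of the responses
    return (w @ np.asarray(y_train, float)) / w.sum(axis=1)
```

A constant response is reproduced exactly at any query point, a useful sanity check for any kernel regressor.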
Warke, Kim; Al-Smadi, Jamal; Baxter, David; Walsh, Deirdre M; Lowe-Strong, Andrea S
2006-01-01
This study was designed to investigate the hypoalgesic effects of self-applied transcutaneous electrical nerve stimulation (TENS) on chronic low-back pain (LBP) in a multiple sclerosis (MS) population. Ninety participants with probable or definite MS (aged 21 to 78 y) presenting with chronic LBP were recruited and randomized into 3 groups (n=30 per group): (1) low-frequency TENS group (4 Hz, 200 μs); (2) high-frequency TENS group (110 Hz, 200 μs); and (3) placebo TENS. Participants self-applied TENS for 45 minutes, a minimum of twice daily, for 6 weeks. Outcome measures were recorded at weeks 1, 6, 10, and 32. Primary outcome measures included: Visual Analog Scale for average LBP and the McGill Pain Questionnaire. Secondary outcome measures included: Visual Analog Scale for worst and weekly LBP, back and leg spasm; Roland Morris Disability Questionnaire; Barthel Index; Rivermead Mobility Index; Multiple Sclerosis Quality of Life-54 Instrument, and a daily logbook. Data were analyzed blind using parametric and nonparametric tests, as appropriate. Results indicated a statistically significant interactive effect between groups for average LBP (P=0.008); 1-way analysis of covariance did not show any significant effects at any time point once a Bonferroni correction was applied (P>0.05). However, clinically important differences were observed in some of the outcome measures in both active treatment groups during the treatment and follow-up periods. Although not statistically significant, the observed effects may have implications for the clinical prescription and the use of TENS within this population.
Braithwaite, Susan S; Umpierrez, Guillermo E; Chase, J Geoffrey
2013-09-01
Group metrics are described to quantify blood glucose (BG) variability of hospitalized patients. The "multiplicative surrogate standard deviation" (MSSD) is the reverse-transformed group mean of the standard deviations (SDs) of the logarithmically transformed BG data set of each patient. The "geometric group mean" (GGM) is the reverse-transformed group mean of the means of the logarithmically transformed BG data set of each patient. Before reverse transformation is performed, the mean of means and mean of SDs each has its own SD, which becomes a multiplicative standard deviation (MSD) after reverse transformation. Statistical predictions and comparisons of parametric or nonparametric tests remain valid after reverse transformation. A subset of a previously published BG data set of 20 critically ill patients from the first 72 h of treatment under the SPRINT protocol was transformed logarithmically. After rank ordering according to the SD of the logarithmically transformed BG data of each patient, the cohort was divided into two equal groups, those having lower or higher variability. For the entire cohort, the GGM was 106 (÷/× 1.07) mg/dl, and MSSD was 1.24 (÷/× 1.07). For the subgroups having lower and higher variability, respectively, the GGM did not differ, 104 (÷/× 1.07) versus 109 (÷/× 1.07) mg/dl, but the MSSD differed, 1.17 (÷/× 1.03) versus 1.31 (÷/× 1.05), p = .00004. By using the MSSD with its MSD, groups can be characterized and compared according to glycemic variability of individual patient members. © 2013 Diabetes Technology Society.
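The reverse-transformed group metrics described above can be computed directly: take logarithms of each patient's BG series, average the per-patient log means and log SDs across the group, and exponentiate. The ÷/× factors are the multiplicative SDs obtained by exponentiating the SDs of those per-patient quantities. Function and variable names below are assumptions; this is a sketch of the definitions, not the authors' code.

```python
import numpy as np

def glycemic_group_metrics(patients):
    """patients: list of per-patient BG arrays (mg/dl).
    Returns (GGM, MSD of the GGM, MSSD, MSD of the MSSD),
    all on the multiplicative (reverse-transformed) scale."""
    logs = [np.log(np.asarray(p, dtype=float)) for p in patients]
    means = np.array([l.mean() for l in logs])      # per-patient log means
    sds = np.array([l.std(ddof=1) for l in logs])   # per-patient log SDs
    ggm = np.exp(means.mean())                      # geometric group mean
    mssd = np.exp(sds.mean())                       # multiplicative surrogate SD
    msd_of_ggm = np.exp(means.std(ddof=1))          # ÷/× factor for the GGM
    msd_of_mssd = np.exp(sds.std(ddof=1))           # ÷/× factor for the MSSD
    return ggm, msd_of_ggm, mssd, msd_of_mssd
```

A group of patients with no within-patient variability has MSSD exactly 1, the multiplicative analogue of an SD of zero.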
In vivo determination of multiple indices of periodontal inflammation by optical spectroscopy
Liu, KZ; Xiang, XM; Man, A; Sowa, MG; Cholakis, N; Ghiabi, E; Singer, DL; Scott, DA
2008-01-01
Background and Objective Visible – near infrared (optical) spectroscopy can be used to measure regional tissue hemodynamics and edema and, therefore, may represent an ideal tool with which to non-invasively study periodontal inflammation. The study objective was to evaluate the ability of optical spectroscopy to simultaneously determine multiple inflammatory indices (tissue oxygenation, total tissue hemoglobin, deoxyhemoglobin, oxygenated hemoglobin, and tissue edema) in periodontal tissues in vivo. Material and Methods Spectra were obtained, processed, and evaluated from healthy, gingivitis, and periodontitis sites (n = 133) using a portable optical – near infrared spectrometer. A modified Beer-Lambert unmixing model that incorporates a nonparametric scattering loss function was used to determine the relative contribution of each inflammatory component to the overall spectrum. Results Optical spectroscopy was harnessed to successfully generate complex inflammatory profiles in periodontal tissues. Tissue oxygenation at periodontitis sites was significantly decreased (p<0.05) compared to gingivitis and healthy controls. This is largely due to an increase in deoxyhemoglobin in the periodontitis sites compared to healthy (p<0.01) and gingivitis (p=0.05) sites. Tissue water content per se showed no significant difference between the sites, but a water index associated with tissue electrolyte levels and temperature was significantly different between periodontitis sites and both healthy and gingivitis sites (p<0.03). Conclusion This study establishes that optical spectroscopy can simultaneously determine multiple inflammatory indices directly in the periodontal tissues in vivo. Visible - near infrared spectroscopy has the potential to be developed into a simple, reagent-free, user friendly, chair-side, site-specific, diagnostic and prognostic test for periodontitis. PMID:18973538
Kang, Le; Chen, Weijie; Petrick, Nicholas A.; Gallas, Brandon D.
2014-01-01
The area under the receiver operating characteristic (ROC) curve (AUC) is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of AUC, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. PMID:25399736
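Harrell's C counts, among usable pairs (the member with the shorter observed time had the event), the fraction in which the higher predicted risk belongs to the subject with the shorter survival. A brute-force O(n²) sketch of the point estimate is below; the paper's U-statistic variance estimator and z-score test are not reproduced, and the names are assumptions.

```python
import numpy as np

def harrell_c(risk, time, event):
    """Harrell's C for right-censored data. A pair (i, j) is usable when
    time[i] < time[j] and subject i had the event; ties in risk count 1/2."""
    risk = np.asarray(risk, float)
    time = np.asarray(time, float)
    event = np.asarray(event, bool)
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if event[i] and time[i] < time[j]:   # usable (comparable) pair
                den += 1
                if risk[i] > risk[j]:            # concordant
                    num += 1
                elif risk[i] == risk[j]:         # tied risk
                    num += 0.5
    return num / den
```

Perfectly concordant risks give C = 1, perfectly anti-concordant risks give C = 0, and tied risks contribute 1/2.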
Grell, Kathrine; Diggle, Peter J; Frederiksen, Kirsten; Schüz, Joachim; Cardis, Elisabeth; Andersen, Per K
2015-10-15
We study methods for how to include the spatial distribution of tumours when investigating the relation between brain tumours and the exposure from radio frequency electromagnetic fields caused by mobile phone use. Our suggested point process model is adapted from studies investigating spatial aggregation of a disease around a source of potential hazard in environmental epidemiology, where now the source is the preferred ear of each phone user. In this context, the spatial distribution is a distribution over a sample of patients rather than over multiple disease cases within one geographical area. We show how the distance relation between tumour and phone can be modelled nonparametrically and, with various parametric functions, how covariates can be included in the model and how to test for the effect of distance. To illustrate the models, we apply them to a subset of the data from the Interphone Study, a large multinational case-control study on the association between brain tumours and mobile phone use. Copyright © 2015 John Wiley & Sons, Ltd.
Calculating stage duration statistics in multistage diseases.
Komarova, Natalia L; Thalhauser, Craig J
2011-01-01
Many human diseases are characterized by multiple stages of progression. While the typical sequence of disease progression can be identified, there may be large individual variations among patients. Identifying mean stage durations and their variations is critical for statistical hypothesis testing needed to determine if treatment is having a significant effect on the progression, or if a new therapy is showing a delay of progression through a multistage disease. In this paper we focus on two methods for extracting stage duration statistics from longitudinal datasets: an extension of the linear regression technique, and a counting algorithm. Both are non-iterative, non-parametric and computationally cheap methods, which makes them invaluable tools for studying the epidemiology of diseases, with a goal of identifying different patterns of progression by using bioinformatics methodologies. Here we show that the regression method performs well for calculating the mean stage durations under a wide variety of assumptions; however, its generalization to variance calculations fails under realistic assumptions about the data collection procedure. On the other hand, the counting method yields reliable estimations for both means and variances of stage durations. Applications to Alzheimer disease progression are discussed.
Cervical shaping in curved root canals: comparison of the efficiency of two endodontic instruments.
Busquim, Sandra Soares Kühne; dos Santos, Marcelo
2002-01-01
The aim of this study was to determine the removal of dentin produced by number 25 (0.08) Flare files (Quantec Flare Series, Analytic Endodontics, Glendora, California, USA) and numbers 1 and 2 Gates-Glidden burs (Dentsply - Maillefer, Ballaigues, Switzerland), in the mesio-buccal and mesio-lingual root canals, respectively, of extracted human permanent inferior molars, by means of measuring the width of dentinal walls prior to and after instrumentation. The obtained values were compared. Due to the multiple analyses of data, a nonparametric test was used, and the Kruskal-Wallis test was chosen. There was no significant difference between the instruments as to the removal of dentin in the 1st and 2nd millimeters. However, when comparing the performances of the instruments in the 3rd millimeter, Flare files promoted a greater removal than Gates-Glidden drills (p < 0.05). The analysis revealed no significant differences as to mesial wear, which demonstrates the similar behavior of both instruments. Gates-Glidden drills produced an expressive mesial detour in the 2nd and 3rd millimeters, which was detected through a statistically significant difference in the wear of this region (p < 0.05). There was no statistically significant difference between mesial and lateral wear when Flare instruments were employed.
The association of malocclusion and trumpet performance.
Kula, Katherine; Cilingir, H Zeynep; Eckert, George; Dagg, Jack; Ghoneima, Ahmed
2015-04-20
To determine whether trumpet performance skills are associated with malocclusion. Following institutional review board approval, 70 university trumpet students (54 male, 16 female; aged 20-38.9 years) were consented. After completing a survey, the students were evaluated while playing a scripted performance skills test (flexibility, articulation, range, and endurance exercises) on their instrument in a soundproof music practice room. One investigator (trumpet teacher) used a computerized metronome and a decibel meter during evaluation. A three-dimensional (3D) cone-beam computerized tomography scan (CBCT) was taken of each student the same day as the skills test. Following reliability studies, multiple dental parameters were measured on the 3D CBCT. Nonparametric correlations (Spearman), accepting P < .05 as significant, were used to determine if there were significant associations between dental parameters and the performance skills. Intrarater reliability was excellent (intraclass correlations; all r values > .94). Although associations were weak to moderate, significant negative associations (r ≤ -.32) were found between Little's irregularity index, interincisal inclination, maxillary central incisor rotation, and various flexibility and articulation performance skills, whereas significant positive associations (r ≤ .49) were found between arch widths and various skills. Specific malocclusions are associated with trumpet performance of experienced young musicians.
Rasova, Kamila; Prochazkova, Marie; Tintera, Jaroslav; Ibrahim, Ibrahim; Zimova, Denisa; Stetkarova, Ivana
2015-03-01
There is still little scientific evidence for the efficacy of neurofacilitation approaches and their possible influence on brain plasticity and adaptability. In this study, the outcome of a new kind of neurofacilitation approach, motor programme activating therapy (MPAT), was evaluated on the basis of a set of clinical functions and with MRI. Eighteen patients were examined four times with standardized clinical tests and diffusion tensor imaging to monitor changes without therapy, immediately after therapy and 1 month after therapy. Moreover, the strength of effective connectivity was analysed before and after therapy. Patients underwent a 1-h session of MPAT twice a week for 2 months. The data were analysed by nonparametric tests of association and were subsequently statistically evaluated. The therapy led to significant improvement in clinical functions, significant increment of fractional anisotropy and significant decrement of mean diffusivity, and decrement of effective connectivity at supplementary motor areas was observed immediately after the therapy. Changes in clinical functions and diffusion tensor images persisted 1 month after completing the programme. No statistically significant changes in clinical functions and no differences in MRI-diffusion tensor images were observed without physiotherapy. Positive immediate and long-term effects of MPAT on clinical and brain functions, as well as brain microstructure, were confirmed.
Marmarelis, Vasilis Z.; Berger, Theodore W.
2009-01-01
Parametric and non-parametric modeling methods are combined to study the short-term plasticity (STP) of synapses in the central nervous system (CNS). The nonlinear dynamics of STP are modeled by means of: (1) previously proposed parametric models based on mechanistic hypotheses and/or specific dynamical processes, and (2) non-parametric models (in the form of Volterra kernels) that transform the presynaptic signals into postsynaptic signals. In order to synergistically use the two approaches, we estimate the Volterra kernels of the parametric models of STP for four types of synapses using synthetic broadband input–output data. Results show that the non-parametric models accurately and efficiently replicate the input–output transformations of the parametric models. Volterra kernels provide a general and quantitative representation of the STP. PMID:18506609
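A discrete-time second-order Volterra model of the kind used as the non-parametric representation above maps an input sequence through zeroth-, first-, and second-order kernels: y[n] = k0 + Σᵢ k1[i]·x[n−i] + Σᵢⱼ k2[i,j]·x[n−i]·x[n−j]. The sketch below is a minimal forward simulation only (kernel estimation is not shown); the memory length and names are assumptions.

```python
import numpy as np

def volterra2(x, k0, k1, k2):
    """Evaluate a second-order Volterra series; inputs before n = 0 are zero."""
    x = np.asarray(x, dtype=float)
    k1 = np.asarray(k1, dtype=float)   # first-order kernel, length M
    k2 = np.asarray(k2, dtype=float)   # second-order kernel, M x M
    M = len(k1)
    y = np.full(len(x), float(k0))
    for n in range(len(x)):
        # Vector of lagged inputs x[n], x[n-1], ..., x[n-M+1]
        lags = np.array([x[n - i] if n >= i else 0.0 for i in range(M)])
        y[n] += k1 @ lags + lags @ k2 @ lags
    return y
```

With a zero second-order kernel the model reduces to a linear FIR filter, so an impulse input simply reads out the first-order kernel.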
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.
1984-09-28
…variables before simulation of model; search for reality checks; express uncertainty as a probability density distribution. … probability that the software contains errors. This prior is updated as test failure data are accumulated. Only a p of 1 (software known to contain…) …discussed; both parametric and nonparametric versions are presented. It is shown by the author that the bootstrap underlies the jackknife method and…
Nonparametric Bayesian Modeling for Automated Database Schema Matching
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferragut, Erik M; Laska, Jason A
2015-01-01
The problem of merging databases arises in many government and commercial applications. Schema matching, a common first step, identifies equivalent fields between databases. We introduce a schema matching framework that builds nonparametric Bayesian models for each field and compares them by computing the probability that a single model could have generated both fields. Our experiments show that our method is more accurate and faster than the existing instance-based matching algorithms in part because of the use of nonparametric Bayesian models.
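The core comparison above, the probability that a single model could have generated both fields, can be illustrated for categorical fields with a Dirichlet-multinomial marginal likelihood: score a pair of fields by the log Bayes factor of one shared model against two independent models. This is a toy stand-in for the paper's nonparametric Bayesian field models, which the abstract does not describe; the symmetric prior and all names are assumptions.

```python
from collections import Counter
from math import lgamma

def dm_log_marginal(counts, alpha=1.0):
    # Log marginal likelihood of an observed sequence with these category
    # counts under a symmetric Dirichlet(alpha) prior
    k = len(counts)
    n = sum(counts)
    out = lgamma(k * alpha) - lgamma(k * alpha + n)
    for c in counts:
        out += lgamma(alpha + c) - lgamma(alpha)
    return out

def match_score(field_a, field_b):
    # Log Bayes factor: one shared model for both fields vs one model each
    vocab = sorted(set(field_a) | set(field_b))
    ca, cb = Counter(field_a), Counter(field_b)
    a = [ca[v] for v in vocab]
    b = [cb[v] for v in vocab]
    pooled = [ca[v] + cb[v] for v in vocab]
    return dm_log_marginal(pooled) - (dm_log_marginal(a) + dm_log_marginal(b))
```

Fields drawn from similar value distributions score far higher than fields with disjoint values, which is the signal a schema matcher needs.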
To Math or Not to Math: The Algebra-Calculus Pipeline and Postsecondary Mathematics Remediation
ERIC Educational Resources Information Center
Showalter, Daniel A.
2017-01-01
This article reports on a study designed to estimate the effect of high school coursetaking in the algebra-calculus pipeline on the likelihood of placing out of postsecondary remedial mathematics. A nonparametric variant of propensity score analysis was used on a nationally representative data set to remove selection bias and test for an effect…
ERIC Educational Resources Information Center
Douglas, Jeff; Kim, Hae-Rim; Roussos, Louis; Stout, William; Zhang, Jinming
An extensive nonparametric dimensionality analysis of latent structure was conducted on three forms of the Law School Admission Test (LSAT) (December 1991, June 1992, and October 1992) using the DIMTEST model in confirmatory analyses and using DIMTEST, FAC, DETECT, HCA, PROX, and a genetic algorithm in exploratory analyses. Results indicate that…
On evaluating health centers groups in Lisbon and Tagus Valley: efficiency, equity and quality
2013-01-01
Background Bearing in mind the increasing health expenses and their weight in the Portuguese gross domestic product, it is of the utmost importance to evaluate the performance of Primary Health Care providers taking into account efficiency, quality and equity. This paper aims to contribute to a better understanding of the performance of Primary Health Care by measuring it in a Portuguese region (Lisbon and Tagus Valley) and identifying best practices. It also intends to evaluate the quality and equity provided. Methods For the purpose of measuring the efficiency of the health care centers (ACES) the non-parametric full frontier technique of data envelopment analysis (DEA) was adopted. The recent partial frontier method of order-m was also used to estimate the influence of exogenous variables on the efficiency of the ACES. The horizontal equity was investigated by applying the non-parametric Kruskal-Wallis test with multiple comparisons. Moreover, the quality of service was analyzed by using the ratio between the complaints and the total activity of the ACES. Results On the whole, a significant level of inefficiency was observed, although there was a general improvement in efficiency between 2009 and 2010. It was found that nursing was the service with the lowest scores. Concerning the horizontal equity, the analysis showed that there is no evidence of relevant disparities between the different subregions (NUTS III). Concerning the exogenous variables, the purchasing power, the percentage of patients aged 65 years old or older and the population size affect the efficiency negatively. Conclusions This research shows that better usage of the available resources and the creation of a learning network and dissemination of best practices will contribute to improvements in the efficiency of the ACES while maintaining or even improving quality and equity. It was also shown that the market structure does matter when efficiency measurement is addressed. PMID:24359014
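The full-frontier DEA step can be sketched as one linear program per unit. Below is the input-oriented, constant-returns-to-scale (CCR) envelopment form, a common textbook variant: minimize θ such that a convex-cone combination of all units uses at most θ times unit k's inputs while producing at least its outputs. The paper's exact orientation, returns-to-scale assumption, and the order-m partial frontier are not reproduced; names are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def dea_ccr_input(X, Y):
    """Input-oriented CCR efficiency scores.
    X: (n_units, n_inputs), Y: (n_units, n_outputs)."""
    X, Y = np.asarray(X, float), np.asarray(Y, float)
    n, m = X.shape
    s = Y.shape[1]
    scores = []
    for k in range(n):
        c = np.r_[1.0, np.zeros(n)]          # decision vars: [theta, lambda_1..n]
        # Inputs:  X^T lambda - theta * x_k <= 0
        A_in = np.c_[-X[k], X.T]
        # Outputs: -Y^T lambda <= -y_k  (i.e. Y^T lambda >= y_k)
        A_out = np.c_[np.zeros(s), -Y.T]
        res = linprog(c,
                      A_ub=np.vstack([A_in, A_out]),
                      b_ub=np.r_[np.zeros(m), -Y[k]],
                      bounds=[(0, None)] * (n + 1))
        scores.append(res.x[0])
    return np.array(scores)
```

A unit that produces the same output as a peer while using twice the input gets a score of 0.5, i.e. it could radially shrink its inputs by half.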
Uncertainty in determining extreme precipitation thresholds
NASA Astrophysics Data System (ADS)
Liu, Bingjun; Chen, Junfan; Chen, Xiaohong; Lian, Yanqing; Wu, Lili
2013-10-01
Extreme precipitation events are rare and occur mostly on a relatively small and local scale, which makes it difficult to set the thresholds for extreme precipitations in a large basin. Based on the long term daily precipitation data from 62 observation stations in the Pearl River Basin, this study has assessed the applicability of the non-parametric, parametric, and the detrended fluctuation analysis (DFA) methods in determining extreme precipitation threshold (EPT) and the certainty to EPTs from each method. Analyses from this study show the non-parametric absolute critical value method is easy to use, but unable to reflect the difference of spatial rainfall distribution. The non-parametric percentile method can account for the spatial distribution feature of precipitation, but the problem with this method is that the threshold value is sensitive to the size of the rainfall data series and is subject to the selection of a percentile, thus making it difficult to determine reasonable threshold values for a large basin. The parametric method can provide the most apt description of extreme precipitations by fitting extreme precipitation distributions with probability distribution functions; however, selections of probability distribution functions, the goodness-of-fit tests, and the size of the rainfall data series can greatly affect the fitting accuracy. In contrast to the non-parametric and the parametric methods, which are unable to provide information for EPTs with certainty, the DFA method, although involving complicated computational processes, has proven to be the most appropriate method that is able to provide a unique set of EPTs for a large basin with uneven spatio-temporal precipitation distribution.
The consistency of the spatial distribution of DFA-based thresholds with the annual average precipitation, the coefficient of variation (CV), and the coefficient of skewness (CS) for the daily precipitation further proves that EPTs determined by the DFA method are more reasonable and applicable for the Pearl River Basin.
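The non-parametric percentile method criticized above reduces to one line per station: the threshold is a high percentile of wet-day precipitation. A minimal sketch follows; the 95th percentile and the 1 mm wet-day cutoff are illustrative assumptions, and their arbitrariness is exactly the sensitivity the authors point out.

```python
import numpy as np

def percentile_ept(daily_precip, q=95.0, wet_day=1.0):
    # Threshold = q-th percentile of wet-day (>= wet_day mm) precipitation
    p = np.asarray(daily_precip, dtype=float)
    wet = p[p >= wet_day]
    return np.percentile(wet, q)
```

Because the result depends on both q and the length of the record, two stations with different record lengths can receive inconsistent thresholds, which motivates the DFA alternative.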
Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob
2016-08-01
The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks and mixtures of Gaussian time series data with known interactions. It is applied to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a novel non-parametric alternative for estimating directed interactions in multivariate neuronal recordings, with increased flexibility in dealing with both spike train and time series data. Copyright © 2016 Elsevier B.V. All rights reserved.
Multiple imputation in the presence of non-normal data.
Lee, Katherine J; Carlin, John B
2017-02-20
Multiple imputation (MI) is becoming increasingly popular for handling missing data. Standard approaches for MI assume normality for continuous variables (conditionally on the other variables in the imputation model). However, it is unclear how to impute non-normally distributed continuous variables. Using simulation and a case study, we compared various transformations applied prior to imputation, including a novel non-parametric transformation, to imputation on the raw scale and using predictive mean matching (PMM) when imputing non-normal data. We generated data from a range of non-normal distributions, and set 50% to missing completely at random or missing at random. We then imputed missing values on the raw scale, following a zero-skewness log, Box-Cox or non-parametric transformation and using PMM with both type 1 and 2 matching. We compared inferences regarding the marginal mean of the incomplete variable and the association with a fully observed outcome. We also compared results from these approaches in the analysis of depression and anxiety symptoms in parents of very preterm compared with term-born infants. The results provide novel empirical evidence that the decision regarding how to impute a non-normal variable should be based on the nature of the relationship between the variables of interest. If the relationship is linear in the untransformed scale, transformation can introduce bias irrespective of the transformation used. However, if the relationship is non-linear, it may be important to transform the variable to accurately capture this relationship. A useful alternative is to impute the variable using PMM with type 1 matching. Copyright © 2016 John Wiley & Sons, Ltd.
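A stripped-down sketch of predictive mean matching for a single incomplete variable: regress the observed values on the covariate, then replace each missing value with the observed value of the donor whose predicted mean is closest. Proper MI would add a Bayesian draw of the regression coefficients (type 1 matching uses drawn coefficients for the missing cases and least-squares coefficients for the donors) and would produce multiple imputed datasets; both refinements are omitted here, and the names are assumptions.

```python
import numpy as np

def pmm_nearest_donor(x, y):
    """Single-donor predictive mean matching sketch.
    y contains np.nan for missing entries; x is fully observed."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    obs = ~np.isnan(y)
    X = np.c_[np.ones(len(x)), x]
    # Least-squares fit on the complete cases
    beta, *_ = np.linalg.lstsq(X[obs], y[obs], rcond=None)
    pred = X @ beta
    out = y.copy()
    for i in np.where(~obs)[0]:
        # Donor = observed case with the closest predicted mean
        donor = np.argmin(np.abs(pred[obs] - pred[i]))
        out[i] = y[obs][donor]
    return out
```

Because imputed values are always observed donor values, PMM cannot produce implausible values outside the observed range, which is why it is robust to non-normality.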
Gini estimation under infinite variance
NASA Astrophysics Data System (ADS)
Fontanari, Andrea; Taleb, Nassim Nicholas; Cirillo, Pasquale
2018-07-01
We study the problems related to the estimation of the Gini index in presence of a fat-tailed data generating process, i.e. one in the stable distribution class with finite mean but infinite variance (i.e. with tail index α ∈(1 , 2)). We show that, in such a case, the Gini coefficient cannot be reliably estimated using conventional nonparametric methods, because of a downward bias that emerges under fat tails. This has important implications for the ongoing discussion about economic inequality. We start by discussing how the nonparametric estimator of the Gini index undergoes a phase transition in the symmetry structure of its asymptotic distribution, as the data distribution shifts from the domain of attraction of a light-tailed distribution to that of a fat-tailed one, especially in the case of infinite variance. We also show how the nonparametric Gini bias increases with lower values of α. We then prove that maximum likelihood estimation outperforms nonparametric methods, requiring a much smaller sample size to reach efficiency. Finally, for fat-tailed data, we provide a simple correction mechanism to the small sample bias of the nonparametric estimator based on the distance between the mode and the mean of its asymptotic distribution.
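The conventional nonparametric estimator discussed above is the sample mean absolute difference scaled by twice the mean, computable in O(n log n) from the sorted data. A minimal sketch follows; the paper's small-sample, fat-tail bias correction is not included, and the function name is an assumption.

```python
import numpy as np

def gini_nonparametric(x):
    # Standard nonparametric Gini estimator via the sorted-data identity:
    # G = 2 * sum(i * x_(i)) / (n * sum(x)) - (n + 1) / n
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    return 2.0 * np.sum(i * x) / (n * np.sum(x)) - (n + 1.0) / n
```

Perfect equality gives 0; the two-point distribution with one person holding everything gives 0.5 for n = 2, approaching 1 as n grows.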
Carmignani, Lucio O; Pedro, Adriana Orcesi; Montemor, Eliana B; Arias, Victor A; Costa-Paiva, Lucia H; Pinto-Neto, Aarão M
2015-07-01
This study aims to compare the effects of a soy-based dietary supplement, low-dose hormone therapy (HT), and placebo on the urogenital system in postmenopausal women. In this double-blind, randomized, placebo-controlled trial, 60 healthy postmenopausal women aged 40 to 60 years (mean time since menopause, 4.1 y) were randomized into three groups: a soy dietary supplement group (90 mg of isoflavone), a low-dose HT group (1 mg of estradiol plus 0.5 mg of norethisterone), and a placebo group. Urinary, vaginal, and sexual complaints were evaluated using the urogenital subscale of the Menopause Rating Scale. Vaginal maturation value was calculated. Transvaginal sonography was performed to evaluate endometrial thickness. Genital bleeding pattern was assessed. Statistical analysis was performed using the χ² test, Fisher's exact test, paired Student's t test, the Kruskal-Wallis nonparametric test, and analysis of variance. For intergroup comparisons, the Kruskal-Wallis nonparametric test (followed by the Mann-Whitney U test) was used. Vaginal dryness improved significantly in the soy and HT groups (P = 0.04). Urinary and sexual symptoms did not change with treatment in the three groups. After 16 weeks of treatment, there was a significant increase in maturation value only in the HT group (P < 0.01). Vaginal pH decreased only in this group (P < 0.01). There were no statistically significant differences in endometrial thickness between the three groups, and the adverse effects evaluated were similar. This study shows that a soy-based dietary supplement used for 16 weeks fails to exert estrogenic action on the urogenital tract but improves vaginal dryness.
Efficient fault diagnosis of helicopter gearboxes
NASA Technical Reports Server (NTRS)
Chin, H.; Danai, K.; Lewicki, D. G.
1993-01-01
Application of a diagnostic system to a helicopter gearbox is presented. The diagnostic system is a nonparametric pattern classifier that uses a multi-valued influence matrix (MVIM) as its diagnostic model and benefits from a fast learning algorithm that enables it to estimate its diagnostic model from a small number of measurement-fault data. To test this diagnostic system, vibration measurements were collected from a helicopter gearbox test stand during accelerated fatigue tests and at various fault instances. The diagnostic results indicate that the MVIM system can accurately detect and diagnose various gearbox faults so long as they are included in training.
Goodness-of-Fit Tests and Nonparametric Adaptive Estimation for Spike Train Analysis
2014-01-01
When dealing with classical spike train analysis, the practitioner often performs goodness-of-fit tests to test whether the observed process is a Poisson process, for instance, or if it obeys another type of probabilistic model (Yana et al. in Biophys. J. 46(3):323–330, 1984; Brown et al. in Neural Comput. 14(2):325–346, 2002; Pouzat and Chaffiol in Technical report, http://arxiv.org/abs/arXiv:0909.2785, 2009). In doing so, there is a fundamental plug-in step, where the parameters of the supposed underlying model are estimated. The aim of this article is to show that plug-in has sometimes very undesirable effects. We propose a new method based on subsampling to deal with those plug-in issues in the case of the Kolmogorov–Smirnov test of uniformity. The method relies on the plug-in of good estimates of the underlying model that have to be consistent with a controlled rate of convergence. Some nonparametric estimates satisfying those constraints in the Poisson or in the Hawkes framework are highlighted. Moreover, they share adaptive properties that are useful from a practical point of view. We show the performance of those methods on simulated data. We also provide a complete analysis with these tools on single unit activity recorded on a monkey during a sensory-motor task. Electronic Supplementary Material The online version of this article (doi:10.1186/2190-8567-4-3) contains supplementary material. PMID:24742008
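For the homogeneous Poisson null, the uniformity test mentioned above reduces to a Kolmogorov-Smirnov test on normalized spike times (conditional on the spike count, homogeneous-Poisson arrivals are i.i.d. uniform on the observation window). This minimal sketch assumes a known window; the inhomogeneous case requires time-rescaling with an estimated intensity, which is exactly the plug-in step the paper warns about and addresses with subsampling.

```python
import numpy as np
from scipy import stats

def ks_uniformity(spike_times, t_end):
    """KS test of the homogeneous Poisson null: conditional on the spike
    count, spike times on [0, t_end] are i.i.d. Uniform(0, t_end)."""
    u = np.asarray(spike_times, dtype=float) / t_end
    return stats.kstest(u, "uniform")

# Demo on simulated homogeneous Poisson spikes (rate and window length
# are illustrative, not taken from the paper's data).
rng = np.random.default_rng(1)
t_end, rate = 100.0, 5.0
times = np.sort(rng.uniform(0.0, t_end, size=rng.poisson(rate * t_end)))
res = ks_uniformity(times, t_end)
print(res.statistic)
```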
A hybrid method in combining treatment effects from matched and unmatched studies.
Byun, Jinyoung; Lai, Dejian; Luo, Sheng; Risser, Jan; Tung, Betty; Hardy, Robert J
2013-12-10
The most common data structures in biomedical studies have been matched or unmatched designs. Data structures resulting from a hybrid of the two may create challenges for statistical inferences. The question may arise whether to use parametric or nonparametric methods on the hybrid data structure. The Early Treatment for Retinopathy of Prematurity study was a multicenter clinical trial sponsored by the National Eye Institute. The design produced data requiring a statistical method of a hybrid nature. An infant in this multicenter randomized clinical trial had high-risk prethreshold retinopathy of prematurity that was eligible for treatment in one or both eyes at entry into the trial. During follow-up, recognition visual acuity was assessed for both eyes. Data from both eyes (matched) and from only one eye (unmatched) were eligible to be used in the trial. The new hybrid nonparametric method is a meta-analysis based on combining the Hodges-Lehmann estimates of treatment effects from the Wilcoxon signed rank and rank sum tests. To compare the new method, we used the classic meta-analysis with the t-test method to combine estimates of treatment effects from the paired and two-sample t tests. We used simulations to calculate the empirical size and power of the test statistics, as well as the bias, mean square error, and confidence interval width of the corresponding estimators. The proposed method provides an effective tool to evaluate data from clinical trials and similar comparative studies. Copyright © 2013 John Wiley & Sons, Ltd.
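The two Hodges-Lehmann estimators being combined have direct pairwise definitions, sketched below. The meta-analytic weighting used to combine them is the paper's own contribution and is not reproduced here.

```python
import numpy as np

def hodges_lehmann_paired(diffs):
    """HL estimator tied to the Wilcoxon signed-rank test: the median of
    all Walsh averages (d_i + d_j) / 2 over pairs i <= j."""
    d = np.asarray(diffs, dtype=float)
    i, j = np.triu_indices(d.size)
    return np.median((d[i] + d[j]) / 2.0)

def hodges_lehmann_two_sample(x, y):
    """HL estimator tied to the Wilcoxon rank-sum test: the median of
    all pairwise differences x_i - y_j."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.median((x[:, None] - y[None, :]).ravel())
```

Both estimators are robust shift estimates: the paired version applies to matched-eye differences, the two-sample version to unmatched eyes.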
Ikeya, Yoshimori; Fukuyama, Naoto; Mori, Hidezo
2015-03-01
N-3 fatty acids, including eicosapentaenoic acid (EPA), prevent ischemic stroke. The preventive effect has been attributed to an antithrombic effect induced by elevated EPA and reduced arachidonic acid (AA) levels. However, the relationship between intracranial hemorrhage and N-3 fatty acids has not yet been elucidated. In this cross-sectional study, we compared common clinical and lifestyle parameters between 70 patients with intracranial hemorrhages and 66 control subjects. The parameters included blood chemistry data, smoking, alcohol intake, fish consumption, and the incidences of underlying diseases. The comparisons were performed using the Mann-Whitney U test followed by multiple logistic regression analysis. Nonparametric tests revealed that the 70 patients with intracerebral hemorrhages exhibited significantly higher diastolic blood pressures and alcohol intakes and lower body mass indices, high-density lipoprotein (HDL) cholesterol levels, EPA concentrations, EPA/AA ratios, and vegetable consumption compared with the 66 control subjects. A multiple logistic regression analysis revealed that higher diastolic blood pressure and alcohol intake and lower body mass index, HDL cholesterol, EPA/AA ratio, and vegetable consumption were relative risk factors for intracerebral hemorrhage. High HDL cholesterol was a common risk factor in both of the sex-segregated subgroups and the <65-year-old subgroup. However, neither EPA nor the EPA/AA ratio was a risk factor in these subgroups. Eicosapentaenoic acid was a relative risk factor only in the ≥65-year-old subgroup. Rather than higher EPA levels, lower EPA concentrations and EPA/AA ratios were found to be risk factors for intracerebral hemorrhage in addition to previously known risk factors such as blood pressure, alcohol consumption, and lifestyle. Copyright © 2015 Elsevier Inc. All rights reserved.
Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system
NASA Astrophysics Data System (ADS)
Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.
2018-02-01
In this paper we design a nonparametric method for failure diagnosis in the aircraft control system that uses only measurements of the control signals and the aircraft states. It does not require a priori information about the aircraft model parameters, training, or statistical calculations, and is based on an analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and faulty dynamic systems, to relax the requirements on the control signals, and to reduce the diagnostic time and problem complexity.
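The paper's analytical predictor is not reproduced here. As a rough stand-in for the general idea, the sketch below fits a sliding-window least-squares one-step-ahead predictor from measured states and controls and flags samples whose prediction residual is abnormally large; the window size, the threshold rule, and the linear-map form are all arbitrary assumptions for illustration.

```python
import numpy as np

def residual_fault_flags(states, controls, window=20, thresh=3.0):
    """Flag samples whose one-step-ahead prediction residual is large.

    At each step k we fit x_{k+1} ~ [x_k, u_k, 1] @ W by least squares
    on the previous `window` transitions (no a priori model parameters)
    and compare the prediction error against `thresh` times its running
    median.  This is an illustrative stand-in, not the paper's method.
    """
    X = np.column_stack([states[:-1], controls[:-1], np.ones(len(states) - 1)])
    Y = states[1:]
    flags = np.zeros(len(Y), dtype=bool)
    residuals = np.full(len(Y), np.nan)
    for k in range(window, len(Y)):
        W, *_ = np.linalg.lstsq(X[k - window:k], Y[k - window:k], rcond=None)
        residuals[k] = np.abs(Y[k] - X[k] @ W)
        med = np.nanmedian(residuals[window:k + 1])
        flags[k] = residuals[k] > thresh * max(med, 1e-9)
    return flags

# Demo: a linear plant with an injected jump fault at step 60.
rng = np.random.default_rng(2)
u = rng.normal(size=100)
x = np.zeros(100)
for k in range(99):
    x[k + 1] = 0.9 * x[k] + 0.5 * u[k]
x[60] += 5.0  # injected fault
flags = residual_fault_flags(x, u)
print(np.flatnonzero(flags)[:3])
```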
Prentice, Ross L; Zhao, Shanshan
2018-01-01
The Dabrowska (Ann Stat 16:1475-1489, 1988) product integral representation of the multivariate survivor function is extended, leading to a nonparametric survivor function estimator for an arbitrary number of failure time variates that has a simple recursive formula for its calculation. Empirical process methods are used to sketch proofs for this estimator's strong consistency and weak convergence properties. Summary measures of pairwise and higher-order dependencies are also defined and nonparametrically estimated. Simulation evaluation is given for the special case of three failure time variates.
NASA Astrophysics Data System (ADS)
Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar
2015-06-01
In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed by considering the univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], the analysis has been further extended to the multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density function for all flood variables, the concept of the copula has been introduced. Although the advancement from univariate to multivariate analyses drew formidable attention from the FFA research community, the basic limitation was that the analyses were performed with the implementation of only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; a nonparametric distribution may not always be a good fit or capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, nonparametric distributions offer the potential of a better fit because they reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of combining the multivariate nonparametric approach with multivariate parametric and copula-based approaches, resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, this approach can also be applied to regional FFA because regional estimations ideally include at-site estimations.
The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution from a comprehensive set of parametric and nonparametric distributions for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from the Allegheny River at Salamanca, New York, USA. The results show that for both the univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over the continental USA; for seven rivers, all flood variables follow the nonparametric Gaussian kernel, whereas for the other rivers parametric distributions provide the best fit for one or two flood variables. In summary, the nonparametric method cannot substitute for the parametric and copula-based approaches, but it should be considered during any at-site FFA to provide the broadest choice for best estimation of the flood return periods.
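Step (ii), comparing a parametric candidate with a nonparametric Gaussian-kernel fit for a flood variable, might look like the following sketch on synthetic annual peaks. The Gumbel choice, the parameter values, and the KS-distance comparison are assumptions for illustration, not the paper's exact selection procedure.

```python
import numpy as np
from scipy import stats

# Synthetic 110-year annual peak-flow record (Gumbel, arbitrary params).
rng = np.random.default_rng(3)
peaks = stats.gumbel_r.rvs(loc=1000.0, scale=300.0, size=110, random_state=rng)

# Parametric candidate: Gumbel fitted by maximum likelihood.
loc, scale = stats.gumbel_r.fit(peaks)
ks_param = stats.kstest(peaks, stats.gumbel_r(loc, scale).cdf).statistic

# Nonparametric candidate: Gaussian kernel density estimate, with its
# CDF approximated numerically on a fine grid.
kde = stats.gaussian_kde(peaks)
grid = np.linspace(peaks.min() - 3 * peaks.std(),
                   peaks.max() + 3 * peaks.std(), 2000)
cdf_grid = np.cumsum(kde(grid))
cdf_grid /= cdf_grid[-1]
ks_kde = stats.kstest(peaks, lambda q: np.interp(q, grid, cdf_grid)).statistic

print(ks_param, ks_kde)  # smaller KS distance = better fit
```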
Modeling Non-Gaussian Time Series with Nonparametric Bayesian Model.
Xu, Zhiguang; MacEachern, Steven; Xu, Xinyi
2015-02-01
We present a class of Bayesian copula models whose major components are the marginal (limiting) distribution of a stationary time series and the internal dynamics of the series. We argue that these are the two features with which an analyst is typically most familiar, and hence that these are natural components with which to work. For the marginal distribution, we use a nonparametric Bayesian prior distribution along with a cdf-inverse cdf transformation to obtain large support. For the internal dynamics, we rely on the traditionally successful techniques of normal-theory time series. Coupling the two components gives us a family of (Gaussian) copula transformed autoregressive models. The models provide coherent adjustments of time scales and are compatible with many extensions, including changes in volatility of the series. We describe basic properties of the models, show their ability to recover non-Gaussian marginal distributions, and use a GARCH modification of the basic model to analyze stock index return series. The models are found to provide better fit and improved short-range and long-range predictions than Gaussian competitors. The models are extensible to a large variety of fields, including continuous time models, spatial models, models for multiple series, models driven by external covariate streams, and non-stationary models.
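The cdf-inverse-cdf coupling described above can be illustrated with a fixed (rather than nonparametric Bayesian) marginal: a stationary Gaussian AR(1) supplies the internal dynamics on the copula scale, and an exponential marginal is imposed by transformation. The AR coefficient and marginal choice below are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

# Latent z_t: stationary Gaussian AR(1) with unit marginal variance.
rng = np.random.default_rng(4)
phi, n = 0.8, 5000
z = np.zeros(n)
for t in range(1, n):
    z[t] = phi * z[t - 1] + np.sqrt(1.0 - phi**2) * rng.normal()

# Observed y_t = F^{-1}(Phi(z_t)): exponential marginal with mean 2,
# while the dependence structure remains that of the Gaussian AR(1).
y = stats.expon.ppf(stats.norm.cdf(z), scale=2.0)

print(y.mean(), np.corrcoef(z[:-1], z[1:])[0, 1])
```

The same recipe works in reverse for estimation: transform the data through the (estimated) marginal cdf and the standard normal quantile function, then fit normal-theory time-series machinery to the transformed series.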
A Comparison of Japan and U.K. SF-6D Health-State Valuations Using a Non-Parametric Bayesian Method.
Kharroubi, Samer A
2015-08-01
There is interest in the extent to which valuations of health may differ between different countries and cultures, but few studies have compared preference values of health states obtained in different countries. We sought to estimate and compare two directly elicited valuations for SF-6D health states between the Japan and U.K. general adult populations using Bayesian methods. We analysed data from two SF-6D valuation studies where, using similar standard gamble protocols, values for 241 and 249 states were elicited from representative samples of the Japan and U.K. general adult populations, respectively. We estimate a function applicable across both countries that explicitly accounts for the differences between them, and is estimated using data from both countries. The results suggest that differences in SF-6D health-state valuations between the Japan and U.K. general populations are potentially important. The magnitude of these country-specific differences in health-state valuation depended, however, in a complex way on the levels of individual dimensions. The new Bayesian non-parametric method is a powerful approach for analysing data from multiple nationalities or ethnic groups, to understand the differences between them and potentially to estimate the underlying utility functions more efficiently.
Predicting Market Impact Costs Using Nonparametric Machine Learning Models.
Park, Saerom; Lee, Jaewook; Son, Youngdoo
2016-01-01
Market impact cost is the most significant portion of implicit transaction costs that can reduce the overall transaction cost, although it cannot be measured directly. In this paper, we employed state-of-the-art nonparametric machine learning models: neural networks, Bayesian neural network, Gaussian process, and support vector regression, to predict market impact cost accurately and to provide a predictive model that is versatile in the number of variables. We collected a large amount of real single-transaction data of the US stock market from the Bloomberg Terminal and generated three independent input variables. As a result, most nonparametric machine learning models outperformed a state-of-the-art benchmark parametric model, the I-star model, in four error measures. Although these models encounter certain difficulties in separating the permanent and temporary cost directly, nonparametric machine learning models can be good alternatives for reducing transaction costs by considerably improving prediction performance.
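A hedged sketch of the regression setup with one of the named models (support vector regression) follows. The three input variables and the impact formula below are synthetic stand-ins invented for illustration; the paper's actual Bloomberg inputs and the I-star benchmark are not reproduced.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Synthetic stand-ins for order size, volatility, and participation
# rate; variables and functional form are assumptions, not the paper's.
rng = np.random.default_rng(5)
n = 2000
size = rng.lognormal(0.0, 1.0, n)
vol = rng.lognormal(-2.0, 0.5, n)
rate = rng.uniform(0.01, 0.3, n)
impact = 0.5 * vol * np.sqrt(size) * rate**0.6 + rng.normal(0.0, 0.01, n)

X = np.column_stack([size, vol, rate])
X_tr, X_te, y_tr, y_te = train_test_split(X, impact, random_state=0)

# Scale features before the RBF-kernel SVR, then fit and score.
model = make_pipeline(StandardScaler(), SVR(C=10.0, epsilon=0.001))
model.fit(X_tr, y_tr)
mae = mean_absolute_error(y_te, model.predict(X_te))
print(mae)
```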
Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend; Dørup, Jens; Lous, Jørgen
2016-01-01
We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care centers. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also associated with the parents' and day care teachers' reports on the children's vocabulary, sentence construction, and pronunciation. Type B tympanograms were associated with a mean hearing level 5-6 dB below that of type A, C1, or C2 tympanograms. In the graphic analysis, Galker scores were closely and significantly related to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding of spoken words in noise; it correlated well with the day care teachers' reports and less with the parents' reports. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Chiu, Chun-Huo; Wang, Yi-Ting; Walther, Bruno A; Chao, Anne
2014-09-01
It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators. © 2014, The International Biometric Society.
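The two lower bounds can be written down directly from the frequency counts. The Chao1 form is standard; the improved-bound (iChao1) correction below follows the form reported in Chiu et al. (2014) using tripletons and quadrupletons, and should be checked against the paper before serious use.

```python
import numpy as np
from collections import Counter

def chao1(abundances):
    """Traditional nonparametric lower bound (Chao 1984) for species
    richness from abundance data: S_obs + f1^2 / (2 f2)."""
    f = Counter(a for a in abundances if a > 0)
    s_obs = sum(f.values())
    f1, f2 = f.get(1, 0), f.get(2, 0)
    if f2 == 0:
        return s_obs + f1 * (f1 - 1) / 2.0  # bias-corrected form
    return s_obs + f1**2 / (2.0 * f2)

def ichao1(abundances):
    """Improved lower bound: corrects the first-order bias of Chao1
    using tripletons (f3) and quadrupletons (f4), per Chiu et al. 2014
    (formula transcribed here as an assumption; verify against paper)."""
    f = Counter(a for a in abundances if a > 0)
    f1, f2, f3, f4 = (f.get(k, 0) for k in (1, 2, 3, 4))
    base = chao1(abundances)
    if f4 == 0:
        return base
    return base + (f3 / (4.0 * f4)) * max(f1 - f2 * f3 / (2.0 * f4), 0.0)
```

For example, with species counts [1, 1, 2, 2, 3, 4, 5] we have S_obs = 7, f1 = f2 = 2, f3 = f4 = 1, giving Chao1 = 8 and a slightly larger improved bound.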
Kang, Le; Chen, Weijie; Petrick, Nicholas A; Gallas, Brandon D
2015-02-20
The area under the receiver operating characteristic curve is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of the area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. Copyright © 2014 John Wiley & Sons, Ltd.
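Harrell's C itself, the quantity being compared, has a direct pairwise definition: among usable pairs (those where the earlier time is an observed event), count the fraction in which the higher risk score accompanies the shorter survival time. The sketch below shows only this point estimate; the paper's U-statistic variance estimator and z-test are not reproduced.

```python
import numpy as np

def harrell_c(time, event, score):
    """Harrell's C index for right-censored data.  A pair (i, j) is
    usable when time[i] < time[j] and subject i's event was observed;
    ties in score count 1/2."""
    time, event, score = map(np.asarray, (time, event, score))
    num = den = 0.0
    n = len(time)
    for i in range(n):
        for j in range(n):
            if time[i] < time[j] and event[i]:  # i fails first, observed
                den += 1
                if score[i] > score[j]:
                    num += 1
                elif score[i] == score[j]:
                    num += 0.5
    return num / den
```

A perfectly concordant biomarker (higher score, shorter survival) gives C = 1; a perfectly discordant one gives C = 0; an uninformative one hovers near 0.5.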
Nonparametric Analyses of Log-Periodic Precursors to Financial Crashes
NASA Astrophysics Data System (ADS)
Zhou, Wei-Xing; Sornette, Didier
We apply two nonparametric methods to further test the hypothesis that log-periodicity characterizes the detrended price trajectory of large financial indices prior to financial crashes or strong corrections. The term "parametric" refers here to the use of the log-periodic power law formula to fit the data; in contrast, "nonparametric" refers to the use of general tools such as Fourier transform, and in the present case the Hilbert transform and the so-called (H, q)-analysis. The analysis using the (H, q)-derivative is applied to seven time series ending with the October 1987 crash, the October 1997 correction and the April 2000 crash of the Dow Jones Industrial Average (DJIA), the Standard & Poor 500 and Nasdaq indices. The Hilbert transform is applied to two detrended price time series in terms of the ln(tc-t) variable, where tc is the time of the crash. Taking all results together, we find strong evidence for a universal fundamental log-frequency f=1.02±0.05 corresponding to the scaling ratio λ=2.67±0.12. These values are in very good agreement with those obtained in earlier works with different parametric techniques. This note is extracted from a long unpublished report with 58 figures available at , which extensively describes the evidence we have accumulated on these seven time series, in particular by presenting all relevant details so that the reader can judge for himself or herself the validity and robustness of the results.
Fushing, Hsieh; Jordà, Òscar; Beisner, Brianne; McCowan, Brenda
2015-01-01
What do the behavior of monkeys in captivity and the financial system have in common? The nodes in such social systems relate to each other through multiple and keystone networks, not just one network. Each network in the system has its own topology, and the interactions among the system’s networks change over time. In such systems, the lead into a crisis appears to be characterized by a decoupling of the networks from the keystone network. This decoupling can also be seen in the crumbling of the keystone’s power structure toward a more horizontal hierarchy. This paper develops nonparametric methods for describing the joint model of the latent architecture of interconnected networks in order to describe this process of decoupling, and hence provide an early warning system of an impending crisis. PMID:26056422
Falk, Carl F; Cai, Li
2016-06-01
We present a semi-parametric approach to estimating item response functions (IRF) useful when the true IRF does not strictly follow commonly used functions. Our approach replaces the linear predictor of the generalized partial credit model with a monotonic polynomial. The model includes the regular generalized partial credit model at the lowest order polynomial. Our approach extends Liang's (A semi-parametric approach to estimate IRFs, Unpublished doctoral dissertation, 2007) method for dichotomous item responses to the case of polytomous data. Furthermore, item parameter estimation is implemented with maximum marginal likelihood using the Bock-Aitkin EM algorithm, thereby facilitating multiple group analyses useful in operational settings. Our approach is demonstrated on both educational and psychological data. We present simulation results comparing our approach to more standard IRF estimation approaches and other non-parametric and semi-parametric alternatives.
Causal inference and the data-fusion problem
Bareinboim, Elias; Pearl, Judea
2016-01-01
We review concepts, principles, and tools that unify current approaches to causal analysis and attend to new challenges presented by big data. In particular, we address the problem of data fusion—piecing together multiple datasets collected under heterogeneous conditions (i.e., different populations, regimes, and sampling methods) to obtain valid answers to queries of interest. The availability of multiple heterogeneous datasets presents new opportunities to big data analysts, because the knowledge that can be acquired from combined data would not be possible from any individual source alone. However, the biases that emerge in heterogeneous environments require new analytical tools. Some of these biases, including confounding, sampling selection, and cross-population biases, have been addressed in isolation, largely in restricted parametric models. We here present a general, nonparametric framework for handling these biases and, ultimately, a theoretical solution to the problem of data fusion in causal inference tasks. PMID:27382148
Variance based joint sparsity reconstruction of synthetic aperture radar data for speckle reduction
NASA Astrophysics Data System (ADS)
Scarnati, Theresa; Gelb, Anne
2018-04-01
In observing multiple synthetic aperture radar (SAR) images of the same scene, it is apparent that the brightness distributions of the images are not smooth, but rather composed of complicated granular patterns of bright and dark spots. Further, these brightness distributions vary from image to image. This salt-and-pepper-like feature of SAR images, called speckle, reduces the contrast in the images and negatively affects texture-based image analysis. This investigation uses the variance-based joint sparsity reconstruction method for forming SAR images from the multiple SAR images. In addition to reducing speckle, the method has the advantage of being non-parametric, and can therefore be used in a variety of autonomous applications. Numerical examples include reconstructions of both simulated phase history data that result in speckled images as well as images from the MSTAR T-72 database.
Traffic flow forecasting using approximate nearest neighbor nonparametric regression
DOT National Transportation Integrated Search
2000-12-01
The purpose of this research is to enhance nonparametric regression (NPR) for use in real-time systems by first reducing execution time using advanced data structures and imprecise computations and then developing a methodology for applying NPR. Due ...
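The core of nearest-neighbor nonparametric regression for short-term flow forecasting can be sketched as follows: represent the current condition as a vector of recent readings, find the k most similar historical states, and average their observed successors. The state length and k are assumptions here; the report's advanced data structures for fast approximate search are omitted.

```python
import numpy as np

def knn_forecast(history, state, k=5):
    """One-step nonparametric regression (k-NN) forecast: average the
    successors of the k historical windows closest to `state`."""
    history = np.asarray(history, dtype=float)
    state = np.asarray(state, dtype=float)
    m = state.size
    # Build (window of m readings, next value) pairs from the series.
    windows = np.lib.stride_tricks.sliding_window_view(history[:-1], m)
    targets = history[m:]
    d = np.linalg.norm(windows - state, axis=1)
    nearest = np.argsort(d)[:k]
    return targets[nearest].mean()
```

On a strictly periodic series the nearest neighbors match exactly, so the forecast reproduces the next value in the cycle; real traffic data would of course mix neighbors from many similar days.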
Evaluation of procedures for prediction of unconventional gas in the presence of geologic trends
Attanasi, E.D.; Coburn, T.C.
2009-01-01
This study extends the application of local spatial nonparametric prediction models to the estimation of recoverable gas volumes in continuous-type gas plays to regimes where there is a single geologic trend. A transformation is presented, originally proposed by Tomczak, that offsets the distortions caused by the trend. This article reports on numerical experiments that compare predictive and classification performance of the local nonparametric prediction models based on the transformation with models based on Euclidean distance. The transformation offers improvement in average root mean square error when the trend is not severely misspecified. Because of the local nature of the models, even those based on Euclidean distance in the presence of trends are reasonably robust. The tests based on other model performance metrics such as prediction error associated with the high-grade tracts and the ability of the models to identify sites with the largest gas volumes also demonstrate the robustness of both local modeling approaches. ?? International Association for Mathematical Geology 2009.
Illiquidity premium and expected stock returns in the UK: A new approach
NASA Astrophysics Data System (ADS)
Chen, Jiaqi; Sherif, Mohamed
2016-09-01
This study examines the relative importance of liquidity risk for the time-series and cross-section of stock returns in the UK. We propose a simple way to capture the multidimensionality of illiquidity. Our analysis indicates that existing illiquidity measures have considerable asset-specific components, which justifies our new approach. Further, we use an alternative test of the Amihud (2002) measure and parametric and non-parametric methods to investigate whether liquidity risk is priced in the UK. We find that the inclusion of the illiquidity factor in the capital asset pricing model plays a significant role in explaining the cross-sectional variation in stock returns, in particular with the Fama-French three-factor model. Further, using Hansen-Jagannathan non-parametric bounds, we find that the illiquidity-augmented capital asset pricing models yield a small distance error, whereas other non-liquidity-based models fail to yield economically plausible distance values. Our findings have important implications for managing the liquidity risk of equity portfolios.
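The Amihud (2002) measure referenced above is a simple ratio average: the mean of absolute daily return per unit of traded dollar volume. The sketch below omits the scaling constant (often 10^6) that is commonly applied for readability.

```python
import numpy as np

def amihud_illiq(returns, dollar_volume):
    """Amihud (2002) price-impact proxy: mean of |daily return| divided
    by daily dollar volume over the sample period."""
    r = np.abs(np.asarray(returns, dtype=float))
    v = np.asarray(dollar_volume, dtype=float)
    return np.mean(r / v)
```

Higher values indicate that small trading volumes move the price more, i.e. a less liquid stock.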
Nonparametric Determination of Redshift Evolution Index of Dark Energy
NASA Astrophysics Data System (ADS)
Ziaeepour, Houri
We propose a nonparametric method to determine the sign of γ — the redshift evolution index of dark energy. This is important for distinguishing between positive energy models, a cosmological constant, and what is generally called ghost models. Our method is based on geometrical properties and is more tolerant to uncertainties of other cosmological parameters than fitting methods in what concerns the sign of γ. The same parametrization can also be used for determining γ and its redshift dependence by fitting. We apply this method to SNLS supernovae and to gold sample of re-analyzed supernovae data from Riess et al. Both datasets show strong indication of a negative γ. If this result is confirmed by more extended and precise data, many of the dark energy models, including simple cosmological constant, standard quintessence models without interaction between quintessence scalar field(s) and matter, and scaling models are ruled out. We have also applied this method to Gurzadyan-Xue models with varying fundamental constants to demonstrate the possibility of using it to test other cosmologies.
Bayesian approach for three-dimensional aquifer characterization at the Hanford 300 Area
DOE Office of Scientific and Technical Information (OSTI.GOV)
Murakami, Haruko; Chen, X.; Hahn, Melanie S.
2010-10-21
This study presents a stochastic, three-dimensional characterization of a heterogeneous hydraulic conductivity field within DOE's Hanford 300 Area site, Washington, by assimilating large-scale, constant-rate injection test data with small-scale, three-dimensional electromagnetic borehole flowmeter (EBF) measurement data. We first inverted the injection test data to estimate the transmissivity field, using zeroth-order temporal moments of pressure buildup curves. We applied a newly developed Bayesian geostatistical inversion framework, the method of anchored distributions (MAD), to obtain a joint posterior distribution of geostatistical parameters and local log-transmissivities at multiple locations. The unique aspects of MAD that make it suitable for this purpose are its ability to integrate multi-scale, multi-type data within a Bayesian framework and to compute a nonparametric posterior distribution. After we combined the distribution of transmissivities with the depth-discrete relative-conductivity profile from EBF data, we inferred the three-dimensional geostatistical parameters of the log-conductivity field, using Bayesian model-based geostatistics. Such consistent use of the Bayesian approach throughout the procedure enabled us to systematically incorporate data uncertainty into the final posterior distribution. The method was tested in a synthetic study and validated using actual data that were not part of the estimation. Results showed broader and skewed posterior distributions of geostatistical parameters except for the mean, which suggests the importance of inferring the entire distribution to quantify the parameter uncertainty.
Guo, Yanrong; Gao, Yaozong; Shao, Yeqin; Price, True; Oto, Aytekin; Shen, Dinggang
2014-01-01
Purpose: Automatic prostate segmentation from MR images is an important task in various clinical applications such as prostate cancer staging and MR-guided radiotherapy planning. However, the large appearance and shape variations of the prostate in MR images make the segmentation problem difficult to solve. Traditional Active Shape/Appearance Model (ASM/AAM) has limited accuracy on this problem, since its basic assumption, i.e., both shape and appearance of the targeted organ follow Gaussian distributions, is invalid in prostate MR images. To this end, the authors propose a sparse dictionary learning method to model the image appearance in a nonparametric fashion and further integrate the appearance model into a deformable segmentation framework for prostate MR segmentation. Methods: To drive the deformable model for prostate segmentation, the authors propose nonparametric appearance and shape models. The nonparametric appearance model is based on a novel dictionary learning method, namely distributed discriminative dictionary (DDD) learning, which is able to capture fine distinctions in image appearance. To increase the differential power of traditional dictionary-based classification methods, the authors' DDD learning approach takes three strategies. First, two dictionaries for prostate and nonprostate tissues are built, respectively, using the discriminative features obtained from minimum redundancy maximum relevance feature selection. Second, linear discriminant analysis is employed as a linear classifier to boost the optimal separation between prostate and nonprostate tissues, based on the representation residuals from sparse representation. Third, to enhance the robustness of the authors' classification method, multiple local dictionaries are learned for local regions along the prostate boundary (each with small appearance variations), instead of learning one global classifier for the entire prostate. 
These discriminative dictionaries are located on different patches of the prostate surface and trained to adaptively capture the appearance in different prostate zones, thus achieving better local tissue differentiation. For each local region, multiple classifiers are trained based on randomly selected samples and finally assembled by a specific fusion method. In addition to this nonparametric appearance model, a prostate shape model is learned from the shape statistics using a novel approach, sparse shape composition, which can model non-Gaussian distributions of shape variation and regularize the 3D mesh deformation by constraining it within the observed shape subspace. Results: The proposed method has been evaluated on two datasets consisting of T2-weighted MR prostate images. For the first (internal) dataset, the classification effectiveness of the authors' improved dictionary learning has been validated by comparing it with three other variants of traditional dictionary learning methods. The experimental results show that the authors' method yields a Dice Ratio of 89.1% compared to the manual segmentation, which is more accurate than the three state-of-the-art MR prostate segmentation methods under comparison. For the second dataset, the MICCAI 2012 challenge dataset, the authors' proposed method yields a Dice Ratio of 87.4%, which also achieves better segmentation accuracy than other methods under comparison. Conclusions: A new magnetic resonance image prostate segmentation method is proposed based on the combination of deformable model and dictionary learning methods, which achieves more accurate segmentation performance on prostate T2 MR images. PMID:24989402
Comparison of serum concentrations of Se, Pb, Mg, Cu, and Zn between MS patients and healthy controls
Alizadeh, Anahita; Mehrpour, Omid; Nikkhah, Karim; Bayat, Golnaz; Espandani, Mahsa; Golzari, Alireza; Jarahi, Lida; Foroughipour, Mohsen
2016-01-01
Introduction Multiple Sclerosis (MS) is a common inflammatory autoimmune disorder whose exact etiology is unclear. There is some evidence for a role of environmental factors in genetically susceptible individuals. The aim of this study was to evaluate the possible role of the metals selenium, zinc, copper, lead, and magnesium in Multiple Sclerosis patients. Methods In the present analytical cross-sectional study, 56 individuals, comprising 26 patients and 30 healthy controls, were enrolled. The serum levels of Se, Zn, Cu, and Pb were quantified under graphite-furnace and flame conditions using a Perkin Elmer 3030 atomic absorption spectrophotometer. Serum Mg was measured with a 1500 BT auto analyzer. The mean serum levels of the minerals (Zn, Pb, Cu, Mg, Se) were compared between cases and controls using the independent-samples t-test for normally distributed variables and the Mann-Whitney U test as a nonparametric alternative. All statistical analyses were carried out using SPSS 11.0. Results As with Zn, Cu, and Se, there was no significant difference in Pb concentrations between MS patients and healthy individuals (p = 0.11, 0.14, 0.32, and 0.20, respectively), but the level of Mg differed significantly (p = 0.001). Conclusion Serum concentrations of Zn, Pb, Se, and Cu were within normal ranges in both groups, and there was no difference between MS patients and the matched healthy group. Blood levels of Mg were significantly lower in MS patients, although it should be noted that even this lower serum magnesium level remains within the normal range. PMID:27757186
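The decision rule described in the Methods, a t-test when normality holds and the Mann-Whitney U test otherwise, rests on the rank-based U statistic. A pure-Python sketch of U (illustrative only, not the SPSS routine used in the study):

```python
def midranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over the run of values tied with values[order[i]]
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def mann_whitney_u(a, b):
    """Mann-Whitney U statistic (smaller of U1, U2)."""
    r = midranks(list(a) + list(b))
    r1 = sum(r[:len(a)])                      # rank sum of sample a
    u1 = r1 - len(a) * (len(a) + 1) / 2
    u2 = len(a) * len(b) - u1
    return min(u1, u2)
```

A small U relative to its null distribution indicates separation of the two samples; statistical packages convert U to a p-value via the exact or normal-approximation null distribution.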
Nonparametric Transfer Function Models
Liu, Jun M.; Chen, Rong; Yao, Qiwei
2009-01-01
In this paper a class of nonparametric transfer function models is proposed to model nonlinear relationships between ‘input’ and ‘output’ time series. The transfer function is smooth with unknown functional forms, and the noise is assumed to be a stationary autoregressive-moving average (ARMA) process. The nonparametric transfer function is estimated jointly with the ARMA parameters. By modeling the correlation in the noise, the transfer function can be estimated more efficiently. The parsimonious ARMA structure improves the estimation efficiency in finite samples. The asymptotic properties of the estimators are investigated. The finite-sample properties are illustrated through simulations and one empirical example. PMID:20628584
Sengupta Chattopadhyay, Amrita; Hsiao, Ching-Lin; Chang, Chien Ching; Lian, Ie-Bin; Fann, Cathy S J
2014-01-01
Identifying susceptibility genes that influence complex diseases is extremely difficult because loci often influence the disease state through genetic interactions. Numerous approaches to detect disease-associated SNP-SNP interactions have been developed, but none consistently generates high-quality results under different disease scenarios. Using summarizing techniques to combine a number of existing methods may provide a solution to this problem. Here we used three popular non-parametric methods, Gini, absolute probability difference (APD), and entropy, to develop two novel summary scores, namely the principal component score (PCS) and Z-sum score (ZSS), with which to predict disease-associated genetic interactions. We used a simulation study to compare the performance of the non-parametric scores, the summary scores, the scaled-sum score (SSS; used in polymorphism interaction analysis (PIA)), and multifactor dimensionality reduction (MDR). The non-parametric methods achieved high power, but no non-parametric method outperformed all others under a variety of epistatic scenarios. PCS and ZSS, however, outperformed MDR. PCS, ZSS and SSS displayed controlled type-I errors (<0.05), whereas the individual Gini, APD, and entropy scores (GS, APDS, ES) did not (>0.05). A real data study using the genetic-analysis-workshop 16 (GAW 16) rheumatoid arthritis dataset identified a number of interesting SNP-SNP interactions. © 2013 Elsevier B.V. All rights reserved.
Quintela-del-Río, Alejandro; Francisco-Fernández, Mario
2011-02-01
The study of extreme values and prediction of ozone data is an important topic of research when dealing with environmental problems. Classical extreme value theory is usually used in air-pollution studies. It consists in fitting a parametric generalised extreme value (GEV) distribution to a data set of extreme values, and using the estimated distribution to compute return levels and other quantities of interest. Here, we propose to estimate these values using nonparametric functional data methods. Functional data analysis is a relatively new statistical methodology that generally deals with data consisting of curves or multi-dimensional variables. In this paper, we use this technique, jointly with nonparametric curve estimation, to provide alternatives to the usual parametric statistical tools. The nonparametric estimators are applied to real samples of maximum ozone values obtained from several monitoring stations belonging to the Automatic Urban and Rural Network (AURN) in the UK. The results show that nonparametric estimators work satisfactorily, outperforming the behaviour of classical parametric estimators. Functional data analysis is also used to predict stratospheric ozone concentrations. We show an application, using the data set of mean monthly ozone concentrations in Arosa, Switzerland, and the results are compared with those obtained by classical time series (ARIMA) analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
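In contrast to fitting a parametric GEV distribution, a fully nonparametric return-level estimate can be read directly off the empirical quantiles of the block maxima. A toy sketch with invented data (a generic empirical-quantile estimator, not the functional-data method of the paper):

```python
def empirical_return_level(maxima, return_period):
    """Nonparametric return level: the empirical (1 - 1/T) quantile of
    the sample of block maxima, by linear interpolation between order
    statistics ("linear" plotting position)."""
    if return_period <= 1:
        raise ValueError("return period must exceed 1")
    xs = sorted(maxima)
    p = 1.0 - 1.0 / return_period
    h = p * (len(xs) - 1)          # fractional position on sorted sample
    lo = int(h)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])
```

The parametric route would instead fit GEV parameters and invert its quantile function; the empirical estimator needs no distributional assumption but cannot extrapolate beyond the largest observed maximum.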
A semi-nonparametric Poisson regression model for analyzing motor vehicle crash data.
Ye, Xin; Wang, Ke; Zou, Yajie; Lord, Dominique
2018-01-01
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highways is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity levels. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can closely mimic a large family of distributions, including normal, log-gamma, bimodal, and trimodal distributions. Empirical estimation results show that such flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in empirical models are compared, the SNP and NB models are found to have substantially different coefficients for the dummy variable indicating lane width. The SNP model, with better statistical performance, suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
Borgquist, Ola; Wise, Matt P; Nielsen, Niklas; Al-Subaie, Nawaf; Cranshaw, Julius; Cronberg, Tobias; Glover, Guy; Hassager, Christian; Kjaergaard, Jesper; Kuiper, Michael; Smid, Ondrej; Walden, Andrew; Friberg, Hans
2017-08-01
Dysglycemia and glycemic variability are associated with poor outcomes in critically ill patients. Targeted temperature management alters blood glucose homeostasis. We investigated the association between blood glucose concentrations and glycemic variability and the neurologic outcomes of patients randomized to targeted temperature management at 33°C or 36°C after cardiac arrest. Post hoc analysis of the multicenter TTM-trial. Primary outcome of this analysis was neurologic outcome after 6 months, referred to as "Cerebral Performance Category." Thirty-six sites in Europe and Australia. All 939 patients with out-of-hospital cardiac arrest of presumed cardiac cause that had been included in the TTM-trial. Targeted temperature management at 33°C or 36°C. Nonparametric tests as well as multiple logistic regression and mixed effects logistic regression models were used. Median glucose concentrations on hospital admission differed significantly between Cerebral Performance Category outcomes (p < 0.0001). Hyper- and hypoglycemia were associated with poor neurologic outcome (p = 0.001 and p = 0.054). In the multiple logistic regression models, the median glycemic level was an independent predictor of poor Cerebral Performance Category (Cerebral Performance Category, 3-5) with an odds ratio (OR) of 1.13 in the adjusted model (p = 0.008; 95% CI, 1.03-1.24). It was also a predictor in the mixed model, which served as a sensitivity analysis to adjust for the multiple time points. The proportion of hyperglycemia was higher in the 33°C group compared with the 36°C group. Higher blood glucose levels at admission and during the first 36 hours, and higher glycemic variability, were associated with poor neurologic outcome and death. More patients in the 33°C treatment arm had hyperglycemia.
Shelley, Casey L; Berry, Stepheny; Howard, James; De Ruyter, Martin; Thepthepha, Melissa; Nazir, Niaman; McDonald, Tracy; Dalton, Annemarie; Moncure, Michael
2016-09-01
Rib fractures are common in trauma admissions and are associated with an increased risk of pulmonary complications, intensive care unit admissions, and mortality. Providing adequate pain control in patients with multiple rib fractures decreases the risk of adverse events. Thoracic epidural analgesia is currently the preferred method for pain control. This study compared outcomes in patients with multiple acute rib fractures treated with posterior paramedian subrhomboidal (PoPS) analgesia versus thoracic epidural analgesia (TEA). This prospective study included 30 patients with three or more acute rib fractures admitted to a Level I trauma center. Thoracic epidural analgesia or PoPS catheters were placed, and local anesthesia was infused. Data were collected including patients' pain level, adjunct morphine equivalent use, adverse events, length of stay, lung volumes, and discharge disposition. Nonparametric tests were used and two-sided p < 0.05 were considered statistically significant. Nineteen (63%) of 30 patients received TEA and 11 (37%) of 30 patients received PoPS. Pain rating was lower in the PoPS group (2.5 vs. 5; p = 0.03) after initial placement. Overall, there was no other statistically significant difference in pain control or use of oral morphine adjuncts between the groups. Hypotension occurred in eight patients, 75% with TEA and only 25% with PoPS. No difference was found in adverse events, length of stay, lung volumes, or discharge disposition. In patients with rib fractures, PoPS analgesia may provide pain control equivalent to TEA while being less invasive and more readily placed by a variety of hospital staff. This pilot study is limited by its small sample size, and therefore additional studies are needed to prove equivalence of PoPS compared to TEA. Therapeutic study, level IV.
Buzayan, Muaiyed; Baig, Mirza Rustum; Yunus, Norsiah
2013-01-01
This in vitro study evaluated the accuracy of multiple-unit dental implant casts obtained from splinted or nonsplinted direct impression techniques using various splinting materials by comparing the casts to the reference models. The effect of two different impression materials on the accuracy of the implant casts was also evaluated for abutment-level impressions. A reference model with six internal-connection implant replicas placed in the completely edentulous mandibular arch and connected to multi-base abutments was fabricated from heat-curing acrylic resin. Forty impressions of the reference model were made, 20 each with polyether (PE) and polyvinylsiloxane (PVS) impression materials using the open tray technique. The PE and PVS groups were further subdivided into four subgroups of five each on the basis of splinting type: no splinting, bite registration PE, bite registration addition silicone, or autopolymerizing acrylic resin. The positional accuracy of the implant replica heads was measured on the poured casts using a coordinate measuring machine to assess linear differences in interimplant distances in all three axes. The collected data (linear and three-dimensional [3D] displacement values) were compared with the measurements calculated on the reference resin model and analyzed with nonparametric tests (Kruskal-Wallis and Mann-Whitney). No significant differences were found between the various splinting groups for both PE and PVS impression materials in terms of linear and 3D distortions. However, small but significant differences were found between the two impression materials (PVS, 91 μm; PE, 103 μm) in terms of 3D discrepancies, irrespective of the splinting technique employed. Casts obtained from both impression materials exhibited differences from the reference model. The impression material influenced impression inaccuracy more than the splinting material for multiple-unit abutment-level impressions.
Jang, Si Young; Lalonde, Ron; Ozhasoglu, Cihat; Burton, Steven; Heron, Dwight; Huq, M Saiful
2016-09-08
We performed an evaluation of the CyberKnife InCise MLC by comparing plan qualities for single and multiple brain lesions generated using the first version of InCise MLC, fixed cone, and Iris collimators. We also investigated differences in delivery efficiency among the three collimators. Twenty-four patients with single or multiple brain mets treated previously in our clinic on a CyberKnife M6 using cone/Iris collimators were selected for this study. Treatment plans were generated for all lesions using the InCise MLC. Number of monitor units, delivery time, target coverage, conformity index, and dose falloff were compared between MLC- and clinical cone/Iris-based plans. Statistical analysis was performed using the non-parametric Wilcoxon-Mann-Whitney signed-rank test. The planning accuracy of the MLC-based plans was validated using chamber and film measurements. The InCise MLC-based plans achieved mean dose and target coverage comparable to the cone/Iris-based plans. Although the conformity indices of the MLC-based plans were slightly higher than those of the cone/Iris-based plans, beam delivery time for the MLC-based plans was shorter by 30% ~ 40%. For smaller targets or cases with OARs located close to or abutting target volumes, MLC-based plans provided inferior dose conformity compared to cone/Iris-based plans. The QA results of MLC-based plans were within 5% absolute dose difference with over 90% gamma passing rate using 2%/2 mm gamma criteria. The first version of InCise MLC could be a useful delivery modality, especially for clinical situations for which delivery time is a limiting factor or for multitarget cases. © 2016 The Authors.
NASA Astrophysics Data System (ADS)
Thomas, M. A.
2016-12-01
The Waste Isolation Pilot Plant (WIPP) is the only deep geological repository for transuranic waste in the United States. As the Science Advisor for the WIPP, Sandia National Laboratories annually evaluates site data against trigger values (TVs), metrics whose violation is indicative of conditions that may impact long-term repository performance. This study focuses on a groundwater-quality dataset used to redesign a TV for the Culebra Dolomite Member (Culebra) of the Permian-age Rustler Formation. Prior to this study, a TV violation occurred if the concentration of a major ion fell outside a range defined as the mean +/- two standard deviations. The ranges were thought to denote conditions that 95% of future values would fall within. Groundwater-quality data used in evaluating compliance, however, are rarely normally distributed. To create a more robust Culebra groundwater-quality TV, this study employed the randomization test, a non-parametric permutation method. Recent groundwater compositions considered TV violations under the original ion concentration ranges are now interpreted as false positives in light of the insignificant p-values calculated with the randomization test. This work highlights that the normality assumption can weaken as the size of a groundwater-quality dataset grows over time. Non-parametric permutation methods are an attractive option because no assumption about the statistical distribution is required and calculating all combinations of the data is an increasingly tractable problem with modern workstations. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. This research is funded by WIPP programs administered by the Office of Environmental Management (EM) of the U.S. Department of Energy. SAND2016-7306A
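The randomization test applied to the Culebra data works by relabelling the pooled observations; for small samples the permutation distribution can be enumerated exactly. A minimal sketch (a generic two-sample difference-in-means test, not the WIPP implementation):

```python
from itertools import combinations

def randomization_test(a, b):
    """Exact two-sample randomization test for the difference in means.

    Enumerates every relabelling of the pooled data and returns the
    two-sided p-value: the share of relabellings whose |mean(a*) - mean(b*)|
    is at least the observed difference.  Feasible only for small samples;
    larger samples use Monte Carlo sampling of relabellings instead.
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    total = sum(pooled)
    count = hits = 0
    for idx in combinations(range(len(pooled)), n_a):
        s_a = sum(pooled[i] for i in idx)
        diff = abs(s_a / n_a - (total - s_a) / len(b))
        count += 1
        if diff >= observed - 1e-12:   # tolerance guards float round-off
            hits += 1
    return hits / count
```

No distributional assumption is made: the null hypothesis is simply that the group labels are exchangeable, which is exactly the robustness property motivating the trigger-value redesign above.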
Chen, Gang; Taylor, Paul A.; Shin, Yong-Wook; Reynolds, Richard C.; Cox, Robert W.
2016-01-01
It has been argued that naturalistic conditions in FMRI studies provide a useful paradigm for investigating perception and cognition through a synchronization measure, inter-subject correlation (ISC). However, one analytical stumbling block has been the fact that the ISC values associated with each single subject are not independent, and our previous paper (Chen et al., 2016) used simulations and analyses of real data to show that the methodologies adopted in the literature do not have the proper control for false positives. In the same paper, we proposed nonparametric subject-wise bootstrapping and permutation testing techniques for one and two groups, respectively, which account for the correlation structure, and these greatly outperformed the prior methods in controlling the false positive rate (FPR); that is, subject-wise bootstrapping (SWB) worked relatively well for both cases with one and two groups, and subject-wise permutation (SWP) testing was virtually ideal for group comparisons. Here we seek to explicate and adopt a parametric approach through linear mixed-effects (LME) modeling for studying the ISC values, building on the previous correlation framework, with the benefit that the LME platform offers wider adaptability, more powerful interpretations, and quality control checking capability than nonparametric methods. We describe both theoretical and practical issues involved in the modeling and the manner in which LME with crossed random effects (CRE) modeling is applied. A data-doubling step further allows us to conveniently track the subject index, and achieve easy implementations. We pit the LME approach against the best nonparametric methods, and find that the LME framework achieves proper control for false positives. The new LME methodologies are shown to be both efficient and robust, and they will be added as an additional option and settings in an existing open source program, 3dLME, in AFNI (http://afni.nimh.nih.gov). PMID:27751943
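The subject-wise bootstrap (SWB) idea, resampling whole subjects rather than individual ISC pairs so that the correlation structure among pairs is respected, can be sketched as follows. This is a simplified illustration (self-pairs arising from duplicated subjects are simply skipped), not the procedure of Chen et al. or the 3dLME implementation:

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_isc(data, idx=None):
    """Mean pairwise correlation over subjects (rows of data).

    idx allows resampled subject indices; pairs drawing the same
    subject twice are skipped to avoid degenerate r = 1 self-pairs."""
    idx = list(range(len(data))) if idx is None else idx
    rs = [pearson(data[idx[i]], data[idx[j]])
          for i in range(len(idx)) for j in range(i + 1, len(idx))
          if idx[i] != idx[j]]
    return sum(rs) / len(rs)

def subject_wise_bootstrap_ci(data, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean ISC, resampling subjects."""
    rng = random.Random(seed)
    n = len(data)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        if len(set(idx)) < 2:          # need at least one cross-subject pair
            continue
        stats.append(mean_isc(data, idx))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```

Resampling at the subject level is what gives the method its false-positive control: pairwise ISC values sharing a subject are dependent, and pair-level resampling would ignore that dependence.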
Eslamian, L; Gholami, H; Mortazavi, S A R; Soheilifar, S
2016-11-01
To compare the effectiveness of 5% benzocaine gel and a placebo gel in reducing pain caused by fixed orthodontic appliance activation. Thirty subjects (15-25 years of age) undergoing fixed orthodontic treatment. A randomized, double-blind, placebo-controlled, cross-over clinical trial was conducted. Subjects were asked to apply a placebo gel and 5% benzocaine gel, exchanged between two consecutive appointments, twice a day for 3 days and to mark their level of pain on a visual analogue scale (VAS). Pain severity was evaluated by means of the Mann-Whitney U-test for comparison of the two gel groups, the Kruskal-Wallis nonparametric test for overall differences, and Dunnett's post hoc test for paired multiple comparisons. Significance was set at p < 0.05. The overall mean pain intensity for the benzocaine and placebo gels was 0.89 and 1.15, respectively. The Mann-Whitney U-test indicated no significant difference in overall pain between the two groups (mean difference = 0.258, p < 0.21). For both groups, pain intensity was significantly lower at 2, 6 and 24 h compared with the pain experienced at days 2, 3 and 7. Benzocaine gel decreased pain perception at 2 h compared with the placebo gel. Peak pain intensity occurred at 2 h for the placebo gel and at 6 h for the benzocaine gel, followed by a decline in pain perception from that point to day 7 for both gels. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
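The Kruskal-Wallis omnibus statistic used above is computed from pooled midranks. A pure-Python sketch (no tie correction, illustrative rather than the clinical software's implementation):

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic from pooled midranks (no tie correction).

    H = 12 / (N (N + 1)) * sum_i R_i^2 / n_i - 3 (N + 1),
    where R_i is the rank sum of group i over the pooled sample.
    """
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    order = sorted(range(n), key=lambda i: pooled[i])
    rank = [0.0] * n
    i = 0
    while i < n:
        j = i
        # tied run shares the average of its ranks
        while j + 1 < n and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            rank[order[k]] = avg
        i = j + 1
    h, start = 0.0, 0
    for g in groups:
        r_sum = sum(rank[start:start + len(g)])
        h += r_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
```

Under the null of identical group distributions, H is approximately chi-squared with (number of groups - 1) degrees of freedom; a significant H is then followed by pairwise post hoc comparisons, as in the study design above.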
Verification of forecast ensembles in complex terrain including observation uncertainty
NASA Astrophysics Data System (ADS)
Dorninger, Manfred; Kloiber, Simon
2017-04-01
Traditionally, verification means comparing a forecast (ensemble) against the truth as represented by observations. Observation errors are quite often neglected, on the argument that they are small compared to the forecast error. In this study, part of the MesoVICT (Mesoscale Verification Inter-comparison over Complex Terrain) project, it is shown that observation errors have to be taken into account for verification purposes. The observation uncertainty is estimated from the VERA (Vienna Enhanced Resolution Analysis) and represented via two analysis ensembles, which are compared to the forecast ensemble. Throughout the study, results from COSMO-LEPS provided by Arpae-SIMC Emilia-Romagna are used as the forecast ensemble. The time period covers the MesoVICT core case from 20-22 June 2007. In a first step, all ensembles are examined with respect to their distribution. Several tests were executed (Kolmogorov-Smirnov test, Finkelstein-Schafer test, chi-square test, etc.), none of which identified an exact mathematical distribution. The main focus is therefore on non-parametric statistics (e.g., kernel density estimation, boxplots) and on the deviation between forced normally distributed data and the kernel density estimates. In a next step, the observational deviations due to the analysis ensembles are analysed. In a first approach, scores are calculated multiple times, with each member of the analysis ensemble in turn regarded as the "true" observation. The results are presented as boxplots for the different scores and parameters. Additionally, the bootstrapping method is applied to the ensembles. These possible approaches to incorporating observational uncertainty into the computation of statistics will be discussed in the talk.
Heilmann, Romy M; Grellet, Aurélien; Grützner, Niels; Cranford, Shannon M; Suchodolski, Jan S; Chastant-Maillard, Sylvie; Steiner, Jörg M
2018-04-17
Previous data suggest that fecal S100A12 has clinical utility as a biomarker of chronic gastrointestinal inflammation (idiopathic inflammatory bowel disease) in both people and dogs, but the effect of gastrointestinal pathogens on fecal S100A12 concentrations is largely unknown. The role of S100A12 in parasite and viral infections is also difficult to study in traditional animal models due to the lack of S100A12 expression in rodents. Thus, the aim of this study was to evaluate fecal S100A12 concentrations in a cohort of puppies with intestinal parasites (Cystoisospora spp., Toxocara canis, Giardia sp.) and viral agents that are frequently encountered and known to cause gastrointestinal signs in dogs (coronavirus, parvovirus) as a comparative model. Spot fecal samples were collected from 307 puppies [median age (range): 7 (4-13) weeks; 29 different breeds] in French breeding kennels, and fecal scores (semiquantitative system; scores 1-13) were assigned. Fecal samples were tested for Cystoisospora spp. (C. canis and C. ohioensis), Toxocara canis, Giardia sp., as well as canine coronavirus (CCV) and parvovirus (CPV). S100A12 concentrations were measured in all fecal samples using an in-house radioimmunoassay. Statistical analyses were performed using non-parametric 2-group or multiple-group comparisons, non-parametric correlation analysis, association testing between nominal variables, and construction of a multivariate mixed model. Fecal S100A12 concentrations ranged from < 24-14,363 ng/g. Univariate analysis only showed increased fecal S100A12 concentrations in dogs shedding Cystoisospora spp. (P = 0.0384) and in dogs infected with parvovirus (P = 0.0277), whereas dogs infected with coronavirus had decreased fecal S100A12 concentrations (P = 0.0345). However, shedding of any single enteropathogen did not affect fecal S100A12 concentrations in multivariate analysis (all P > 0.05) in this study. 
Only fecal score and breed size had an effect on fecal S100A12 concentrations in multivariate analysis (P < 0.0001). An infection with any single enteropathogen tested in this study is unlikely to alter fecal S100A12 concentrations, and these preliminary data are important for further studies evaluating fecal S100A12 concentrations in dogs or when using fecal S100A12 concentrations as a biomarker in patients with chronic idiopathic gastrointestinal inflammation.
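The non-parametric multiple-group comparisons used in studies like the one above are typically rank-based procedures such as the Kruskal-Wallis test discussed elsewhere in this collection. A minimal sketch of the H statistic, assuming continuous data without ties (the function name and synthetic data are ours, not from the study):

```python
import numpy as np

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic for k independent samples.

    Assumes continuous data without ties. Under the null hypothesis,
    H is approximately chi-squared with k - 1 degrees of freedom.
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    pooled = np.concatenate(groups)
    n = pooled.size
    ranks = np.empty(n)
    ranks[np.argsort(pooled)] = np.arange(1, n + 1)  # ranks of the pooled data
    h, start = 0.0, 0
    for g in groups:
        mean_rank = ranks[start:start + g.size].mean()
        h += g.size * (mean_rank - (n + 1) / 2) ** 2
        start += g.size
    return 12 * h / (n * (n + 1))

rng = np.random.default_rng(8)
a, b = rng.normal(0, 1, 40), rng.normal(0, 1, 40)
c = rng.normal(1.2, 1, 40)                # one group shifted upward
h = kruskal_wallis_h(a, b, c)
print(h)  # well above the chi-squared(2) 5% cutoff of about 5.99
```

In practice a vetted library routine (e.g. scipy.stats.kruskal) with tie correction would normally be preferred; the sketch only shows the mechanics.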
Comparison of bootstrap approaches for estimation of uncertainties of DTI parameters.
Chung, SungWon; Lu, Ying; Henry, Roland G
2006-11-01
Bootstrap is an empirical non-parametric statistical technique based on data resampling that has been used to quantify uncertainties of diffusion tensor MRI (DTI) parameters, useful in tractography and in assessing DTI methods. The current bootstrap method (repetition bootstrap) used for DTI analysis performs resampling within the data sharing common diffusion gradients, requiring multiple acquisitions for each diffusion gradient. Recently, wild bootstrap was proposed that can be applied without multiple acquisitions. In this paper, two new approaches are introduced called residual bootstrap and repetition bootknife. We show that repetition bootknife corrects for the large bias present in the repetition bootstrap method and, therefore, better estimates the standard errors. Like wild bootstrap, residual bootstrap is applicable to single acquisition scheme, and both are based on regression residuals (called model-based resampling). Residual bootstrap is based on the assumption that non-constant variance of measured diffusion-attenuated signals can be modeled, which is actually the assumption behind the widely used weighted least squares solution of diffusion tensor. The performances of these bootstrap approaches were compared in terms of bias, variance, and overall error of bootstrap-estimated standard error by Monte Carlo simulation. We demonstrate that residual bootstrap has smaller biases and overall errors, which enables estimation of uncertainties with higher accuracy. Understanding the properties of these bootstrap procedures will help us to choose the optimal approach for estimating uncertainties that can benefit hypothesis testing based on DTI parameters, probabilistic fiber tracking, and optimizing DTI methods.
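The model-based (residual) resampling described above can be illustrated on an ordinary linear least-squares fit. This is a generic sketch of the residual-bootstrap idea, not the authors' weighted tensor estimation; the synthetic data and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear model y = 2 + 3x + noise (a stand-in for the signal model)
x = np.linspace(0, 1, 50)
y = 2 + 3 * x + rng.normal(0, 0.2, x.size)
X = np.column_stack([np.ones_like(x), x])

# Initial least-squares fit and its residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat

# Residual bootstrap: resample residuals, rebuild pseudo-data, refit
boot_betas = []
for _ in range(1000):
    y_star = X @ beta_hat + rng.choice(residuals, size=residuals.size, replace=True)
    b, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    boot_betas.append(b)

# Bootstrap standard errors of (intercept, slope)
se = np.std(boot_betas, axis=0, ddof=1)
print(se)
```

Heteroscedastic noise, as in diffusion-attenuated signals, would require the residuals to be standardized by the modeled variance before resampling.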
Thirty Years of Nonparametric Item Response Theory.
ERIC Educational Resources Information Center
Molenaar, Ivo W.
2001-01-01
Discusses relationships between a mathematical measurement model and its real-world applications. Makes a distinction between large-scale data matrices commonly found in educational measurement and smaller matrices found in attitude and personality measurement. Also evaluates nonparametric methods for estimating item response functions and…
2012-01-01
Background Mokken scaling techniques are a useful tool for researchers who wish to construct unidimensional tests or use questionnaires that comprise multiple binary or polytomous items. The stochastic cumulative scaling model offered by this approach is ideally suited when the intention is to score an underlying latent trait by simple addition of the item response values. In our experience, the Mokken model appears to be less well-known than for example the (related) Rasch model, but is seeing increasing use in contemporary clinical research and public health. Mokken's method is a generalisation of Guttman scaling that can assist in the determination of the dimensionality of tests or scales, and enables consideration of reliability, without reliance on Cronbach's alpha. This paper provides a practical guide to the application and interpretation of this non-parametric item response theory method in empirical research with health and well-being questionnaires. Methods Scalability of data from 1) a cross-sectional health survey (the Scottish Health Education Population Survey) and 2) a general population birth cohort study (the National Child Development Study) illustrate the method and modeling steps for dichotomous and polytomous items respectively. The questionnaire data analyzed comprise responses to the 12 item General Health Questionnaire, under the binary recoding recommended for screening applications, and the ordinal/polytomous responses to the Warwick-Edinburgh Mental Well-being Scale. Results and conclusions After an initial analysis example in which we select items by phrasing (six positive versus six negatively worded items) we show that all items from the 12-item General Health Questionnaire (GHQ-12) – when binary scored – were scalable according to the double monotonicity model, in two short scales comprising six items each (Bech’s “well-being” and “distress” clinical scales). 
An illustration of ordinal item analysis confirmed that all 14 positively worded items of the Warwick-Edinburgh Mental Well-being Scale (WEMWBS) met criteria for the monotone homogeneity model but four items violated double monotonicity with respect to a single underlying dimension. Software availability and commands used to specify unidimensionality and reliability analysis and graphical displays for diagnosing monotone homogeneity and double monotonicity are discussed, with an emphasis on current implementations in freeware. PMID:22686586
Stochl, Jan; Jones, Peter B; Croudace, Tim J
2012-06-11
1987-09-01
…long been recognized as powerful nonparametric statistical methods since the introduction of the principal ideas by R.A. Fisher in 1935. Even when…
Nonparametric Bayesian Context Learning for Buried Threat Detection
2012-01-01
…scans of field-collected GPR data collected on dirt and gravel lanes at an Eastern US test site…partition the data into M components; they differ in the information used to partition the data. First, approaches that assume independence of observations…and the advantages and disadvantages of using each are discussed. Furthermore, the merits of incorporating spatial information are also highlighted
A new approach to evaluate gamma-ray measurements
NASA Technical Reports Server (NTRS)
Dejager, O. C.; Swanepoel, J. W. H.; Raubenheimer, B. C.; Vandervalt, D. J.
1985-01-01
Misunderstandings about the term "random sample" and its implications may easily arise. Conditions under which the phases, obtained from arrival times, do not form a random sample, and the dangers involved, are discussed. Watson's U² test for uniformity is recommended for light curves with duty cycles larger than 10%. Under certain conditions, non-parametric density estimation may be used to determine estimates of the true light curve and its parameters.
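The Watson's U² statistic recommended above has a simple closed form for phases folded into [0, 1); a minimal sketch (critical values must still come from published tables, e.g. roughly 0.187 at the 5% level for large n):

```python
import numpy as np

def watson_u2(phases):
    """Watson's U^2 test statistic for uniformity of phases in [0, 1)."""
    u = np.sort(np.asarray(phases, dtype=float))
    n = u.size
    i = np.arange(1, n + 1)
    u_bar = u.mean()
    return (np.sum((u - (2 * i - 1) / (2 * n)) ** 2)
            - n * (u_bar - 0.5) ** 2 + 1.0 / (12 * n))

rng = np.random.default_rng(1)
u2_uniform = watson_u2(rng.uniform(0, 1, 200))                      # no pulsation
u2_clustered = watson_u2((0.5 + 0.05 * rng.normal(size=200)) % 1)   # peaked light curve
print(u2_uniform, u2_clustered)  # clustered phases give a much larger statistic
```

Unlike Kuiper's or the Rayleigh test, U² is invariant to the choice of phase zero point, which is why it suits folded light curves.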
Nursing Students’ Smoking Behaviors and Smoking-Related Self-Concept
2003-04-01
The purposes of this pilot study were to describe: (a) the relationships between baccalaureate (BSN) nursing students' smoking-related current self…instruments used to describe nursing students' self-concept, including current smoking-related self-concept and possible selves. A schema model of smoking…collected to gather data on demographics, smoking history, and current self and possible future selves. Nonparametric tests were used to describe group
ERIC Educational Resources Information Center
Cela-Ranilla, Jose María; Esteve-Gonzalez, Vanessa; Esteve-Mon, Francesc; Gisbert-Cervera, Merce
2014-01-01
In this study we analyze how 57 Spanish university students of Education developed a learning process in a virtual world by conducting activities that involved the skill of self-management. The learning experience comprised a serious game designed in a 3D simulation environment. Descriptive statistics and non-parametric tests were used in the…
NASA Astrophysics Data System (ADS)
Lototzis, M.; Papadopoulos, G. K.; Droulia, F.; Tseliou, A.; Tsiros, I. X.
2018-04-01
There are several cases where a circular variable is associated with a linear one. A typical example is wind direction that is often associated with linear quantities such as air temperature and air humidity. The analysis of a statistical relationship of this kind can be tested by the use of parametric and non-parametric methods, each of which has its own advantages and drawbacks. This work deals with correlation analysis using both the parametric and the non-parametric procedure on a small set of meteorological data of air temperature and wind direction during a summer period in a Mediterranean climate. Correlations were examined between hourly, daily and maximum-prevailing values, under typical and non-typical meteorological conditions. Both tests indicated a strong correlation between mean hourly wind directions and mean hourly air temperature, whereas mean daily wind direction and mean daily air temperature do not seem to be correlated. In some cases, however, the two procedures were found to give quite dissimilar levels of significance on the rejection or not of the null hypothesis of no correlation. The simple statistical analysis presented in this study, appropriately extended in large sets of meteorological data, may be a useful tool for estimating effects of wind on local climate studies.
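One common parametric measure for such circular-linear association is Mardia's correlation coefficient, built from the linear correlations of the data with the sine and cosine of the angle. A sketch on synthetic data (the function name and data are ours, not from the study):

```python
import numpy as np

def circ_lin_corr(theta, x):
    """Mardia's circular-linear correlation between angles theta (radians)
    and a linear variable x. Returns r in [0, 1]."""
    c, s = np.cos(theta), np.sin(theta)
    rxc = np.corrcoef(x, c)[0, 1]   # x vs cos(theta)
    rxs = np.corrcoef(x, s)[0, 1]   # x vs sin(theta)
    rcs = np.corrcoef(c, s)[0, 1]   # cos(theta) vs sin(theta)
    r2 = (rxc**2 + rxs**2 - 2 * rxc * rxs * rcs) / (1 - rcs**2)
    return np.sqrt(r2)

rng = np.random.default_rng(2)
theta = rng.uniform(0, 2 * np.pi, 500)                  # e.g. wind direction
temp = 20 + 5 * np.cos(theta) + rng.normal(0, 1, 500)   # temperature tied to direction
r_strong = circ_lin_corr(theta, temp)
r_none = circ_lin_corr(theta, rng.normal(size=500))     # unrelated linear variable
print(r_strong, r_none)
```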
Two-sample tests and one-way MANOVA for multivariate biomarker data with nondetects.
Thulin, M
2016-09-10
Testing whether the mean vector of a multivariate set of biomarkers differs between several populations is an increasingly common problem in medical research. Biomarker data is often left censored because some measurements fall below the laboratory's detection limit. We investigate how such censoring affects multivariate two-sample and one-way multivariate analysis of variance tests. Type I error rates, power and robustness to increasing censoring are studied, under both normality and non-normality. Parametric tests are found to perform better than non-parametric alternatives, indicating that the current recommendations for analysis of censored multivariate data may have to be revised. Copyright © 2016 John Wiley & Sons, Ltd.
Disordered eating behavior and mental health correlates among treatment seeking obese women.
Altamura, M; Rossi, G; Aquilano, P; De Fazio, P; Segura-Garcia, C; Rossetti, M; Petrone, A; Lo Russo, T; Vendemiale, G; Bellomo, A
2015-01-01
Previous research has suggested that obesity is associated with increased risk for psychopathological disorders; however, little is known about which obese patients are most vulnerable to psychopathological disorders. We therefore investigated 126 treatment-seeking obese women to describe eating disorder pathology and mental health correlates, and to identify disordered eating behaviors that may place obese patients at increased risk for psychopathological disorders. The Structured Clinical Interview for DSM-IV (SCID) was used to identify Eating Disorders (ED). A battery of psychological tests, including the Anxiety Scale Questionnaire (ASQ), Clinical Depression Questionnaire (CDQ), Eating Disorder Inventory-2 (EDI-2), and Eating Attitudes Test-26 (EAT-26) scales, and a structured clinical interview were administered to all the patients. We analyzed the link between psychopathological disorders and eating attitudes by using both multiple regression analysis and non-parametric correlation. Disordered eating behaviors and emotional-behavioral aspects related to Anorexia Nervosa, such as ineffectiveness, are strongly linked to depression and anxiety in obese subjects. No correlation was found between psychopathological disorders and age or anthropometric measurements. Findings corroborate earlier work indicating that psychological distress is elevated in treatment-seeking obese women, bolstering the need for mental health assessment of such individuals. The feeling of ineffectiveness constitutes the major predictor of psychopathological aspects. This is an important result, which may inform the development of effective interventions for obese patients and the prevention of psychopathological disorders.
Applying causal mediation analysis to personality disorder research.
Walters, Glenn D
2018-01-01
This article is designed to address fundamental issues in the application of causal mediation analysis to research on personality disorders. Causal mediation analysis is used to identify mechanisms of effect by testing variables as putative links between the independent and dependent variables. As such, it would appear to have relevance to personality disorder research. It is argued that proper implementation of causal mediation analysis requires that investigators take several factors into account. These factors are discussed under 5 headings: variable selection, model specification, significance evaluation, effect size estimation, and sensitivity testing. First, care must be taken when selecting the independent, dependent, mediator, and control variables for a mediation analysis. Some variables make better mediators than others and all variables should be based on reasonably reliable indicators. Second, the mediation model needs to be properly specified. This requires that the data for the analysis be prospectively or historically ordered and possess proper causal direction. Third, it is imperative that the significance of the identified pathways be established, preferably with a nonparametric bootstrap resampling approach. Fourth, effect size estimates should be computed or competing pathways compared. Finally, investigators employing the mediation method are advised to perform a sensitivity analysis. Additional topics covered in this article include parallel and serial multiple mediation designs, moderation, and the relationship between mediation and moderation. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
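The nonparametric bootstrap of an indirect effect recommended above can be sketched with case resampling and a percentile confidence interval. The data, effect sizes, and helper names below are hypothetical illustrations, not the article's procedure verbatim:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
# Hypothetical mediation chain X -> M -> Y with true indirect effect 0.5 * 0.6 = 0.3
X = rng.normal(size=n)
M = 0.5 * X + rng.normal(0, 1, n)
Y = 0.6 * M + 0.2 * X + rng.normal(0, 1, n)

def indirect_effect(X, M, Y):
    # a-path: regress M on X; b-path: regress Y on M controlling for X
    a = np.polyfit(X, M, 1)[0]
    Z = np.column_stack([np.ones_like(X), M, X])
    b = np.linalg.lstsq(Z, Y, rcond=None)[0][1]
    return a * b

boot = []
for _ in range(2000):
    idx = rng.integers(0, n, n)          # resample cases with replacement
    boot.append(indirect_effect(X[idx], M[idx], Y[idx]))

lo, hi = np.percentile(boot, [2.5, 97.5])  # percentile confidence interval
print(lo, hi)  # an interval excluding 0 indicates a significant indirect effect
```

Bias-corrected variants of the interval exist; the percentile form is the simplest instance of the resampling approach the article recommends.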
Nonparametric Regression and the Parametric Bootstrap for Local Dependence Assessment.
ERIC Educational Resources Information Center
Habing, Brian
2001-01-01
Discusses ideas underlying nonparametric regression and the parametric bootstrap with an overview of their application to item response theory and the assessment of local dependence. Illustrates the use of the method in assessing local dependence that varies with examinee trait levels. (SLD)
NASA Astrophysics Data System (ADS)
Liao, Meng; To, Quy-Dong; Léonard, Céline; Monchiet, Vincent
2018-03-01
In this paper, we use the molecular dynamics simulation method to study gas-wall boundary conditions. Discrete scattering information of gas molecules at the wall surface is obtained from collision simulations. The collision data can be used to identify the accommodation coefficients for parametric wall models such as the Maxwell and Cercignani-Lampis scattering kernels. Since these scattering kernels are based on a limited number of accommodation coefficients, we adopt non-parametric statistical methods to construct the kernel and overcome this limitation. Different from parametric kernels, the non-parametric kernels require no parameters (i.e. accommodation coefficients) and no predefined distribution. We also propose approaches to derive directly the Navier friction and Kapitza thermal resistance coefficients, as well as other interface coefficients associated with moment equations, from the non-parametric kernels. The methods are applied successfully to systems composed of CH4 or CO2 and graphite, which are of interest to the petroleum industry.
Nonparametric Simulation of Signal Transduction Networks with Semi-Synchronized Update
Nassiri, Isar; Masoudi-Nejad, Ali; Jalili, Mahdi; Moeini, Ali
2012-01-01
Simulating signal transduction in cellular signaling networks provides predictions of network dynamics by quantifying the changes in concentration and activity-level of the individual proteins. Since numerical values of kinetic parameters might be difficult to obtain, it is imperative to develop non-parametric approaches that combine the connectivity of a network with the response of individual proteins to signals which travel through the network. The activity levels of signaling proteins computed through existing non-parametric modeling tools do not show significant correlations with the observed values in experimental results. In this work we developed a non-parametric computational framework to describe the profile of the evolving process and the time course of the proportion of active form of molecules in the signal transduction networks. The model is also capable of incorporating perturbations. The model was validated on four signaling networks showing that it can effectively uncover the activity levels and trends of response during signal transduction process. PMID:22737250
NASA Astrophysics Data System (ADS)
Schaeck, S.; Karspeck, T.; Ott, C.; Weckler, M.; Stoermer, A. O.
2011-03-01
In March 2007, the BMW Group launched the micro-hybrid functions brake energy regeneration (BER) and automatic start and stop function (ASSF). Valve-regulated lead-acid (VRLA) batteries in absorbent glass mat (AGM) technology are applied in vehicles with a micro-hybrid power system (MHPS). In both part I and part II of this publication, vehicles with MHPS and AGM batteries are subject to a field operational test (FOT). Test vehicles with a conventional power system (CPS) and flooded batteries were used as a reference. In the FOT, sample batteries were dismounted several times and electrically tested in the laboratory at intermediate stages. Vehicle- and battery-related diagnosis data were read out for each test run and were matched with laboratory data in a database. The FOT data were analyzed by the use of two-dimensional, nonparametric kernel estimation for clear data presentation. The data show that capacity loss in the MHPS is comparable to the CPS. However, the influence of mileage performance, which cannot be separated, suggests that battery stress is enhanced in the MHPS although a battery refresh function is applied. In any case, the FOT demonstrates the unsuitability of flooded batteries for the MHPS because of high early capacity loss due to acid stratification and because of vanishing cranking performance due to increasing internal resistance. Furthermore, the lack of dynamic charge acceptance for high energy regeneration efficiency is illustrated. Under the presented FOT conditions, charge acceptance of lead-acid (LA) batteries decreases to less than one third for about half of the sample batteries compared to the new battery condition. In part II of this publication, FOT data are presented by multiple regression analysis (Schaeck et al., submitted for publication [1]).
Variable Selection for Nonparametric Quantile Regression via Smoothing Spline ANOVA
Lin, Chen-Yen; Bondell, Howard; Zhang, Hao Helen; Zou, Hui
2014-01-01
Quantile regression provides a more thorough view of the effect of covariates on a response. Nonparametric quantile regression has become a viable alternative for avoiding restrictive parametric assumptions. The problem of variable selection for quantile regression is challenging, since important variables can influence various quantiles in different ways. We tackle the problem via regularization in the context of smoothing spline ANOVA models. The proposed sparse nonparametric quantile regression (SNQR) can identify important variables and provide flexible estimates for quantiles. Our numerical study suggests the promising performance of the new procedure in variable selection and function estimation. Supplementary materials for this article are available online. PMID:24554792
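Quantile regression, parametric or nonparametric, minimizes the check ("pinball") loss rather than squared error. As a sketch of why that loss targets a quantile, minimizing it over a constant recovers the sample quantile (data and names here are illustrative):

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Check (pinball) loss that quantile regression minimizes for quantile tau."""
    r = y - q
    return np.mean(np.where(r >= 0, tau * r, (tau - 1) * r))

# Minimizing the check loss over a constant recovers the sample quantile
rng = np.random.default_rng(7)
y = rng.exponential(2.0, 5000)
grid = np.linspace(0, 10, 2001)
losses = [pinball_loss(y, q, 0.9) for q in grid]
best = grid[int(np.argmin(losses))]
print(best, np.quantile(y, 0.9))  # the two agree closely
```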
The Influence of Glove Type on Simulated Wheelchair Racing Propulsion: A Pilot Study.
Rice, I; Dysterheft, J; Bleakney, A W; Cooper, R A
2016-01-01
Our purpose was to examine the influence of glove type on kinetic and spatiotemporal parameters at the handrim in elite wheelchair racers. Elite wheelchair racers (n=9) propelled on a dynamometer in their own racing chairs with a force- and moment-sensing wheel attached. Racers propelled at 3 steady state speeds (5.36, 6.26 & 7.60 m/s) and performed one maximal effort sprint with 2 different glove types (soft & solid). Peak resultant force, peak torque, impulse, contact angle, braking torque, push time, velocity, and stroke frequency were recorded for steady state and sprint conditions. Multiple nonparametric Wilcoxon matched-pairs tests were used to detect differences between glove types, while effect sizes were calculated based on Cohen's d. During steady state trials, racers propelled faster, using more strokes and a larger contact angle, while applying less impulse with solid gloves compared to soft gloves. During the sprint condition, racers achieved greater top end velocities, applying larger peak force, with less braking torque with solid gloves compared to soft gloves. Use of solid gloves may provide some performance benefits to wheelchair racers during steady state and top end velocity conditions. © Georg Thieme Verlag KG Stuttgart · New York.
Amyotrophic lateral sclerosis progression and stability of brain-computer interface communication.
Silvoni, Stefano; Cavinato, Marianna; Volpato, Chiara; Ruf, Carolin A; Birbaumer, Niels; Piccione, Francesco
2013-09-01
Our objective was to investigate the relationship between brain-computer interface (BCI) communication skill and disease progression in amyotrophic lateral sclerosis (ALS). We also sought to assess the stability of BCI communication performance over time and whether it is related to the progression of neurological impairment before entering the locked-in state. A three-year follow-up BCI evaluation was conducted in a group of ALS patients (n = 24). For a variety of reasons, only three patients completed the three-year follow-up. BCI communication skill and disability level, using the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised, were assessed at admission and at each of the three follow-ups. Multiple non-parametric statistical methods were used to ensure reliability of the dependent variables: correlations, paired tests and factor analysis of variance. Results demonstrated no significant relationship between BCI communication skill (BCI-CS) and disease evolution. The patients who performed the follow-up evaluations preserved their BCI-CS over time. Patients' age at admission correlated positively with the ability to achieve control over a BCI. In conclusion, disease evolution in ALS does not affect the ability to control a BCI for communication. BCI performance can be maintained in the different stages of the illness.
Hermann, B P; Wyler, A R; Somes, G
1992-01-01
This investigation evaluated the role of preoperative psychological adjustment, degree of postoperative seizure reduction, and other relevant variables (age, education, IQ, age at onset of epilepsy, laterality of resection) in determining emotional/psychosocial outcome following anterior temporal lobectomy. Ninety seven patients with complex partial seizures of temporal lobe origin were administered the Minnesota Multiphasic Personality Inventory (MMPI), Washington Psychosocial Seizure Inventory (WPSI), and the General Health Questionnaire (GHQ) both before and six to eight months after anterior temporal lobectomy. The data were subjected to a nonparametric rank sum technique (O'Brien's procedure) which combined the test scores to form a single outcome index (TOTAL PSYCHOSOCIAL OUTCOME) that was analysed by multiple regression procedures. Results indicated that the most powerful predictors of patients' overall postoperative psychosocial outcome were: 1) The adequacy of their preoperative psychosocial adjustment, and 2) A totally seizure-free outcome. Additional analyses were carried out separately on the MMPI, WPSI, and GHQ to determine whether findings varied as a function of the specific outcome measure. These results were related to the larger literature concerned with the psychological outcome of anterior temporal lobectomy. PMID:1619418
Neurogenic bowel dysfunction (NBD) translation and linguistic validation to classical Arabic.
Mallek, A; Elleuch, M H; Ghroubi, S
2016-09-01
To translate and linguistically validate in classical Arabic the French version of the neurogenic bowel dysfunction (NBD) score. The Arabic translation of the NBD score was obtained by the "forward translation/backward translation" method. Patients with multiple sclerosis (MS) and spinal cord injury were included. Evaluation of intestinal and anorectal disorders was conducted with the self-administered NBD questionnaire, which was filled in twice, two weeks apart. An item-by-item analysis was made. The feasibility, acceptability, internal consistency using Cronbach's alpha, and test-retest repeatability by non-parametric Spearman correlation were studied. Twenty-three patients with colorectal disorders secondary to neurological disease were included; the average age was 40.79±9.16 years and the sex ratio was 1.85. The questionnaire was feasible and acceptable, and no items were excluded. The Spearman correlation was 0.842. Internal consistency was judged good, with a Cronbach's alpha of 0.896. The Arabic version of the NBD was reproducible and its construct validity was satisfactory. The study of its responsiveness to change with a larger number of patients will be the subject of further work. Level of evidence: 4. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
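The two reliability indices used in this validation, Cronbach's alpha for internal consistency and Spearman's rank correlation for test-retest repeatability, are straightforward to compute. A sketch on hypothetical questionnaire data (the figures below are illustrative, not the study's):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_subjects, n_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

def spearman_rho(x, y):
    """Spearman rank correlation (assuming no ties) for test-retest data."""
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(4)
trait = rng.normal(size=(23, 1))                  # 23 subjects, one latent trait
scores = trait + 0.7 * rng.normal(size=(23, 10))  # 10 items loading on the trait
alpha = cronbach_alpha(scores)

test1 = scores.sum(axis=1)                        # total score, first administration
test2 = test1 + rng.normal(0, 1, 23)              # noisy re-administration
rho = spearman_rho(test1, test2)
print(alpha, rho)  # both high for a consistent, repeatable scale
```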
Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.
Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe
2017-12-27
The present study aimed to evaluate the characteristics and quality of the statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using the PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of the quality of their statistical methodology. The four main concerns identified were i) use of nonparametric tests in the presence of large samples, coupled with lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient and the consequent lack of a cluster-specific approach in data analysis. Owing to these concerns, the statistical methodology was judged inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required for an accurate assessment of the efficacy of desensitizing agents.
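The missing p-value adjustment flagged in concern (ii) can be remedied with, for example, the Holm step-down procedure; a minimal pure-Python sketch (the function name is ours):

```python
def holm_adjust(pvals):
    """Holm step-down adjusted p-values, in the original order of pvals.

    Compare each adjusted value to the overall alpha; this controls the
    family-wise error rate and is uniformly more powerful than Bonferroni.
    """
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        p_adj = min(1.0, (m - rank) * pvals[i])  # smallest p scaled by m, next by m-1, ...
        running_max = max(running_max, p_adj)    # enforce monotonicity
        adjusted[i] = running_max
    return adjusted

print(holm_adjust([0.01, 0.04, 0.03]))
```

In production analyses a library routine (e.g. statsmodels' multipletests with method="holm") would normally be preferred over a hand-rolled version.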
BROCCOLI: Software for fast fMRI analysis on many-core CPUs and GPUs
Eklund, Anders; Dufort, Paul; Villani, Mattias; LaConte, Stephen
2014-01-01
Analysis of functional magnetic resonance imaging (fMRI) data is becoming ever more computationally demanding as temporal and spatial resolutions improve, and large, publicly available data sets proliferate. Moreover, methodological improvements in the neuroimaging pipeline, such as non-linear spatial normalization, non-parametric permutation tests and Bayesian Markov Chain Monte Carlo approaches, can dramatically increase the computational burden. Despite these challenges, there do not yet exist any fMRI software packages which leverage inexpensive and powerful graphics processing units (GPUs) to perform these analyses. Here, we therefore present BROCCOLI, a free software package written in OpenCL (Open Computing Language) that can be used for parallel analysis of fMRI data on a large variety of hardware configurations. BROCCOLI has, for example, been tested with an Intel CPU, an Nvidia GPU, and an AMD GPU. These tests show that parallel processing of fMRI data can lead to significantly faster analysis pipelines. This speedup can be achieved on relatively standard hardware, but further, dramatic speed improvements require only a modest investment in GPU hardware. BROCCOLI (running on a GPU) can perform non-linear spatial normalization to a 1 mm3 brain template in 4–6 s, and run a second level permutation test with 10,000 permutations in about a minute. These non-parametric tests are generally more robust than their parametric counterparts, and can also enable more sophisticated analyses by estimating complicated null distributions. Additionally, BROCCOLI includes support for Bayesian first-level fMRI analysis using a Gibbs sampler. The new software is freely available under GNU GPL3 and can be downloaded from github (https://github.com/wanderine/BROCCOLI/). PMID:24672471
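A second-level permutation test of the kind described above can be sketched serially on a simple two-sample mean difference; this plain-Python version illustrates the logic, not BROCCOLI's OpenCL implementation (data and names are ours):

```python
import numpy as np

def permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sample permutation test on the difference of means.

    Returns a two-sided p-value estimated from the permutation null
    distribution, with the +1 correction so p is never exactly zero.
    """
    rng = np.random.default_rng(seed)
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of group labels
        diff = abs(pooled[:a.size].mean() - pooled[a.size:].mean())
        count += diff >= observed
    return (count + 1) / (n_perm + 1)

rng = np.random.default_rng(5)
x = rng.normal(0.0, 1, 30)
y = rng.normal(1.5, 1, 30)
p_val = permutation_test(x, y)
print(p_val)  # small p-value: the group means differ
```

Each permutation is independent, which is exactly why the computation parallelizes so well on GPUs.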
An Instructional Module on Mokken Scale Analysis
ERIC Educational Resources Information Center
Wind, Stefanie A.
2017-01-01
Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore…
Regionalizing nonparametric models of precipitation amounts on different temporal scales
NASA Astrophysics Data System (ADS)
Mosthaf, Tobias; Bárdossy, András
2017-05-01
Parametric distribution functions are commonly used to model precipitation amounts corresponding to different durations. The precipitation amounts themselves are crucial for stochastic rainfall generators and weather generators. Nonparametric kernel density estimates (KDEs) offer a more flexible way to model precipitation amounts. As already stated in their name, these models do not exhibit parameters that can be easily regionalized to run rainfall generators at ungauged locations as well as at gauged locations. To overcome this deficiency, we present a new interpolation scheme for nonparametric models and evaluate it for different temporal resolutions ranging from hourly to monthly. During the evaluation, the nonparametric methods are compared to commonly used parametric models like the two-parameter gamma and the mixed-exponential distribution. As water volume is considered to be an essential parameter for applications like flood modeling, a Lorenz-curve-based criterion is also introduced. To add value to the estimation of data at sub-daily resolutions, we incorporated the plentiful daily measurements in the interpolation scheme, and this idea was evaluated. The study region is the federal state of Baden-Württemberg in the southwest of Germany with more than 500 rain gauges. The validation results show that the newly proposed nonparametric interpolation scheme provides reasonable results and that the incorporation of daily values in the regionalization of sub-daily models is very beneficial.
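A kernel density estimate of precipitation amounts, the building block that the regionalization scheme above interpolates, can be sketched with a Gaussian kernel and Silverman's rule-of-thumb bandwidth. The paper's specific kernels and interpolation scheme are not reproduced here; the data are synthetic:

```python
import numpy as np

def gaussian_kde(data, grid, bandwidth=None):
    """Gaussian kernel density estimate evaluated on a grid."""
    data = np.asarray(data, dtype=float)
    if bandwidth is None:
        # Silverman's rule-of-thumb bandwidth
        bandwidth = 1.06 * data.std(ddof=1) * data.size ** (-1 / 5)
    z = (grid[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (data.size * bandwidth)

rng = np.random.default_rng(6)
amounts = rng.gamma(shape=0.8, scale=5.0, size=1000)  # skewed, rainfall-like sample
grid = np.linspace(-10, 45, 500)   # extended below zero to capture kernel mass
density = gaussian_kde(amounts, grid)
mass = (density * (grid[1] - grid[0])).sum()
print(mass)  # close to 1: the estimate integrates to unity
```

For strictly nonnegative quantities like rainfall, boundary-corrected or log-transformed kernels would avoid the probability mass leaking below zero that the extended grid makes visible here.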
NASA Astrophysics Data System (ADS)
Protasov, Konstantin T.; Pushkareva, Tatyana Y.; Artamonov, Evgeny S.
2002-02-01
The problem of cloud field recognition from NOAA satellite data is urgent not only for meteorological applications but also for resource-ecological monitoring of the Earth's underlying surface, associated with the detection of thunderstorm clouds and the estimation of cloud liquid water content, soil moisture, the degree of fire hazard, etc. To solve these problems, we used AVHRR/NOAA video data that regularly image the territory. The complexity and extremely nonstationary character of the problems call for the use of information from all spectral channels, the mathematical apparatus of statistical hypothesis testing, and methods of pattern recognition and identification of the informative parameters. For a class of detection and pattern recognition problems, the average risk functional is a natural criterion for the quality and information content of the synthesized decision rules. In this case, to identify cloud field types efficiently, the informative parameters must be determined by minimizing this functional. Since the conditional probability density functions, which serve as mathematical models of stochastic patterns, are unknown, the problem arises of nonparametric reconstruction of the distributions from learning samples. To this end, we used nonparametric estimates of the distributions with the modified Epanechnikov kernel. The unknown parameters of these distributions were determined by minimizing the risk functional, which for the learning sample was replaced by the empirical risk. Once the conditional probability density functions had been reconstructed for the examined hypotheses, the cloudiness type was identified using the Bayes decision rule.
Sample Skewness as a Statistical Measurement of Neuronal Tuning Sharpness
Samonds, Jason M.; Potetz, Brian R.; Lee, Tai Sing
2014-01-01
We propose using the sample skewness of the distribution of mean firing rates of a tuning curve to quantify sharpness of tuning. For some features, like binocular disparity, tuning curves are best described by relatively complex and sometimes diverse functions, making it difficult to quantify sharpness with a single function and parameter. Skewness provides a robust nonparametric measure of tuning curve sharpness that is invariant with respect to the mean and variance of the tuning curve and is straightforward to apply to a wide range of tuning, including simple orientation tuning curves and complex object tuning curves that often cannot even be described parametrically. Because skewness does not depend on a specific model or function of tuning, it is especially appealing for cases of sharpening in which recurrent interactions among neurons produce sharper tuning curves that deviate in a complex manner from the feedforward function of tuning. Since tuning curves for all neurons are not typically well described by a single parametric function, this model independence additionally allows skewness to be applied to all recorded neurons, maximizing the statistical power of a set of data. We also compare skewness with other nonparametric measures of tuning curve sharpness and selectivity. Among the nonparametric measures tested, skewness is best for capturing the sharpness of multimodal tuning curves defined by narrow peaks (maxima) and broad valleys (minima). Finally, we provide a more formal definition of sharpness using a shape-based information gain measure and show that skewness is correlated with this definition. PMID:24555451
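In its simplest form, the measure described above is the standard (biased) sample skewness of the tuning-curve values. A minimal pure-Python sketch, with made-up firing-rate vectors purely for illustration:

```python
def sample_skewness(rates):
    """Sample skewness of a tuning curve's mean firing rates.

    A large positive skew indicates a narrow peak over a broad
    baseline, i.e., sharper tuning in the sense described above.
    """
    n = len(rates)
    mean = sum(rates) / n
    m2 = sum((r - mean) ** 2 for r in rates) / n  # biased 2nd central moment
    m3 = sum((r - mean) ** 3 for r in rates) / n  # biased 3rd central moment
    return m3 / m2 ** 1.5

# Hypothetical curves: one narrow peak vs. a broad, graded profile.
sharp = [2, 2, 3, 2, 30, 2, 3, 2, 2]
broad = [2, 8, 15, 22, 30, 22, 15, 8, 2]
print(sample_skewness(sharp) > sample_skewness(broad))  # True
```

Note that the same value is obtained for any shift or rescaling of the rates, which is the mean/variance invariance the abstract emphasizes.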
Exact nonparametric confidence bands for the survivor function.
Matthews, David
2013-10-12
A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.
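The exact bands above are obtained by inverting a likelihood test built around the Kaplan-Meier estimator; the estimator itself is simple to sketch. The following is a minimal product-limit implementation with hypothetical data (it illustrates only the underlying estimator, not the exact-band construction or the Noe recursions):

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) estimate of the survivor function.

    times  : observed times (event or censoring)
    events : 1 if the event occurred at that time, 0 if right-censored
    Returns (time, S(t)) pairs at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    s = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d = 0  # events at time t
        c = 0  # censorings at time t
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                d += 1
            else:
                c += 1
            i += 1
        if d:
            s *= 1.0 - d / n_at_risk  # multiply in the conditional survival
            curve.append((t, s))
        n_at_risk -= d + c
    return curve

# Six hypothetical subjects; times 2 and 5 are right-censored.
print(kaplan_meier([1, 2, 3, 4, 4, 5], [1, 0, 1, 1, 1, 0]))
# S(t) steps down at the event times t = 1, 3, 4
```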
A support vector machine based test for incongruence between sets of trees in tree space
2012-01-01
Background The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees significantly deviates from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Results Motivated by this problem, we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two sets of pre-defined trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also incorporate tree reconstruction into its framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. Conclusions The nonparametric nature of our statistical test provides fast and efficient analyses, and makes it applicable to any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions.
The software GeneOut is freely available under the GNU public license. PMID:22909268
Assessment of Dimensionality in Social Science Subtest
ERIC Educational Resources Information Center
Ozbek Bastug, Ozlem Yesim
2012-01-01
Most of the literature on dimensionality has focused either on comparing parametric and nonparametric dimensionality detection procedures or on showing the effectiveness of one type of procedure. No known study has shown how to perform a combined parametric and nonparametric dimensionality analysis on real data. The current study aims to fill…
EEG Correlates of Fluctuation in Cognitive Performance in an Air Traffic Control Task
2014-11-01
…using non-parametric statistical analysis to identify neurophysiological patterns due to the time-on-task effect. Significant changes in EEG power… Keywords: EEG, cognitive performance, power spectral analysis, non-parametric analysis. Document is available to the public through the Internet…
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
USDA-ARS?s Scientific Manuscript database
Parametric non-linear regression (PNR) techniques commonly are used to develop weed seedling emergence models. Such techniques, however, require statistical assumptions that are difficult to meet. To examine and overcome these limitations, we compared PNR with a nonparametric estimation technique. F...
A Comparison of Methods for Nonparametric Estimation of Item Characteristic Curves for Binary Items
ERIC Educational Resources Information Center
Lee, Young-Sun
2007-01-01
This study compares the performance of three nonparametric item characteristic curve (ICC) estimation procedures: isotonic regression, smoothed isotonic regression, and kernel smoothing. Smoothed isotonic regression, employed along with an appropriate kernel function, provides better estimates and also satisfies the assumption of strict…
Measuring Youth Development: A Nonparametric Cross-Country "Youth Welfare Index"
ERIC Educational Resources Information Center
Chaaban, Jad M.
2009-01-01
This paper develops an empirical methodology for the construction of a synthetic multi-dimensional cross-country comparison of the performance of governments around the world in improving the livelihood of their younger population. The devised "Youth Welfare Index" is based on the nonparametric Data Envelopment Analysis (DEA) methodology and…
Order-Constrained Bayes Inference for Dichotomous Models of Unidimensional Nonparametric IRT
ERIC Educational Resources Information Center
Karabatsos, George; Sheu, Ching-Fan
2004-01-01
This study introduces an order-constrained Bayes inference framework useful for analyzing data containing dichotomous scored item responses, under the assumptions of either the monotone homogeneity model or the double monotonicity model of nonparametric item response theory (NIRT). The framework involves the implementation of Gibbs sampling to…
Dong, Qi; Elliott, Michael R; Raghunathan, Trivellore E
2014-06-01
Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples, and many statistical methods are developed largely in this IID world. Applying these methods to data from complex sample surveys without allowing for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability-of-selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), both of which have stratified, clustered, unequal-probability-of-selection sample designs.
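The paper extends the finite population Bayesian bootstrap to complex designs. The basic one-sample version it builds on (Rubin's Bayesian bootstrap) can be sketched in a few lines; this sketch deliberately ignores all design features (strata, clusters, unequal weights) that the paper addresses, and the data are made up:

```python
import random

def bayesian_bootstrap_means(data, draws=2000, seed=1):
    """One-sample Bayesian bootstrap: each draw places Dirichlet(1,...,1)
    weights on the observed values, generated here by normalizing i.i.d.
    exponential variates, and returns the weighted mean per draw."""
    rng = random.Random(seed)
    means = []
    for _ in range(draws):
        w = [rng.expovariate(1.0) for _ in data]
        total = sum(w)
        means.append(sum(wi * x for wi, x in zip(w, data)) / total)
    return means

data = [2.1, 3.4, 2.8, 4.0, 3.1, 2.5]  # hypothetical IID sample
post = bayesian_bootstrap_means(data)
print(sum(post) / len(post))  # concentrates near the sample mean
```

Each draw is a posterior sample of the population mean under a noninformative Dirichlet prior; the complex-design extension in the paper replaces these simple weights with draws that undo the sampling design.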
Repeat sample intraocular pressure variance in induced and naturally ocular hypertensive monkeys.
Dawson, William W; Dawson, Judyth C; Hope, George M; Brooks, Dennis E; Percicot, Christine L
2005-12-01
To compare the repeat-sample mean variance of laser-induced ocular hypertension (OH) in rhesus monkeys with the repeat-sample mean variance of natural OH in age-range-matched monkeys of similar and dissimilar pedigrees. Multiple monocular, retrospective intraocular pressure (IOP) measures were recorded repeatedly during a short sampling interval (SSI, 1-5 months) and a long sampling interval (LSI, 6-36 months), with 5-13 eyes in each SSI and LSI subgroup. Each interval contained subgroups of Florida monkeys with natural hypertension (NHT), Florida monkeys with induced hypertension (IHT1), unrelated (Strasbourg, France) induced hypertensives (IHT2), and Florida age-range-matched controls (C). Repeat-sample individual variance means and related IOPs were analyzed by parametric analysis of variance (ANOV), and the results were compared with the nonparametric Kruskal-Wallis ANOV. As designed, all group intraocular pressure distributions differed significantly (P < or = 0.009) except for the two (Florida/Strasbourg) induced OH groups. A parametric 2 x 4 ANOV for mean variance showed large significant effects of treatment group and sampling interval, and the nonparametric ANOV produced similar results. The induced OH sample variance mean was 43 times the natural OH sample variance mean for the LSI, and 12 times for the SSI. Laser-induced ocular hypertension in rhesus monkeys thus produces large repeat-sample IOP variance means compared with controls and natural OH.
Bayesian Nonparametric Prediction and Statistical Inference
1989-09-07
Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds. … "the random model and weighted least squares regression," in Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217. … likelihood function. On the other hand, H. Jeffreys's theory of hypothesis testing covers the most important situations in which the prior is not diffuse. See
A note on the IQ of monozygotic twins raised apart and the order of their birth.
Pencavel, J H
1976-10-01
This note examines James Shields' sample of monozygotic twins raised apart to entertain the hypothesis that there is a significant association between the measured IQ of these twins and the order of their birth. A non-parametric test supports this hypothesis and then a linear probability function is estimated that discriminates the effects on IQ of birth order from the effects of birth weight.
Proceedings of the Conference on the Design of Experiments (23rd) S
1978-07-01
of Statistics, Carnegie-Mellon University. [12] Duran, B. S. (1976). A survey of nonparametric tests for scale. Communications in Statistics A5, 1287… The twenty-third Design of Experiments Conference was hosted by the U. S. Army Combat Development Experimentation Command, Fort Ord, California. Excellent… Availability: Prof. G. E. P. Box, Time Series Modelling, University of Wisconsin. Dr. Churchill Eisenhart was recipient this year of the Samuel S. Wilks Memorial
Analysis of quantitative data obtained from toxicity studies showing non-normal distribution.
Kobayashi, Katsumi
2005-05-01
The data obtained from toxicity studies are examined for homogeneity of variance but usually not for normal distribution. In this study I examined the measured items of a carcinogenicity/chronic toxicity study in rats for both homogeneity of variance and normal distribution. It was observed that many hematology and biochemistry items showed a non-normal distribution. To test the normality of data obtained from toxicity studies, the data of the concurrent control group may be examined; for data that show a non-normal distribution, robust non-parametric tests may be applied.
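The abstract does not name a specific normality test. One common screen based on sample skewness and excess kurtosis is the Jarque-Bera statistic, sketched here purely as an illustration of how such a control-group check might be run (not necessarily the author's procedure; the data are simulated):

```python
import math
import random

def jarque_bera(x):
    """Jarque-Bera normality statistic: n/6 * (S^2 + K^2/4), where S is
    sample skewness and K is excess kurtosis. Under normality it is
    approximately chi-square with 2 df, so values above ~5.99 suggest
    non-normality at alpha = 0.05."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    s = sum((v - mean) ** 3 for v in x) / n / m2 ** 1.5      # skewness
    k = sum((v - mean) ** 4 for v in x) / n / m2 ** 2 - 3.0  # excess kurtosis
    return n / 6.0 * (s ** 2 + k ** 2 / 4.0)

random.seed(0)
normal_like = [random.gauss(10, 2) for _ in range(500)]     # e.g. a control group
skewed = [random.lognormvariate(0, 1) for _ in range(500)]  # non-normal data
print(jarque_bera(normal_like))  # usually below the 5.99 cutoff
print(jarque_bera(skewed))       # far above it: prefer a nonparametric test
```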
Nonparametric probability density estimation by optimization theoretic techniques
NASA Technical Reports Server (NTRS)
Scott, D. W.
1976-01-01
Two nonparametric probability density estimators are considered. The first is the kernel estimator; the problem of choosing the kernel scaling factor based solely on a random sample is addressed, an interactive mode is discussed, and an algorithm is proposed to choose the scaling factor automatically. The second estimator uses penalty function techniques with the maximum likelihood criterion: a discrete maximum penalized likelihood estimator is proposed and shown to be consistent in mean square error. A numerical implementation technique for the discrete solution is discussed and examples are displayed. An extensive simulation study compares the integrated mean square error of the discrete and kernel estimators, and the robustness of the discrete estimator is demonstrated graphically.
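A minimal sketch of the first estimator discussed above, a Gaussian kernel density estimate. The scaling factor here is chosen by Silverman's rule of thumb, a simple closed-form choice rather than the interactive or automatic procedures the report proposes; the data are made up:

```python
import math

def silverman_bandwidth(x):
    """Rule-of-thumb scaling factor for a Gaussian kernel: 1.06 * s * n^(-1/5)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return 1.06 * sd * n ** (-0.2)

def kde(x, data, h):
    """Gaussian kernel density estimate evaluated at the point x."""
    const = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))
    return const * sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)

data = [1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 5.8]  # hypothetical sample
h = silverman_bandwidth(data)
print(h, kde(2.0, data, h))  # density is highest near the data cluster
```

A smaller h gives a spikier estimate, a larger h a smoother one; choosing h from the sample alone is exactly the problem the report addresses.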
NASA Astrophysics Data System (ADS)
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
NASA Astrophysics Data System (ADS)
Sumantari, Y. D.; Slamet, I.; Sugiyanto
2017-06-01
Semiparametric regression is a statistical analysis method that combines parametric and nonparametric regression. Among the various approximation techniques available in nonparametric regression, one is the spline. Central Java is one of the most densely populated provinces in Indonesia, and population density in this province can be modeled by semiparametric regression because it consists of parametric and nonparametric components. Therefore, the purpose of this paper is to determine the factors that influence population density in Central Java using the semiparametric spline regression model. The result shows that the factors which influence population density in Central Java are Family Planning (FP) active participants and the district minimum wage.
Monitoring the Level of Students' GPAs over Time
ERIC Educational Resources Information Center
Bakir, Saad T.; McNeal, Bob
2010-01-01
A nonparametric (or distribution-free) statistical quality control chart is used to monitor the cumulative grade point averages (GPAs) of students over time. The chart is designed to detect any statistically significant positive or negative shifts in student GPAs from a desired target level. This nonparametric control chart is based on the…
ERIC Educational Resources Information Center
Kogar, Hakan
2018-01-01
The aim of the present research study was to compare the findings from the nonparametric MSA, DIMTEST and DETECT and the parametric dimensionality determining methods in various simulation conditions by utilizing exploratory and confirmatory methods. For this purpose, various simulation conditions were established based on number of dimensions,…
Three Classes of Nonparametric Differential Step Functioning Effect Estimators
ERIC Educational Resources Information Center
Penfield, Randall D.
2008-01-01
The examination of measurement invariance in polytomous items is complicated by the possibility that the magnitude and sign of lack of invariance may vary across the steps underlying the set of polytomous response options, a concept referred to as differential step functioning (DSF). This article describes three classes of nonparametric DSF effect…
Does Private Tutoring Work? The Effectiveness of Private Tutoring: A Nonparametric Bounds Analysis
ERIC Educational Resources Information Center
Hof, Stefanie
2014-01-01
Private tutoring has become popular throughout the world. However, evidence for the effect of private tutoring on students' academic outcome is inconclusive; therefore, this paper presents an alternative framework: a nonparametric bounds method. The present examination uses, for the first time, a large representative data-set in a European setting…
Estimation of Spatial Dynamic Nonparametric Durbin Models with Fixed Effects
ERIC Educational Resources Information Center
Qian, Minghui; Hu, Ridong; Chen, Jianwei
2016-01-01
Spatial panel data models have been widely studied and applied in both scientific and social science disciplines, especially in the analysis of spatial influence. In this paper, we consider the spatial dynamic nonparametric Durbin model (SDNDM) with fixed effects, which takes nonlinear factors into account based on the spatial dynamic panel…
The relationship of the concentration of air pollutants to wind direction has been determined by nonparametric regression using a Gaussian kernel. The results are smooth curves with error bars that allow for the accurate determination of the wind direction where the concentrat...
Nonparametric Trajectory Analysis (NTA), a receptor-oriented model, was used to assess the impact of local sources of air pollution at monitoring sites located adjacent to highway I-15 in Las Vegas, NV. Measurements of black carbon, carbon monoxide, nitrogen oxides, and sulfur di...
NASA Astrophysics Data System (ADS)
Wibowo, Wahyu; Wene, Chatrien; Budiantara, I. Nyoman; Permatasari, Erma Oktania
2017-03-01
Multiresponse semiparametric regression is a simultaneous-equation regression model that fuses parametric and nonparametric approaches. The model comprises several equations, each with two components, one parametric and one nonparametric. The model used here has a linear function as the parametric component and a truncated polynomial spline as the nonparametric component, and it can handle both linear and nonlinear relationships between the responses and the sets of predictor variables. The aim of this paper is to demonstrate the application of the model to the effect of regional socio-economic factors on the use of information technology. More specifically, the response variables are the percentage of households with internet access and the percentage of households with a personal computer, and the predictor variables are the percentage of literate people, the percentage of electrification, and the percentage of economic growth. Based on identification of the relationships between the responses and the predictors, economic growth is treated as the nonparametric predictor and the others as parametric predictors. The result shows that multiresponse semiparametric regression applies well here, as indicated by the high coefficient of determination of 90 percent.
Siciliani, Luigi
2006-01-01
Policy makers are increasingly interested in developing performance indicators that measure hospital efficiency. These indicators may give the purchasers of health services an additional regulatory tool to contain health expenditure. Using panel data, this study compares different parametric (econometric) and non-parametric (linear programming) techniques for the measurement of a hospital's technical efficiency. This comparison was made using a sample of 17 Italian hospitals in the years 1996-9. Highest correlations are found in the efficiency scores between the non-parametric data envelopment analysis under the constant returns to scale assumption (DEA-CRS) and several parametric models. Correlation reduces markedly when using more flexible non-parametric specifications such as data envelopment analysis under the variable returns to scale assumption (DEA-VRS) and the free disposal hull (FDH) model. Correlation also generally reduces when moving from one output to two-output specifications. This analysis suggests that there is scope for developing performance indicators at hospital level using panel data, but it is important that extensive sensitivity analysis is carried out if purchasers wish to make use of these indicators in practice.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guo, Yanrong; Shao, Yeqin; Gao, Yaozong
Purpose: Automatic prostate segmentation from MR images is an important task in various clinical applications such as prostate cancer staging and MR-guided radiotherapy planning. However, the large appearance and shape variations of the prostate in MR images make the segmentation problem difficult to solve. Traditional Active Shape/Appearance Model (ASM/AAM) has limited accuracy on this problem, since its basic assumption, i.e., both shape and appearance of the targeted organ follow Gaussian distributions, is invalid in prostate MR images. To this end, the authors propose a sparse dictionary learning method to model the image appearance in a nonparametric fashion and further integrate the appearance model into a deformable segmentation framework for prostate MR segmentation. Methods: To drive the deformable model for prostate segmentation, the authors propose nonparametric appearance and shape models. The nonparametric appearance model is based on a novel dictionary learning method, namely distributed discriminative dictionary (DDD) learning, which is able to capture fine distinctions in image appearance. To increase the differential power of traditional dictionary-based classification methods, the authors' DDD learning approach takes three strategies. First, two dictionaries for prostate and nonprostate tissues are built, respectively, using the discriminative features obtained from minimum redundancy maximum relevance feature selection. Second, linear discriminant analysis is employed as a linear classifier to boost the optimal separation between prostate and nonprostate tissues, based on the representation residuals from sparse representation. Third, to enhance the robustness of the authors' classification method, multiple local dictionaries are learned for local regions along the prostate boundary (each with small appearance variations), instead of learning one global classifier for the entire prostate.
These discriminative dictionaries are located on different patches of the prostate surface and trained to adaptively capture the appearance in different prostate zones, thus achieving better local tissue differentiation. For each local region, multiple classifiers are trained based on the randomly selected samples and finally assembled by a specific fusion method. In addition to this nonparametric appearance model, a prostate shape model is learned from the shape statistics using a novel approach, sparse shape composition, which can model non-Gaussian distributions of shape variation and regularize the 3D mesh deformation by constraining it within the observed shape subspace. Results: The proposed method has been evaluated on two datasets consisting of T2-weighted MR prostate images. For the first (internal) dataset, the classification effectiveness of the authors' improved dictionary learning has been validated by comparing it with three other variants of traditional dictionary learning methods. The experimental results show that the authors' method yields a Dice Ratio of 89.1% compared to the manual segmentation, which is more accurate than the three state-of-the-art MR prostate segmentation methods under comparison. For the second dataset, the MICCAI 2012 challenge dataset, the authors' proposed method yields a Dice Ratio of 87.4%, which also achieves better segmentation accuracy than other methods under comparison. Conclusions: A new magnetic resonance image prostate segmentation method is proposed based on the combination of deformable model and dictionary learning methods, which achieves more accurate segmentation performance on prostate T2 MR images.
Kharroubi, Samer A; Brazier, John E; McGhee, Sarah
2013-01-01
This article reports the findings from applying a recently described approach to modeling health state valuation data and the impact of respondent characteristics on health state valuations. The approach applies a nonparametric model to estimate a Bayesian six-dimensional health state short form (derived from the short-form 36 health survey) health state valuation algorithm. A sample of 197 states defined by the six-dimensional health state short form was valued by a representative sample of the Hong Kong general population using standard gamble. The article reports the application of the nonparametric model and compares it to the original model estimated using a conventional parametric random effects model. The two models are compared theoretically and in terms of empirical performance. Advantages of the nonparametric model are that it can be used to predict scores in populations with different distributions of characteristics than observed in the survey sample and that it allows the impact of respondent characteristics to vary by health state (while ensuring that full health passes through unity). The results suggest an important age effect, with sex having some effect, but the remaining covariates having no discernible effect. The nonparametric Bayesian model is argued to be more theoretically appropriate than previously used parametric models and more flexible in taking the impact of covariates into account. Copyright © 2013, International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.
Privacy-preserving Kruskal-Wallis test.
Guo, Suxin; Zhong, Sheng; Zhang, Aidong
2013-10-01
Statistical tests are powerful tools for data analysis. The Kruskal-Wallis test is a non-parametric statistical test that evaluates whether two or more samples are drawn from the same distribution. It is commonly used in various areas. But sometimes, the use of the method is impeded by privacy issues raised in fields such as biomedical research and clinical data analysis because of the confidential information contained in the data. In this work, we give a privacy-preserving solution for the Kruskal-Wallis test which enables two or more parties to jointly perform the test on the union of their data without compromising their data privacy. To the best of our knowledge, this is the first work that solves the privacy issues in the use of the Kruskal-Wallis test on distributed data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
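As a point of reference for the private protocol, the plain (non-private) Kruskal-Wallis H statistic is simple to compute. A minimal pure-Python sketch with hypothetical sample data (no tie correction):

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (assumes no tied observations)."""
    data = sorted(x for g in groups for x in g)
    rank = {x: i + 1 for i, x in enumerate(data)}  # 1-based joint ranks
    n = len(data)
    s = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12.0 * s / (n * (n + 1)) - 3.0 * (n + 1)

# hypothetical measurements from three independent groups
H = kruskal_wallis_h([27, 31, 35, 40], [22, 24, 28, 30], [14, 18, 20, 25])
# under H0, H is referred to a chi-square distribution with k - 1 = 2 df
```

A privacy-preserving version would compute the same quantity over distributed data without revealing the individual observations.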
Distributional Tests for Gravitational Waves from Core-Collapse Supernovae
NASA Astrophysics Data System (ADS)
Szczepanczyk, Marek; LIGO Collaboration
2017-01-01
Core-collapse supernovae (CCSN) are spectacular and violent deaths of massive stars, and are among the most interesting candidates for producing gravitational-wave (GW) transients. Currently published results focus on methodologies to detect single unmodelled GW transients. The advantage of the distributional tests considered here is that they do not require a background for which we have an analytical model. Examples of non-parametric tests that will be compared are Kolmogorov-Smirnov, Mann-Whitney, chi-squared, and asymmetric chi-squared. I will present methodological results using publicly released LIGO S6 data recolored to the design sensitivity of Advanced LIGO and time-lagged between interferometer sites so that the resulting coincident events are not GWs.
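A minimal sketch of the two-sample Kolmogorov-Smirnov statistic named above, computed as the largest gap between the two empirical CDFs (illustrative data only):

```python
def ks_two_sample(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    xs, ys = sorted(x), sorted(y)
    nx, ny = len(xs), len(ys)
    d = 0.0
    for v in sorted(set(xs + ys)):
        fx = sum(1 for t in xs if t <= v) / nx
        fy = sum(1 for t in ys if t <= v) / ny
        d = max(d, abs(fx - fy))
    return d

# two hypothetical samples of detection-statistic values
d_stat = ks_two_sample([1, 2, 3, 4], [3, 4, 5, 6])  # -> 0.5
```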
Ketcham, Jonathan D; Kuminoff, Nicolai V; Powers, Christopher A
2016-12-01
Consumers' enrollment decisions in Medicare Part D can be explained by Abaluck and Gruber’s (2011) model of utility maximization with psychological biases or by a neoclassical version of their model that precludes such biases. We evaluate these competing hypotheses by applying nonparametric tests of utility maximization and model validation tests to administrative data. We find that 79 percent of enrollment decisions from 2006 to 2010 satisfied basic axioms of consumer theory under the assumption of full information. The validation tests provide evidence against widespread psychological biases. In particular, we find that precluding psychological biases improves the structural model's out-of-sample predictions for consumer behavior.
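The nonparametric tests of utility maximization referred to here are revealed-preference checks. A minimal sketch of a GARP (Generalized Axiom of Revealed Preference) test over hypothetical price/bundle data, not the authors' exact procedure:

```python
def satisfies_garp(prices, bundles):
    """Check the Generalized Axiom of Revealed Preference.

    prices[t] and bundles[t] are the price and quantity vectors of
    choice t. R[t][s] holds when t is directly revealed preferred to
    s (bundle s was affordable when t was chosen). GARP fails if t is
    transitively revealed preferred to s while bundle t was strictly
    cheaper than s's expenditure at s's prices."""
    n = len(bundles)
    dot = lambda p, x: sum(a * b for a, b in zip(p, x))
    R = [[dot(prices[t], bundles[t]) >= dot(prices[t], bundles[s])
          for s in range(n)] for t in range(n)]
    for k in range(n):                      # Floyd-Warshall closure
        for t in range(n):
            for s in range(n):
                R[t][s] = R[t][s] or (R[t][k] and R[k][s])
    for t in range(n):
        for s in range(n):
            if R[t][s] and dot(prices[s], bundles[s]) > dot(prices[s], bundles[t]):
                return False
    return True

# consistent choices: each bundle is unaffordable at the other's prices
ok = satisfies_garp([[1, 2], [2, 1]], [[4, 1], [1, 4]])        # True
# a revealed-preference cycle: violates GARP
bad = satisfies_garp([[1, 2], [1, 3]], [[3, 1], [1, 2]])       # False
```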
Fitting Item Response Theory Models to Two Personality Inventories: Issues and Insights.
Chernyshenko, O S; Stark, S; Chan, K Y; Drasgow, F; Williams, B
2001-10-01
The present study compared the fit of several IRT models to two personality assessment instruments. Data from 13,059 individuals responding to the US-English version of the Fifth Edition of the Sixteen Personality Factor Questionnaire (16PF) and 1,770 individuals responding to Goldberg's 50-item Big Five personality measure were analyzed. Various issues pertaining to the fit of the IRT models to personality data were considered. We examined two of the most popular parametric models designed for dichotomously scored items (i.e., the two- and three-parameter logistic models) and a parametric model for polytomous items (Samejima's graded response model). Also examined were Levine's nonparametric maximum likelihood formula scoring models for dichotomous and polytomous data, which were previously found to provide good fits to several cognitive ability tests (Drasgow, Levine, Tsien, Williams, & Mead, 1995). The two- and three-parameter logistic models fit some scales reasonably well but not others; the graded response model generally did not fit well. The nonparametric formula scoring models provided the best fit of the models considered. Several implications of these findings for personality measurement and personnel selection are described.
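For reference, the parametric models compared above have closed-form response probabilities. A minimal sketch of the three-parameter logistic (3PL) model (the 2PL when c = 0) and of Samejima's graded response model, with hypothetical item parameters:

```python
import math

def irt_prob(theta, a, b, c=0.0):
    """3PL probability of a keyed response: guessing floor c plus a
    logistic curve with discrimination a and difficulty b."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def grm_category_probs(theta, a, bs):
    """Samejima's graded response model: bs are ordered thresholds;
    returns probabilities of the len(bs) + 1 ordered categories as
    differences of cumulative logistic curves."""
    star = ([1.0]
            + [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in bs]
            + [0.0])
    return [star[k] - star[k + 1] for k in range(len(bs) + 1)]

p_correct = irt_prob(0.0, 1.2, 0.0, c=0.2)          # hypothetical item
cat_probs = grm_category_probs(0.0, 1.5, [-1.0, 0.0, 1.0])
```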
Estimation of option-implied risk-neutral into real-world density by using calibration function
NASA Astrophysics Data System (ADS)
Bahaludin, Hafizah; Abdullah, Mimi Hafizah
2017-04-01
Option prices contain crucial information that can be used as a reflection of the future development of an underlying asset's price. The main objective of this study is to extract the risk-neutral density (RND) and the real-world density (RWD) from option prices. A volatility function technique using fourth-order polynomial interpolation is applied to obtain the RNDs. A calibration function is then used to convert the RNDs into RWDs; the calibration function can be either parametric or non-parametric. The densities are extracted from Dow Jones Industrial Average (DJIA) index options with a one-month constant maturity from January 2009 until December 2015. The performance of the extracted RNDs and RWDs is evaluated using a density forecasting test. This study found that the RWDs obtained provide more accurate information about the future price of the underlying asset than the RNDs. In addition, empirical evidence suggests that RWDs from a non-parametric calibration are more accurate than the other densities.
A robust nonparametric framework for reconstruction of stochastic differential equation models
NASA Astrophysics Data System (ADS)
Rajabzadeh, Yalda; Rezaie, Amir Hossein; Amindavar, Hamidreza
2016-05-01
In this paper, we employ a nonparametric framework to robustly estimate the functional forms of drift and diffusion terms from discrete stationary time series. The proposed method significantly improves the accuracy of the parameter estimation. In this framework, drift and diffusion coefficients are modeled through orthogonal Legendre polynomials. We employ the least squares regression approach along with the Euler-Maruyama approximation method to learn the coefficients of the stochastic model. Next, a numerical discrete construction of the mean squared prediction error (MSPE) is established to select the order of the Legendre polynomials in the drift and diffusion terms. We show numerically that the new method is robust against variation in sample size and sampling rate. The performance of our method in comparison with the kernel-based regression (KBR) method is demonstrated through simulation and real data. For the real datasets, we test our method on discriminating healthy electroencephalogram (EEG) signals from epileptic ones, and we also demonstrate the efficiency of the method through prediction on financial data. In both simulation and real data, our algorithm outperforms the KBR method.
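A minimal sketch of the Euler-Maruyama least-squares idea described above, using a simulated Ornstein-Uhlenbeck process and a degree-1 basis (a stand-in for the low-order Legendre terms; the authors' MSPE-based order selection is not sketched):

```python
import random

random.seed(1)

# simulate an Ornstein-Uhlenbeck process dx = -theta*x dt + sigma dW
theta_true, sigma_true, dt, n = 1.5, 0.5, 0.01, 20000
x = [0.0]
for _ in range(n):
    x.append(x[-1] - theta_true * x[-1] * dt
             + sigma_true * dt ** 0.5 * random.gauss(0.0, 1.0))

# Euler-Maruyama least squares: regress increments/dt on the basis
# {1, x} (degree-0/1 Legendre polynomials up to rescaling)
xs = x[:-1]
y = [(x[i + 1] - x[i]) / dt for i in range(n)]
s11, s1x = float(n), sum(xs)
sxx = sum(v * v for v in xs)
s1y = sum(y)
sxy = sum(v * w for v, w in zip(xs, y))
det = s11 * sxx - s1x * s1x
c0 = (sxx * s1y - s1x * sxy) / det   # drift intercept, near 0
c1 = (s11 * sxy - s1x * s1y) / det   # drift slope, estimates -theta_true

# quadratic-variation estimate of the (constant) diffusion coefficient
sigma2_hat = sum((x[i + 1] - x[i]) ** 2 for i in range(n)) / (n * dt)
```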
O'Sullivan, Finbarr; Muzi, Mark; Spence, Alexander M; Mankoff, David M; O'Sullivan, Janet N; Fitzgerald, Niall; Newman, George C; Krohn, Kenneth A
2009-06-01
Kinetic analysis is used to extract metabolic information from dynamic positron emission tomography (PET) uptake data. The theory of indicator dilutions, developed in the seminal work of Meier and Zierler (1954), provides a probabilistic framework for representation of PET tracer uptake data in terms of a convolution between an arterial input function and a tissue residue. The residue is a scaled survival function associated with tracer residence in the tissue. Nonparametric inference for the residue, a deconvolution problem, provides a novel approach to kinetic analysis, critically one that is not reliant on specific compartmental modeling assumptions. A practical computational technique based on regularized cubic B-spline approximation of the residence time distribution is proposed. Nonparametric residue analysis allows formal statistical evaluation of specific parametric models to be considered. This analysis needs to properly account for the increased flexibility of the nonparametric estimator. The methodology is illustrated using data from a series of cerebral studies with PET and fluorodeoxyglucose (FDG) in normal subjects. Comparisons are made between key functionals of the residue, such as tracer flux and flow, resulting from a parametric (the standard two-compartment model of Phelps et al. 1979) and a nonparametric analysis. Strong statistical evidence against the compartment model is found. Primarily these differences relate to the representation of the early temporal structure of the tracer residence, largely a function of the vascular supply network. There are convincing physiological arguments against the representations implied by the compartmental approach, but this is the first time that a rigorous statistical confirmation using PET data has been reported. The compartmental analysis produces suspect values for flow but, notably, the impact on the metabolic flux, though statistically significant, is limited to deviations on the order of 3%-4%.
The general advantage of the nonparametric residue analysis is the ability to provide a valid kinetic quantitation in the context of studies where there may be heterogeneity or other uncertainty about the accuracy of a compartmental model approximation of the tissue residue.
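The convolution representation underlying this analysis can be sketched directly. The arterial input curve and one-compartment residue below are hypothetical, and the regularized B-spline deconvolution itself is beyond this sketch:

```python
import math

def tissue_curve(cp, residue, dt):
    """Discrete convolution C(t) = dt * sum_{s<=t} Cp(s) R(t - s):
    the tissue time-activity curve as the arterial input convolved
    with a residue (scaled survival) function."""
    return [dt * sum(cp[s] * residue[t - s] for s in range(t + 1))
            for t in range(len(cp))]

dt = 1.0
cp = [0.0, 5.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.3, 0.2, 0.1]  # hypothetical input
K1, k2 = 0.3, 0.2   # one-compartment residue R(t) = K1 * exp(-k2 t)
residue = [K1 * math.exp(-k2 * t) for t in range(len(cp))]
tac = tissue_curve(cp, residue, dt)
```

A nonparametric analysis estimates the residue samples directly from the measured curve rather than assuming the exponential form used here.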
McClay, Joseph L; Vunck, Sarah A; Batman, Angela M; Crowley, James J; Vann, Robert E; Beardsley, Patrick M; van den Oord, Edwin J
2015-09-01
Haloperidol is an effective antipsychotic drug for treatment of schizophrenia, but prolonged use can lead to debilitating side effects. To better understand the effects of long-term administration, we measured global metabolic changes in mouse brain following 3 mg/kg/day haloperidol for 28 days. These conditions lead to movement-related side effects in mice akin to those observed in patients after prolonged use. Brain tissue was collected following microwave tissue fixation to arrest metabolism and extracted metabolites were assessed using both liquid and gas chromatography mass spectrometry (MS). Over 300 unique compounds were identified across MS platforms. Haloperidol was found to be present in all test samples and not in controls, indicating experimental validity. Twenty-one compounds differed significantly between test and control groups at the p < 0.05 level. Top compounds were robust to analytical method, also being identified via partial least squares discriminant analysis. Four compounds (sphinganine, N-acetylornithine, leucine and adenosine diphosphate) survived correction for multiple testing in a non-parametric analysis using false discovery rate threshold < 0.1. Pathway analysis of nominally significant compounds (p < 0.05) revealed significant findings for sphingolipid metabolism (p = 0.015) and protein biosynthesis (p = 0.024). Altered sphingolipid metabolism is suggestive of disruptions to myelin. This interpretation is supported by our observation of elevated N-acetyl-aspartyl-glutamate in the haloperidol-treated mice (p = 0.004), a marker previously associated with demyelination. This study further demonstrates the utility of murine neurochemical metabolomics as a method to advance understanding of CNS drug effects.
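The false-discovery-rate correction mentioned above is commonly the Benjamini-Hochberg step-up procedure. A minimal sketch (the p values in the example are hypothetical, not the study's):

```python
def bh_fdr(pvals, q=0.1):
    """Benjamini-Hochberg step-up: indices of hypotheses rejected
    while controlling the false discovery rate at level q."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m, k = len(pvals), 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank            # largest rank passing its threshold
    return sorted(order[:k])

# hypothetical per-compound p values from a metabolite screen
rejected = bh_fdr([0.01, 0.02, 0.03, 0.5], q=0.1)  # -> [0, 1, 2]
```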
McClay, Joseph L.; Vunck, Sarah A.; Batman, Angela M.; Crowley, James J.; Vann, Robert E.; Beardsley, Patrick M.; van den Oord, Edwin J.
2015-01-01
Haloperidol is an effective antipsychotic drug for treatment of schizophrenia, but prolonged use can lead to debilitating side effects. To better understand the effects of long-term administration, we measured global metabolic changes in mouse brain following 3 mg/kg/day haloperidol for 28 days. These conditions lead to movement-related side effects in mice akin to those observed in patients after prolonged use. Brain tissue was collected following microwave tissue fixation to arrest metabolism and extracted metabolites were assessed using both liquid and gas chromatography mass spectrometry (MS). Over 300 unique compounds were identified across MS platforms. Haloperidol was found to be present in all test samples and not in controls, indicating experimental validity. Twenty-one compounds differed significantly between test and control groups at the p < 0.05 level. Top compounds were robust to analytical method, also being identified via partial least squares discriminant analysis. Four compounds (sphinganine, N-acetylornithine, leucine and adenosine diphosphate) survived correction for multiple testing in a non-parametric analysis using false discovery rate threshold < 0.1. Pathway analysis of nominally significant compounds (p < 0.05) revealed significant findings for sphingolipid metabolism (p = 0.02) and protein biosynthesis (p = 0.03). Altered sphingolipid metabolism is suggestive of disruptions to myelin. This interpretation is supported by our observation of elevated N-acetylaspartylglutamate in the haloperidol-treated mice (p = 0.004), a marker previously associated with demyelination. This study further demonstrates the utility of murine neurochemical metabolomics as a method to advance understanding of CNS drug effects. PMID:25850894
Sarcoidosis in children: HRCT findings and correlation with pulmonary function tests.
Sileo, C; Epaud, R; Mahloul, M; Beydon, N; Elia, D; Clement, A; Le Pointe, H Ducou
2014-12-01
High-resolution computed tomography (HRCT) plays an important role in the diagnosis and staging of pulmonary sarcoidosis, but implies radiation exposure. In this light, we aimed to describe HRCT findings as well as their relationship with pulmonary function tests (PFT) in children with pulmonary sarcoidosis. In a retrospective study, 18 pediatric patients with sarcoidosis, including 12 with pulmonary abnormalities (PA group) and 6 without pulmonary abnormalities (APA group), were followed over a 16-year period. Relationships between HRCT scores and PFT were studied by the non-parametric Spearman test at diagnosis and by restricted maximum likelihood (REML) analysis during follow-up. Forty-three HRCT scans were scored. Twelve patients showed abnormal HRCT findings at diagnosis with multiple nodules or micronodules, while ground-glass opacities were seen in 11 patients. Ten patients exhibited pleural thickening or thickening of the fissure and 6 had interlobular septal thickening at diagnosis. No correlation between HRCT and forced vital capacity (FVC), forced expiratory volume in 1 sec (FEV1), forced expiratory flow during the mid-half of the FVC (FEF(25-75)) or specific dynamical compliance (SpecC(Ldyn)) was found at diagnosis. However, linear mixed models showed that changes in total HRCT scores over time were significantly associated with changes in SpecC(Ldyn), FVC, and FEV1. Radiologic findings in children with pulmonary sarcoidosis were similar to those in adults. HRCT and PFT are both essential investigations at diagnosis; however, the correlation between HRCT pulmonary parenchymal findings and PFT over time suggests the possibility of reducing the number of HRCT scans during follow-up to decrease unnecessary radiation exposure. © 2013 Wiley Periodicals, Inc.
Robust Identification of Local Adaptation from Allele Frequencies
Günther, Torsten; Coop, Graham
2013-01-01
Comparing allele frequencies among populations that differ in environment has long been a tool for detecting loci involved in local adaptation. However, such analyses are complicated by an imperfect knowledge of population allele frequencies and neutral correlations of allele frequencies among populations due to shared population history and gene flow. Here we develop a set of methods to robustly test for unusual allele frequency patterns and correlations between environmental variables and allele frequencies while accounting for these complications based on a Bayesian model previously implemented in the software Bayenv. Using this model, we calculate a set of “standardized allele frequencies” that allows investigators to apply tests of their choice to multiple populations while accounting for sampling and covariance due to population history. We illustrate this first by showing that these standardized frequencies can be used to detect nonparametric correlations with environmental variables; these correlations are also less prone to spurious results due to outlier populations. We then demonstrate how these standardized allele frequencies can be used to construct a test to detect SNPs that deviate strongly from neutral population structure. This test is conceptually related to FST and is shown to be more powerful, as we account for population history. We also extend the model to next-generation sequencing of population pools—a cost-efficient way to estimate population allele frequencies, but one that introduces an additional level of sampling noise. The utility of these methods is demonstrated in simulations and by reanalyzing human SNP data from the Human Genome Diversity Panel populations and pooled next-generation sequencing data from Atlantic herring. An implementation of our method is available from http://gcbias.org. PMID:23821598
Veauthier, Christian
2013-01-01
Background: The Fatigue Severity Scale (FSS) is widely used to assess fatigue, not only in the context of multiple sclerosis-related fatigue, but also in many other medical conditions. Some polysomnographic studies have shown high FSS values in sleep-disordered patients without multiple sclerosis. The Modified Fatigue Impact Scale (MFIS) has increasingly been used in order to assess fatigue, but polysomnographic data investigating sleep-disordered patients are thus far unavailable. Moreover, the pathophysiological link between sleep architecture and fatigue measured with the MFIS and the FSS has not been previously investigated. Methods: This was a retrospective observational study (n = 410) with subgroups classified according to sleep diagnosis. The statistical analysis included nonparametric correlation between questionnaire results and polysomnographic data, age and sex, and univariate and multiple logistic regression. Results: The multiple logistic regression showed a significant relationship between FSS/MFIS values and younger age and female sex. Moreover, there was a significant relationship between FSS values and number of arousals and between MFIS values and number of awakenings. Conclusion: Younger age, female sex, and high number of awakenings and arousals are predictive of fatigue in sleep-disordered patients. Further investigations are needed to find the pathophysiological explanation for these relationships. PMID:24109185
Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.
Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A
2016-01-01
Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
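One simple way to realize the nonparametric-imputation idea described above is to draw imputations from the Kaplan-Meier estimate conditional on exceeding the censoring value. A minimal sketch with hypothetical data; the authors' procedure additionally handles covariates via a semiparametric Cox model and pools the per-imputation logistic regressions by Rubin's rules:

```python
import random

def kaplan_meier(times, events):
    """Kaplan-Meier estimate; events[i] = 1 for an observed value,
    0 for a right-censored one. Returns [(t, S(t))] at event times."""
    pts = sorted({t for t, e in zip(times, events) if e})
    surv, s = [], 1.0
    for t in pts:
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, events) if e and u == t)
        s *= 1.0 - d / at_risk
        surv.append((t, s))
    return surv

def impute_censored(c, surv, rng):
    """Draw one imputation from the KM distribution of X given X > c
    (falls back to c itself when no mass remains beyond c)."""
    prev, masses = 1.0, []
    for t, s in surv:
        if t > c:
            masses.append((t, prev - s))   # KM jump at event time t
        prev = s
    total = sum(m for _, m in masses)
    if total <= 0:
        return c
    u = rng.random() * total
    for t, m in masses:
        u -= m
        if u <= 0:
            return t
    return masses[-1][0]

times = [2, 3, 3, 5, 7, 8]    # hypothetical covariate values
events = [1, 0, 1, 1, 0, 1]   # 0 = censored
surv = kaplan_meier(times, events)
imputations = [impute_censored(3, surv, random.Random(s)) for s in range(5)]
```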
Sun, Xuejun; Zhang, Qianggong; Kang, Shichang; Guo, Junming; Li, Xiaofei; Yu, Zhengliang; Zhang, Guoshuai; Qu, Dongmei; Huang, Jie; Cong, Zhiyuan; Wu, Guangjian
2018-08-01
Glacierized mountain environments can preserve and release mercury (Hg) and play an important role in regional Hg biogeochemical cycling. However, the behavior of Hg in glacierized mountain environments and its environmental risks remain poorly constrained. In this research, glacier meltwater, runoff and wetland water were sampled in the Zhadang-Qugaqie basin (ZQB), a typical glacierized mountain environment in the inland Tibetan Plateau, to investigate Hg distribution and its relevance to environmental risks. The total mercury (THg) concentrations ranged from 0.82 to 6.98 ng·L⁻¹, and non-parametric pairwise multiple comparisons of the THg concentrations among the three different water samples showed that the THg concentrations were comparable. The total methylmercury (TMeHg) concentrations ranged from 0.041 to 0.115 ng·L⁻¹, and non-parametric pairwise multiple comparisons of the TMeHg concentrations showed a significant difference. Both the THg and MeHg concentrations of water samples from the ZQB were comparable to those of other remote areas, indicating that Hg concentrations in the ZQB watershed are equivalent to the global background level. Particulate Hg was the predominant form of Hg in all runoff samples, and was significantly correlated with the total suspended particle (TSP) concentration but not with the dissolved organic carbon (DOC) concentration. The distribution of mercury in the wetland water differed from that of the other water samples: THg exhibited a significant correlation with DOC as well as with TMeHg, whereas neither THg nor TMeHg was associated with TSP. Based on the above findings and the results from previous work, we propose a conceptual model illustrating the four Hg distribution zones in glacierized environments. We highlight that wetlands may enhance the potential hazards of Hg released from melting glaciers, making them a vital zone for investigating the environmental effects of Hg in glacierized environments and beyond.
Copyright © 2018 Elsevier B.V. All rights reserved.
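The nonparametric pairwise multiple comparisons reported above can be sketched as all-pairs Mann-Whitney tests at a Bonferroni-adjusted level (a simplification of procedures such as Dunn's test; the groups below are synthetic, not the THg data):

```python
import itertools
import math
from statistics import NormalDist

def mann_whitney_u(x, y):
    """U statistic of x versus y (ties counted as 1/2)."""
    return sum(1.0 if a > b else (0.5 if a == b else 0.0)
               for a in x for b in y)

def mw_z(x, y):
    """Normal-approximation z-score for U (no tie correction)."""
    nx, ny = len(x), len(y)
    mean = nx * ny / 2.0
    sd = math.sqrt(nx * ny * (nx + ny + 1) / 12.0)
    return (mann_whitney_u(x, y) - mean) / sd

def pairwise_bonferroni(groups, alpha=0.05):
    """All pairwise two-sided comparisons at a Bonferroni-adjusted
    significance level; returns {(i, j): (z, significant)}."""
    pairs = list(itertools.combinations(range(len(groups)), 2))
    crit = NormalDist().inv_cdf(1.0 - alpha / (2 * len(pairs)))
    out = {}
    for i, j in pairs:
        z = mw_z(groups[i], groups[j])
        out[(i, j)] = (z, abs(z) > crit)
    return out

# three clearly separated synthetic groups
res = pairwise_bonferroni([list(range(10)),
                           list(range(20, 30)),
                           list(range(40, 50))])
```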
Detecting abandoned objects using interacting multiple models
NASA Astrophysics Data System (ADS)
Becker, Stefan; Münch, David; Kieritz, Hilke; Hübner, Wolfgang; Arens, Michael
2015-10-01
In recent years, the wide use of video surveillance systems has caused an enormous increase in the amount of data that has to be stored, monitored, and processed. As a consequence, it is crucial to support human operators with automated surveillance applications. Towards this end an intelligent video analysis module for real-time alerting in case of abandoned objects in public spaces is proposed. The overall processing pipeline consists of two major parts. First, person motion is modeled using an Interacting Multiple Model (IMM) filter. The IMM filter estimates the state of a person according to a finite-state, discrete-time Markov chain. Second, the location of persons that stay at a fixed position defines a region of interest, in which a nonparametric background model with dynamic per-pixel state variables identifies abandoned objects. In case of a detected abandoned object, an alarm event is triggered. The effectiveness of the proposed system is evaluated on the PETS 2006 dataset and the i-Lids dataset, both reflecting prototypical surveillance scenarios.
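A minimal sketch of a per-pixel, sample-based nonparametric background model of the kind described (in the spirit of ViBe-style models; parameter values are illustrative, and the IMM person tracker is not sketched here):

```python
import random

class PixelModel:
    """Sample-based nonparametric background model for one pixel:
    keep n past values and call a new value background when it lies
    within `radius` of at least `min_matches` stored samples."""

    def __init__(self, init_value, n=20, radius=10, min_matches=2, seed=0):
        self.samples = [init_value] * n
        self.radius = radius
        self.min_matches = min_matches
        self.rng = random.Random(seed)

    def is_background(self, value):
        matches = sum(1 for s in self.samples
                      if abs(s - value) <= self.radius)
        return matches >= self.min_matches

    def update(self, value):
        # stochastic replacement: a persistent new value is absorbed
        # only slowly, so an abandoned object stays foreground long
        # enough to trigger an alarm
        self.samples[self.rng.randrange(len(self.samples))] = value

pm = PixelModel(100)              # pixel intensity history around 100
assert pm.is_background(103)      # small fluctuation: background
assert not pm.is_background(160)  # e.g. an object dropped into view
```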
Zou, Kelly H; Resnic, Frederic S; Talos, Ion-Florin; Goldberg-Zimring, Daniel; Bhagwat, Jui G; Haker, Steven J; Kikinis, Ron; Jolesz, Ferenc A; Ohno-Machado, Lucila
2005-10-01
Medical classification accuracy studies often yield continuous data based on predictive models for treatment outcomes. A popular method for evaluating the performance of diagnostic tests is receiver operating characteristic (ROC) curve analysis. The main objective was to develop a global statistical hypothesis test for assessing the goodness-of-fit (GOF) of parametric ROC curves via the bootstrap. A simple log (or logit) transformation and a more flexible Box-Cox normality transformation were applied to data from two clinical studies: one predicting complications following percutaneous coronary interventions (PCIs) and one predicting image-guided neurosurgical resection results from tumor volume. We compared a non-parametric with a parametric binormal estimate of the underlying ROC curve. To construct the GOF test, we used the non-parametric and parametric areas under the curve (AUCs) as the metrics, with a resulting p value reported. In the interventional cardiology example, logit and Box-Cox transformations of the predictive probabilities led to satisfactory AUCs (AUC=0.888; p=0.78, and AUC=0.888; p=0.73, respectively), while in the brain tumor resection example, log and Box-Cox transformations of the tumor size also led to satisfactory AUCs (AUC=0.898; p=0.61, and AUC=0.899; p=0.42, respectively). In contrast, significant departures from GOF were observed when no transformation was applied before assuming a binormal model (AUC=0.766; p=0.004, and AUC=0.831; p=0.03, respectively). In both studies the p values suggested that transformations should be considered before applying any binormal model to estimate the AUC. Our analyses also demonstrated and confirmed the predictive values of different classifiers for determining interventional complications following PCIs and resection outcomes in image-guided neurosurgery.
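A minimal sketch of the two AUC estimates being compared, plus a bootstrap of their difference to probe goodness-of-fit (synthetic scores; the paper's exact GOF construction may differ):

```python
import math
import random

def auc_nonparametric(neg, pos):
    """Mann-Whitney estimate of the area under the ROC curve."""
    total = sum(1.0 if p > n else (0.5 if p == n else 0.0)
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))

def auc_binormal(neg, pos):
    """Binormal AUC = Phi((m1 - m0) / sqrt(v0 + v1)), with class
    means and variances fitted by sample moments."""
    def mv(v):
        m = sum(v) / len(v)
        return m, sum((t - m) ** 2 for t in v) / (len(v) - 1)
    m0, v0 = mv(neg)
    m1, v1 = mv(pos)
    z = (m1 - m0) / math.sqrt(v0 + v1)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def bootstrap_gof(neg, pos, b=200, seed=0):
    """Bootstrap distribution of the parametric-minus-nonparametric
    AUC difference; values far from 0 flag lack of fit."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(b):
        nb = [rng.choice(neg) for _ in neg]
        pb = [rng.choice(pos) for _ in pos]
        diffs.append(auc_binormal(nb, pb) - auc_nonparametric(nb, pb))
    return diffs

rng = random.Random(1)
neg = [rng.gauss(0.0, 1.0) for _ in range(50)]  # synthetic class scores
pos = [rng.gauss(1.0, 1.0) for _ in range(50)]
a_np = auc_nonparametric(neg, pos)
a_bn = auc_binormal(neg, pos)
diffs = bootstrap_gof(neg, pos, b=50)
```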
An Initial Approach for Learning Objects from Experience
2018-05-02
The US Army Research Laboratory's Vehicle Technology Directorate (VTD) and the Human Research and Engineering Directorate, as part of VTD's 6.1 refresh...experience in an open-set framework. We have shown preliminary results from initial tests using a motion detection algorithm to delineate objects which are... performance and our topology can be defined using our nonparametric model. ARL will continue research to determine the best algorithms to use in the pipeline for the APPLE program.
NASA Astrophysics Data System (ADS)
Brown, M. G. L.; He, T.; Liang, S.
2016-12-01
Satellite-derived estimates of incident photosynthetically active radiation (PAR) can be used to monitor global change, are required by most terrestrial ecosystem models, and can be used to estimate primary production according to the theory of light use efficiency. Compared with parametric approaches, non-parametric techniques that include an artificial neural network (ANN), support vector machine regression (SVM), an artificial bee colony (ABC), and a look-up table (LUT) do not require many ancillary data as inputs for the estimation of PAR from satellite data. In this study, a selection of machine learning methods to estimate PAR from MODIS top of atmosphere (TOA) radiances are compared to a LUT approach to determine which techniques might best handle the nonlinear relationship between TOA radiance and incident PAR. Evaluation of these methods (ANN, SVM, and LUT) is performed with ground measurements at seven SURFRAD sites. Due to the design of the ANN, it can handle the nonlinear relationship between TOA radiance and PAR better than linearly interpolating between the values in the LUT; however, training the ANN has to be carried out on an angular-bin basis, which results in a LUT of ANNs. The SVM model may be better for incorporating multiple viewing angles than the ANN; however, both techniques require a large amount of training data, which may introduce a regional bias based on where the most training and validation data are available. Based on the literature, the ABC is a promising alternative to an ANN, SVM regression and a LUT, but further development for this application is required before concrete conclusions can be drawn. For now, the LUT method outperforms the machine-learning techniques, but future work should be directed at developing and testing the ABC method. 
A simple, robust method to estimate direct and diffuse incident PAR, with minimal inputs and a priori knowledge, would be very useful for monitoring global change of primary production, particularly of pastures and rangeland, which have implications for livestock and food security. Future work will delve deeper into the utility of satellite-derived PAR estimation for monitoring primary production in pasture and rangelands.
ERIC Educational Resources Information Center
Mittag, Kathleen Cage
Most researchers using factor analysis extract factors from a matrix of Pearson product-moment correlation coefficients. A method is presented for extracting factors in a non-parametric way, by extracting factors from a matrix of Spearman rho (rank correlation) coefficients. It is possible to factor analyze a matrix of association such that…
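A minimal sketch of the idea: build a Spearman rho matrix and extract the leading factor, here by power iteration as a stand-in for full factor extraction (data are hypothetical and tie-free):

```python
import math

def simple_ranks(v):
    """1-based ranks; assumes no ties for brevity."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(x, y):
    """Spearman rho for tie-free data via centered rank products."""
    rx, ry = simple_ranks(x), simple_ranks(y)
    m = (len(x) + 1) / 2.0
    num = sum((a - m) * (b - m) for a, b in zip(rx, ry))
    den = sum((a - m) ** 2 for a in rx)  # same for any tie-free ranking
    return num / den

def leading_factor(R, iters=500):
    """Power iteration for the first principal axis of a matrix."""
    v = [1.0] * len(R)
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(len(R)))
             for i in range(len(R))]
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    return v

# three observed variables: the first two monotonically related,
# the third negatively related to both (hypothetical scores)
data = [[1, 2, 3, 4, 5],
        [1, 4, 9, 16, 25],
        [5, 3, 4, 1, 2]]
R = [[spearman(a, b) for b in data] for a in data]
axis = leading_factor(R)
```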
Joint Entropy Minimization for Learning in Nonparametric Framework
2006-06-09
Nonparametric estimation of plant density by the distance method
Patil, S.A.; Burnham, K.P.; Kovner, J.L.
1979-01-01
A relation between the plant density and the probability density function of the nearest neighbor distance (squared) from a random point is established under fairly broad conditions. Based upon this relationship, a nonparametric estimator for the plant density is developed and presented in terms of order statistics. Consistency and asymptotic normality of the estimator are discussed. An interval estimator for the density is obtained. The modifications of this estimator and its variance are given when the distribution is truncated. Simulation results are presented for regular, random and aggregated populations to illustrate the nonparametric estimator and its variance. A numerical example from field data is given. Merits and deficiencies of the estimator are discussed with regard to its robustness and variance.
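For orientation, a classical point-to-nearest-plant estimator under complete spatial randomness is easy to simulate. The sketch below uses Pollard's Poisson estimator, lambda = (m - 1) / (pi * sum R_i^2), as a simple benchmark rather than the authors' order-statistic estimator:

```python
import math
import random

rng = random.Random(42)

# a Poisson-like forest of known density on the unit square
n_plants = 300
plants = [(rng.random(), rng.random()) for _ in range(n_plants)]

def nearest_sq(p, pts):
    """Squared distance from point p to its nearest plant."""
    return min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 for q in pts)

m = 200  # random sample points
r2 = [nearest_sq((rng.random(), rng.random()), plants) for _ in range(m)]
lam_hat = (m - 1) / (math.pi * sum(r2))  # estimates plants per unit area
```

With 300 plants on the unit square, lam_hat should land near 300, up to sampling noise and edge effects.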
Gut microbiota in early pediatric multiple sclerosis: a case-control study.
Tremlett, Helen; Fadrosh, Douglas W; Faruqi, Ali A; Zhu, Feng; Hart, Janace; Roalstad, Shelly; Graves, Jennifer; Lynch, Susan; Waubant, Emmanuelle
2016-08-01
Alterations in the gut microbial community composition may be influential in neurological disease. Microbial community profiles were compared between children with early onset pediatric multiple sclerosis (MS) and control children similar in age and sex. Children ≤18 years old within 2 years of MS onset, or controls without autoimmune disorders, attending a University of California, San Francisco, USA, pediatric clinic were examined for fecal bacterial community composition and predicted function by 16S ribosomal RNA sequencing and Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt) analysis. Associations between subject characteristics and the microbiota, including beta diversity and taxa abundance, were identified using non-parametric tests, permutational multivariate analysis of variance and negative binomial regression. Eighteen relapsing-remitting MS cases and 17 controls (mean age 13 years; range 4-18) were studied. Cases had a short disease duration (mean 11 months; range 2-24) and half were immunomodulatory drug (IMD) naïve. Whilst overall gut bacterial beta diversity was not significantly related to MS status, IMD exposure was (Canberra, P < 0.02). However, relative to controls, MS cases had a significant enrichment in relative abundance for members of the Desulfovibrionaceae (Bilophila, Desulfovibrio and Christensenellaceae) and depletion in Lachnospiraceae and Ruminococcaceae (all P and q < 0.000005). Microbial genes predicted as enriched in MS versus controls included those involved in glutathione metabolism (Mann-Whitney, P = 0.017), findings that were consistent regardless of IMD exposure. In recent onset pediatric MS, perturbations in the gut microbiome composition were observed, in parallel with predicted enrichment of metabolic pathways associated with neurodegeneration. Findings were suggestive of a pro-inflammatory milieu. © 2016 EAN.
Stroup, Caleb N.; Welhan, John A.; Davis, Linda C.
2008-01-01
The statistical stationarity of distributions of sedimentary interbed thicknesses within the southwestern part of the Idaho National Laboratory (INL) was evaluated within the stratigraphic framework of Quaternary sediments and basalts at the INL site, eastern Snake River Plain, Idaho. The thicknesses of 122 sedimentary interbeds observed in 11 coreholes were documented from lithologic logs and independently inferred from natural-gamma logs. Lithologic information was grouped into composite time-stratigraphic units based on correlations with existing composite-unit stratigraphy near these holes. The assignment of lithologic units to an existing chronostratigraphy on the basis of nearby composite stratigraphic units may introduce error where correlations with nearby holes are ambiguous or the distance between holes is great, but we consider this the best technique for grouping stratigraphic information in this geologic environment at this time. Nonparametric tests of similarity were used to evaluate temporal and spatial stationarity in the distributions of sediment thickness. The following statistical tests were applied to the data: (1) the Kolmogorov-Smirnov (K-S) two-sample test to compare distribution shape, (2) the Mann-Whitney (M-W) test for similarity of two medians, (3) the Kruskal-Wallis (K-W) test for similarity of multiple medians, and (4) Levene's (L) test for the similarity of two variances. Results of these analyses corroborate previous work that concluded the thickness distributions of Quaternary sedimentary interbeds are locally stationary in space and time. The data set used in this study was relatively small, so the results presented should be considered preliminary, pending incorporation of data from more coreholes. Statistical tests also demonstrated that natural-gamma logs consistently fail to detect interbeds less than about 2-3 ft thick, although these interbeds are observable in lithologic logs. This should be taken into consideration when modeling aquifer lithology or hydraulic properties based on lithology.
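All four tests named in the abstract are available in scipy.stats. A minimal sketch on synthetic interbed-thickness samples (hypothetical values, not the study's data) might look like:

```python
# Illustrative sketch of the four nonparametric similarity tests:
# K-S two-sample (distribution shape), Mann-Whitney (two medians),
# Kruskal-Wallis (multiple medians), and Levene (variances).
from scipy import stats

unit_a = [1.5, 2.0, 3.2, 4.8, 6.1, 2.7, 3.9]  # hypothetical thicknesses (ft)
unit_b = [1.8, 2.4, 3.0, 5.2, 4.4, 3.1, 2.2]
unit_c = [2.1, 1.9, 4.0, 3.6, 5.0, 2.8, 3.3]

ks_p = stats.ks_2samp(unit_a, unit_b).pvalue         # distribution shape
mw_p = stats.mannwhitneyu(unit_a, unit_b).pvalue     # two medians
kw_p = stats.kruskal(unit_a, unit_b, unit_c).pvalue  # several medians
lv_p = stats.levene(unit_a, unit_b).pvalue           # two variances
print(ks_p, mw_p, kw_p, lv_p)
```

A small p-value in any one test rejects the corresponding notion of similarity between the grouped thickness distributions.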
[The research protocol VI: How to choose the appropriate statistical test. Inferential statistics].
Flores-Ruiz, Eric; Miranda-Novales, María Guadalupe; Villasís-Keever, Miguel Ángel
2017-01-01
Statistical analysis can be divided into two main components: descriptive analysis and inferential analysis. Inference means drawing conclusions from tests performed on data obtained from a sample of a population. Statistical tests are used to establish the probability that a conclusion obtained from a sample is applicable to the population from which it was drawn. However, choosing the appropriate statistical test generally poses a challenge for novice researchers. Choosing a statistical test requires taking three aspects into account: the research design, the number of measurements, and the scale of measurement of the variables. Statistical tests fall into two groups, parametric and nonparametric. Parametric tests can only be used if the data follow a normal distribution. Choosing the right statistical test will make it easier for readers to understand and apply the results.
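For the simple case of comparing two independent groups on a continuous variable, the selection logic described above can be sketched as follows. This is a minimal illustration, not the article's own procedure; the function name and alpha threshold are assumptions:

```python
# Hedged sketch: check normality first (Shapiro-Wilk), then choose a
# parametric or nonparametric two-group test accordingly.
from scipy import stats

def choose_two_group_test(a, b, alpha=0.05):
    normal = (stats.shapiro(a).pvalue > alpha and
              stats.shapiro(b).pvalue > alpha)
    if normal:
        name, result = "Student's t test", stats.ttest_ind(a, b)
    else:
        name, result = "Mann-Whitney U test", stats.mannwhitneyu(a, b)
    return name, result.pvalue

name, p = choose_two_group_test([5.1, 4.8, 5.3, 5.0, 4.9, 5.2],
                                [4.2, 4.5, 4.1, 4.4, 4.3, 4.6])
```

Real test selection must also weigh the research design and measurement scale, as the article stresses; normality is only one of the three aspects.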
NASA Astrophysics Data System (ADS)
Rogers, Jeffrey N.; Parrish, Christopher E.; Ward, Larry G.; Burdick, David M.
2018-03-01
Salt marsh vegetation tends to increase vertical uncertainty in light detection and ranging (lidar) derived elevation data, often causing the data to become ineffective for analysis of topographic features governing tidal inundation or vegetation zonation. Previous attempts at improving lidar data collected in salt marsh environments range from simply computing and subtracting the global elevation bias to more complex methods such as computing vegetation-specific, constant correction factors. The vegetation specific corrections can be used along with an existing habitat map to apply separate corrections to different areas within a study site. It is hypothesized here that correcting salt marsh lidar data by applying location-specific, point-by-point corrections, which are computed from lidar waveform-derived features, tidal-datum based elevation, distance from shoreline and other lidar digital elevation model based variables, using nonparametric regression will produce better results. The methods were developed and tested using full-waveform lidar and ground truth for three marshes in Cape Cod, Massachusetts, U.S.A. Five different model algorithms for nonparametric regression were evaluated, with TreeNet's stochastic gradient boosting algorithm consistently producing better regression and classification results. Additionally, models were constructed to predict the vegetative zone (high marsh and low marsh). The predictive modeling methods used in this study estimated ground elevation with a mean bias of 0.00 m and a standard deviation of 0.07 m (0.07 m root mean square error). These methods appear very promising for correction of salt marsh lidar data and, importantly, do not require an existing habitat map, biomass measurements, or image based remote sensing data such as multi/hyperspectral imagery.
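The point-by-point correction described above amounts to a nonparametric regression from per-point features to elevation error. As a hedged sketch (scikit-learn's GradientBoostingRegressor standing in for TreeNet, with entirely synthetic features and target, not the study's data):

```python
# Illustrative sketch: stochastic gradient boosting regression predicting a
# per-point lidar elevation bias from waveform- and location-derived
# features. Feature names and values are hypothetical placeholders.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 200
# Synthetic features: [waveform_width, tidal_datum_elev, dist_from_shoreline]
X = rng.uniform(0.0, 1.0, size=(n, 3))
# Synthetic vegetation-induced bias (meters) the model should learn
y = 0.10 * X[:, 0] + 0.05 * X[:, 1] + rng.normal(0.0, 0.01, n)

# subsample < 1.0 makes the boosting "stochastic"
model = GradientBoostingRegressor(subsample=0.5, random_state=0)
model.fit(X, y)
pred_bias = model.predict(X)          # per-point correction to subtract
```

The corrected surface is then the raw lidar elevation minus the predicted bias at each point, rather than a single global or per-habitat constant.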
Updating estimates of low streamflow statistics to account for possible trends
NASA Astrophysics Data System (ADS)
Blum, A. G.; Archfield, S. A.; Hirsch, R. M.; Vogel, R. M.; Kiang, J. E.; Dudley, R. W.
2017-12-01
Given evidence of both increasing and decreasing trends in low flows in many streams, methods are needed to update estimators of low flow statistics used in water resources management. One such metric is the 10-year annual low-flow statistic (7Q10), calculated as the annual minimum seven-day streamflow which is exceeded in nine out of ten years on average. Historical streamflow records may not be representative of current conditions at a site if environmental conditions are changing. We present a new approach to frequency estimation under nonstationary conditions that applies a stationary nonparametric quantile estimator to a subset of the annual minimum flow record. Monte Carlo simulation experiments were used to evaluate this approach across a range of trend and no-trend scenarios. Relative to the standard practice of using the entire available streamflow record, use of a nonparametric quantile estimator combined with selection of the most recent 30 or 50 years for 7Q10 estimation was found to improve accuracy and reduce bias. Benefits of data subset selection approaches were greater for higher magnitude trends and for annual minimum flow records with lower coefficients of variation. A nonparametric trend test approach for subset selection did not significantly improve upon always selecting the last 30 years of record. At 174 stream gages in the Chesapeake Bay region, 7Q10 estimators based on the most recent 30 years of flow record were compared to estimators based on the entire period of record. Given the availability of long records of low streamflow, using only a subset of the flow record (about 30 years) can update 7Q10 estimators to better reflect current streamflow conditions.
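The recommended procedure — a stationary nonparametric quantile estimator applied to only the most recent 30 years — can be sketched as below. The record is synthetic and the Weibull plotting-position quantile is one common nonparametric choice, not necessarily the estimator the authors used:

```python
# Hedged sketch of a nonparametric 7Q10 estimate from the most recent
# 30 years of annual minimum 7-day flows. Data are synthetic.
import numpy as np

rng = np.random.default_rng(1)
annual_min_7day = rng.lognormal(mean=1.0, sigma=0.4, size=60)  # 60-yr record

recent = np.sort(annual_min_7day[-30:])   # keep only the last 30 years
n = recent.size
pp = np.arange(1, n + 1) / (n + 1)        # Weibull plotting positions
q7_10 = np.interp(0.10, pp, recent)       # flow exceeded 9 years in 10
```

Using only the recent subset makes the estimator track current conditions at the cost of higher sampling variance, which is the trade-off the simulation experiments quantify.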
Khraisat, A; Taha, Sahar T; Jung, R E; Hattar, S; Smadi, L; Al-Omari, I K; Jarbawi, M
2007-09-01
The correlation between dental morphological traits can be used as an indicator of major ethnic differences. Therefore, this study investigated the prevalence of Carabelli's molar and shovel incisor traits and tested their association and sexual dimorphism in a Jordanian population. Three hundred school children in the 10th grade, with an average age of 15.5 years, were involved. Alginate impressions of the maxillary arch were taken and poured, and the casts were then trimmed. The accurate casts selected were of 132 male and 155 female students. The examined morphologic traits were Carabelli's trait on the maxillary first and second molars and shovel-shaped incisors. The relationship between different traits was investigated by nonparametric correlation analysis, and the independent-sample t test was used to test sexual dimorphism in trait expression. The prevalence of Carabelli's trait in the maxillary first molar and the shovel trait in the maxillary central incisor was relatively high (65.0% and 53.0%, respectively). The prevalence of Carabelli's trait on maxillary second molars was 3.8%. Nonparametric correlations revealed the strongest positive correlation between Carabelli's trait on the maxillary first molar and the shovel trait in males (P = 0.005). Significant sexual dimorphism was found only in the prevalence of Carabelli's trait on the maxillary first molar (P = 0.013) and the shovel trait (P = 0.038). The Jordanian population had a comparatively high prevalence of Carabelli's molar and shovel incisor traits. There was a positive association between Carabelli's trait on the maxillary first molar and the shovel trait in males. Sexual dimorphism was evident in Carabelli's trait on the maxillary first molar and the shovel trait.
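The nonparametric correlation between two ordinal trait scores is typically computed with Spearman's rank correlation. An illustrative sketch with hypothetical expression scores (not the study's data):

```python
# Hedged sketch: Spearman rank correlation between ordinal trait scores
# (e.g., 0 = absent ... 4 = pronounced). Values are hypothetical.
from scipy.stats import spearmanr

carabelli_m1 = [0, 1, 2, 2, 3, 4, 0, 1, 3, 2]  # maxillary first molar
shovel_i1    = [0, 1, 1, 2, 3, 4, 1, 0, 2, 2]  # central incisor shoveling

rho, p = spearmanr(carabelli_m1, shovel_i1)
print(f"rho = {rho:.2f}, P = {p:.4f}")
```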