A Nonparametric Geostatistical Method For Estimating Species Importance
Andrew J. Lister; Rachel Riemann; Michael Hoppus
2001-01-01
Parametric statistical methods are not always appropriate for conducting spatial analyses of forest inventory data. Parametric geostatistical methods such as variography and kriging are essentially averaging procedures, and thus can be affected by extreme values. Furthermore, non-normal distributions violate the assumptions of analyses in which test statistics are...
Bansal, Ravi; Peterson, Bradley S
2018-06-01
Identifying regional effects of interest in MRI datasets usually entails testing a priori hypotheses across many thousands of brain voxels, requiring control for false positive findings across these multiple hypothesis tests. Recent studies have suggested that parametric statistical methods may have incorrectly modeled functional MRI data, thereby leading to higher false positive rates than their nominal rates. Nonparametric methods for statistical inference when conducting multiple statistical tests, in contrast, are thought to produce false positives at the nominal rate, which has thus led to the suggestion that previously reported studies should reanalyze their fMRI data using nonparametric tools. To understand better why parametric methods may yield excessive false positives, we assessed their performance when applied both to simulated datasets of 1D, 2D, and 3D Gaussian Random Fields (GRFs) and to 710 real-world, resting-state fMRI datasets. We showed that both the simulated 2D and 3D GRFs and the real-world data contain a small percentage (<6%) of very large clusters (on average 60 times larger than the average cluster size), which were not present in 1D GRFs. These unexpectedly large clusters were deemed statistically significant using parametric methods, leading to empirical familywise error rates (FWERs) as high as 65%: the high empirical FWERs were not a consequence of parametric methods failing to model spatial smoothness accurately, but rather of these very large clusters that are inherently present in smooth, high-dimensional random fields. In fact, when discounting these very large clusters, the empirical FWER for parametric methods was 3.24%. Furthermore, even an empirical FWER of 65% would yield on average less than one of those very large clusters in each brain-wide analysis.
Nonparametric methods, in contrast, estimated null distributions from those large clusters and therefore, by construction, rejected the large clusters as false positives at the nominal FWERs. Those rejected clusters were outlying values in the distribution of cluster size but could not be distinguished from true positive findings without further analyses, including assessing whether fMRI signal in those regions correlates with other clinical, behavioral, or cognitive measures. Rejecting the large clusters, however, significantly reduced the statistical power of nonparametric methods in detecting true findings compared with parametric methods, which would have detected most true findings that are essential for making valid biological inferences in MRI data. Parametric analyses, in contrast, detected most true findings while generating relatively few false positives: on average, less than one of those very large clusters would be deemed a true finding in each brain-wide analysis. We therefore recommend the continued use of parametric methods that model nonstationary smoothness for cluster-level, familywise control of false positives, particularly when using a Cluster Defining Threshold of 2.5 or higher, and subsequently assessing rigorously the biological plausibility of the findings, even for large clusters. Finally, because nonparametric methods yielded a large reduction in statistical power to detect true positive findings, we conclude that the modest reduction in false positive findings that nonparametric analyses afford does not warrant a re-analysis of previously published fMRI studies using nonparametric techniques. Copyright © 2018 Elsevier Inc. All rights reserved.
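The cluster-extent logic discussed above can be illustrated with a minimal 1D sketch, not the authors' pipeline: the smoothing kernel, map length, threshold, and number of simulations below are arbitrary choices for illustration. The idea is to build a null distribution of the maximum supra-threshold cluster size in smooth random maps and take a high percentile as the familywise critical cluster extent.

```python
import numpy as np

def max_cluster_size(stat_map, threshold):
    """Largest run of contiguous supra-threshold values in a 1D statistic map."""
    above = stat_map > threshold
    best = run = 0
    for a in above:
        run = run + 1 if a else 0
        best = max(best, run)
    return best

rng = np.random.default_rng(0)
# Null distribution of the maximum cluster size, from smoothed 1D noise maps
null_max = [max_cluster_size(np.convolve(rng.standard_normal(200),
                                         np.ones(5) / 5, mode="same"), 1.0)
            for _ in range(500)]
# A cluster is familywise-significant if its extent exceeds this critical value
cluster_crit = np.percentile(null_max, 95)
```

A real fMRI analysis would do this in 3D with estimated smoothness; the point of the sketch is only that the critical extent comes from the tail of a max-statistic distribution.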
Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses
Callahan, Ben J.; Sankaran, Kris; Fukuyama, Julia A.; McMurdie, Paul J.; Holmes, Susan P.
2016-01-01
High-throughput sequencing of PCR-amplified taxonomic markers (like the 16S rRNA gene) has enabled a new level of analysis of complex bacterial communities known as microbiomes. Many tools exist to quantify and compare abundance levels or OTU composition of communities in different conditions. The sequencing reads have to be denoised and assigned to the closest taxa from a reference database. Common approaches use a notion of 97% similarity and normalize the data by subsampling to equalize library sizes. In this paper, we show that statistical models allow more accurate abundance estimates. By providing a complete workflow in R, we enable the user to do sophisticated downstream statistical analyses, whether parametric or nonparametric. We provide examples of using the R packages dada2, phyloseq, DESeq2, ggplot2 and vegan to filter, visualize and test microbiome data. We also provide examples of supervised analyses using random forests and nonparametric testing using community networks and the ggnetwork package. PMID:27508062
SOCR Analyses - an Instructional Java Web-based Statistical Analysis Toolkit.
Chu, Annie; Cui, Jenny; Dinov, Ivo D
2009-03-01
The Statistical Online Computational Resource (SOCR) designs web-based tools for educational use in a variety of undergraduate courses (Dinov 2006). Several studies have demonstrated that these resources significantly improve students' motivation and learning experiences (Dinov et al. 2008). SOCR Analyses is a new component that concentrates on data modeling and analysis using parametric and non-parametric techniques supported with graphical model diagnostics. Currently implemented analyses include models commonly used in undergraduate statistics courses, such as linear models (Simple Linear Regression, Multiple Linear Regression, One-Way and Two-Way ANOVA). In addition, we implemented tests for sample comparisons, such as the t-test in the parametric category, and the Wilcoxon rank sum test, Kruskal-Wallis test, and Friedman's test in the non-parametric category. SOCR Analyses also includes several hypothesis test models, such as contingency tables, Friedman's test, and Fisher's exact test. The code itself is open source (http://socr.googlecode.com/), offered in the hope of contributing to the efforts of the statistical computing community. The code includes functionality for each specific analysis model and general utilities that can be applied in various statistical computing tasks. For example, concrete methods with an API (Application Programming Interface) have been implemented for statistical summaries, least square solutions of general linear models, rank calculations, etc. HTML interfaces, tutorials, source code, activities, and data are freely available via the web (www.SOCR.ucla.edu). Code examples for developers and demos for educators are provided on the SOCR Wiki website. In this article, the pedagogical utilization of the SOCR Analyses is discussed, as well as the underlying design framework. As the SOCR project is on-going and more functions and tools are being added to it, these resources are constantly improved.
The reader is strongly encouraged to check the SOCR site for the most up-to-date information and newly added models.
Algorithm for Identifying Erroneous Rain-Gauge Readings
NASA Technical Reports Server (NTRS)
Rickman, Doug
2005-01-01
An algorithm analyzes rain-gauge data to identify statistical outliers that could be deemed to be erroneous readings. Heretofore, analyses of this type have been performed in burdensome manual procedures that have involved subjective judgements. Sometimes, the analyses have included computational assistance for detecting values falling outside of arbitrary limits. The analyses have been performed without statistically valid knowledge of the spatial and temporal variations of precipitation within rain events. In contrast, the present algorithm makes it possible to automate such an analysis, makes the analysis objective, takes account of the spatial distribution of rain gauges in conjunction with the statistical nature of spatial variations in rainfall readings, and minimizes the use of arbitrary criteria. The algorithm implements an iterative process that involves nonparametric statistics.
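The abstract does not publish the algorithm itself. As a loose illustration of iterative nonparametric outlier screening, a median/MAD rule might look like the sketch below; the function name, thresholds, and iteration scheme are all assumptions, and the real algorithm additionally exploits the spatial distribution of gauges, which this sketch ignores.

```python
import numpy as np

def flag_outliers(readings, n_mads=3.5, max_iter=10):
    """Iteratively flag readings far from the median, measured in robust
    MAD units. Illustrative only; not the NASA algorithm."""
    readings = np.asarray(readings, dtype=float)
    keep = np.ones(readings.size, dtype=bool)
    for _ in range(max_iter):
        med = np.median(readings[keep])
        mad = np.median(np.abs(readings[keep] - med)) or 1e-9  # guard against 0
        # 1.4826 scales the MAD to be consistent with a normal standard deviation
        new_keep = np.abs(readings - med) <= n_mads * 1.4826 * mad
        if np.array_equal(new_keep, keep):
            break  # converged: no reading changed status
        keep = new_keep
    return ~keep  # True where a reading is flagged as a potential error

flags = flag_outliers([2.0, 2.1, 1.9, 2.2, 25.0, 2.0])
```

Recomputing the median and MAD only over the retained readings at each pass is what makes the screen iterative: a gross error no longer inflates the scale estimate once it has been flagged.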
ERIC Educational Resources Information Center
Fidalgo, Angel M.
2011-01-01
Mantel-Haenszel (MH) methods constitute one of the most popular nonparametric differential item functioning (DIF) detection procedures. GMHDIF has been developed to provide an easy-to-use program for conducting DIF analyses. Some of the advantages of this program are that (a) it performs two-stage DIF analyses in multiple groups simultaneously;…
ERIC Educational Resources Information Center
Douglas, Jeff; Kim, Hae-Rim; Roussos, Louis; Stout, William; Zhang, Jinming
An extensive nonparametric dimensionality analysis of latent structure was conducted on three forms of the Law School Admission Test (LSAT) (December 1991, June 1992, and October 1992) using the DIMTEST model in confirmatory analyses and using DIMTEST, FAC, DETECT, HCA, PROX, and a genetic algorithm in exploratory analyses. Results indicate that…
Nonparametric estimation and testing of fixed effects panel data models
Henderson, Daniel J.; Carroll, Raymond J.; Li, Qi
2009-01-01
In this paper we consider the problem of estimating nonparametric panel data models with fixed effects. We introduce an iterative nonparametric kernel estimator. We also extend the estimation method to the case of a semiparametric partially linear fixed effects model. To determine whether a parametric, semiparametric or nonparametric model is appropriate, we propose test statistics to test between the three alternatives in practice. We further propose a test statistic for testing the null hypothesis of random effects against fixed effects in a nonparametric panel data regression model. Simulations are used to examine the finite sample performance of the proposed estimators and the test statistics. PMID:19444335
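A basic building block for such estimators is the Nadaraya-Watson kernel smoother. The sketch below shows plain nonparametric kernel regression only; it is not the paper's iterative fixed-effects estimator, which additionally removes the panel effects.

```python
import numpy as np

def nw_kernel_regression(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel: a locally
    weighted average of the responses around each evaluation point."""
    x_train = np.asarray(x_train, float)
    y_train = np.asarray(y_train, float)
    u = (np.asarray(x_eval, float)[:, None] - x_train[None, :]) / bandwidth
    w = np.exp(-0.5 * u ** 2)             # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)  # weighted average per eval point

x = np.linspace(0, 1, 50)
y = np.sin(2 * np.pi * x)
y_hat = nw_kernel_regression(x, y, x, bandwidth=0.05)
```

The bandwidth governs the bias-variance trade-off; in practice it would be chosen by cross-validation rather than fixed as here.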
A Bayesian nonparametric method for prediction in EST analysis
Lijoi, Antonio; Mena, Ramsés H; Prünster, Igor
2007-01-01
Background Expressed sequence tag (EST) analyses are a fundamental tool for gene identification in organisms. Given a preliminary EST sample from a certain library, several statistical prediction problems arise. In particular, it is of interest to estimate how many new genes can be detected in a future EST sample of given size and also to determine the gene discovery rate: these estimates represent the basis for deciding whether to proceed with sequencing the library and, in the case of a positive decision, a guideline for selecting the size of the new sample. Such information is also useful for establishing sequencing efficiency in experimental design and for measuring the degree of redundancy of an EST library. Results In this work we propose a Bayesian nonparametric approach for tackling statistical problems related to EST surveys. In particular, we provide estimates for: a) the coverage, defined as the proportion of unique genes in the library represented in the given sample of reads; b) the number of new unique genes to be observed in a future sample; c) the discovery rate of new genes as a function of the future sample size. The Bayesian nonparametric model we adopt conveys, in a statistically rigorous way, the available information into prediction. Our proposal has appealing properties over frequentist nonparametric methods, which become unstable when prediction is required for large future samples. EST libraries, previously studied with frequentist methods, are analyzed in detail. Conclusion The Bayesian nonparametric approach we undertake yields valuable tools for gene capture and prediction in EST libraries. The estimators we obtain do not feature the kind of drawbacks associated with frequentist estimators and are reliable for any size of the additional sample. PMID:17868445
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare the performance of the method with that of parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases under consideration in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
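A one-dimensional toy version of the rank-based scan idea can be sketched by sliding a window along a sequence of region values and scoring each window with a standardized Wilcoxon rank-sum statistic. This is only a sketch: the actual method scans spatial zones of varying size and assesses significance by Monte Carlo testing, and ties are handled properly.

```python
import numpy as np

def rank_sum_scan(values, window):
    """Score each moving window by the standardized Wilcoxon rank-sum
    statistic of the inside-window values against the rest (ties ignored)."""
    values = np.asarray(values, float)
    ranks = values.argsort().argsort() + 1.0  # ranks 1..n
    n = values.size
    scores = []
    for start in range(n - window + 1):
        w_in = ranks[start:start + window].sum()      # rank sum inside window
        mu = window * (n + 1) / 2.0                   # null mean of rank sum
        sd = np.sqrt(window * (n - window) * (n + 1) / 12.0)  # null std dev
        scores.append((w_in - mu) / sd)
    return np.array(scores)

# A run of elevated values embedded in a flat background
data = np.r_[np.zeros(10), [5.0, 6.0, 7.0], np.zeros(10)]
scores = rank_sum_scan(data, window=3)
hot = int(np.argmax(scores))  # index of the most anomalous window
```

The window with the highest standardized rank sum locates the candidate cluster; because only ranks enter the statistic, the score is unchanged under any monotone transform of the data, which is why the method copes with skewed distributions.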
Enabling a Comprehensive Teaching Strategy: Video Lectures
ERIC Educational Resources Information Center
Brecht, H. David; Ogilby, Suzanne M.
2008-01-01
This study empirically tests the feasibility and effectiveness of video lectures as a form of video instruction that enables a comprehensive teaching strategy used throughout a traditional classroom course. It examines student use patterns and the videos' effects on student learning, using qualitative and nonparametric statistical analyses of…
Efficiency Analysis of Public Universities in Thailand
ERIC Educational Resources Information Center
Kantabutra, Saranya; Tang, John C. S.
2010-01-01
This paper examines the performance of Thai public universities in terms of efficiency, using a non-parametric approach called data envelopment analysis. Two efficiency models, the teaching efficiency model and the research efficiency model, are developed and the analysis is conducted at the faculty level. Further statistical analyses are also…
Packham, B; Barnes, G; Dos Santos, G Sato; Aristovich, K; Gilad, O; Ghosh, A; Oh, T; Holder, D
2016-06-01
Electrical impedance tomography (EIT) allows for the reconstruction of internal conductivity from surface measurements. A change in conductivity occurs as ion channels open during neural activity, making EIT a potential tool for functional brain imaging. EIT images can have >10 000 voxels, which means statistical analysis of such images presents a substantial multiple testing problem. One way to optimally correct for these issues and still maintain the flexibility of complicated experimental designs is to use random field theory. This parametric method estimates the distribution of peaks one would expect by chance in a smooth random field of a given size. Random field theory has been used in several other neuroimaging techniques but never validated for EIT images of fast neural activity; such validation can be achieved using non-parametric techniques. Both parametric and non-parametric techniques were used to analyze a set of 22 images collected from 8 rats. Significant group activations were detected using both techniques (corrected p < 0.05). Both parametric and non-parametric analyses yielded similar results, although the latter was less conservative. These results demonstrate the first statistical analysis of such an image set and indicate that such an analysis is a viable approach for EIT images of neural activity.
An Exploratory Data Analysis System for Support in Medical Decision-Making
Copeland, J. A.; Hamel, B.; Bourne, J. R.
1979-01-01
An experimental system was developed to allow retrieval and analysis of data collected during a study of neurobehavioral correlates of renal disease. After retrieving data organized in a relational data base, simple bivariate statistics of parametric and nonparametric nature could be conducted. An “exploratory” mode in which the system provided guidance in selection of appropriate statistical analyses was also available to the user. The system traversed a decision tree using the inherent qualities of the data (e.g., the identity and number of patients, tests, and time epochs) to search for the appropriate analyses to employ.
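The "exploratory" decision-tree guidance described above can be sketched as a small rule table that routes a comparison to a parametric or nonparametric test. The rules below are purely illustrative assumptions, not the original system's logic, which also inspected patients, tests, and time epochs.

```python
def suggest_test(n_groups, paired, normal):
    """Toy decision tree: choose a standard test from the number of groups,
    whether the design is paired, and whether normality is plausible.
    Illustrative rules only."""
    if n_groups == 2:
        if normal:
            return "paired t-test" if paired else "two-sample t-test"
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
    if normal:
        return "one-way ANOVA"
    return "Kruskal-Wallis test"

choice = suggest_test(n_groups=2, paired=False, normal=False)
```

Encoding such guidance as an explicit function makes the system's recommendations reproducible and easy to audit, which is presumably why a decision-tree structure was chosen.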
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES
He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.
2017-01-01
Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
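The univariate Kruskal-Wallis statistic underlying the multivariate MKW test can be computed directly. The sketch below omits the tie correction and, of course, the paper's extension to partially observed multivariate responses; it shows only the rank-based building block.

```python
import numpy as np

def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction): ranks the pooled
    sample, then compares per-group rank sums against their null means."""
    all_vals = np.concatenate(groups)
    ranks = all_vals.argsort().argsort() + 1.0  # ranks 1..n of pooled data
    n = all_vals.size
    h = 0.0
    start = 0
    for g in groups:
        r = ranks[start:start + len(g)]  # ranks belonging to this group
        h += r.sum() ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * h - 3.0 * (n + 1)

h = kruskal_wallis_h(np.array([1.0, 2.0, 3.0]), np.array([10.0, 11.0, 12.0]))
```

Under the null, H is approximately chi-squared with (number of groups − 1) degrees of freedom, so the two cleanly separated groups above sit near the 5% critical value of 3.84.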
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Scripts for computing nonparametric regression analyses, with an overview of using the scripts to infer environmental conditions from biological observations and to statistically estimate species-environment relationships.
A Unifying Framework for Teaching Nonparametric Statistical Tests
ERIC Educational Resources Information Center
Bargagliotti, Anna E.; Orrison, Michael E.
2014-01-01
Increased importance is being placed on statistics at both the K-12 and undergraduate level. Research divulging effective methods to teach specific statistical concepts is still widely sought after. In this paper, we focus on best practices for teaching topics in nonparametric statistics at the undergraduate level. To motivate the work, we…
An Empirical Study of Eight Nonparametric Tests in Hierarchical Regression.
ERIC Educational Resources Information Center
Harwell, Michael; Serlin, Ronald C.
When normality does not hold, nonparametric tests represent an important data-analytic alternative to parametric tests. However, the use of nonparametric tests in educational research has been limited by the absence of easily performed tests for complex experimental designs and analyses, such as factorial designs and multiple regression analyses,…
Teaching Nonparametric Statistics Using Student Instrumental Values.
ERIC Educational Resources Information Center
Anderson, Jonathan W.; Diddams, Margaret
Nonparametric statistics are often difficult to teach in introduction to statistics courses because of the lack of real-world examples. This study demonstrated how teachers can use differences in the rankings and ratings of undergraduate and graduate values to discuss: (1) ipsative and normative scaling; (2) uses of the Mann-Whitney U-test; and…
How to Compare Parametric and Nonparametric Person-Fit Statistics Using Real Data
ERIC Educational Resources Information Center
Sinharay, Sandip
2017-01-01
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
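The Hodges-Lehmann estimator and a permutation p-value can be sketched for the univariate case as follows. The article's tests are multivariate and include scaled variants, which this sketch omits; the statistic, sample sizes, and permutation count here are illustrative choices.

```python
import numpy as np

def hodges_lehmann_shift(x, y):
    """Hodges-Lehmann estimate of the location shift between two samples:
    the median of all pairwise differences y_j - x_i."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.median((y[None, :] - x[:, None]).ravel())

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value using |HL shift| as the test statistic:
    reshuffle group labels and count shifts at least as extreme as observed."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled = np.concatenate([x, y])
    observed = abs(hodges_lehmann_shift(x, y))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(hodges_lehmann_shift(pooled[:len(x)], pooled[len(x):])) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

shift = hodges_lehmann_shift([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Because the Hodges-Lehmann estimator is a median of pairwise differences, a single outlier in either sample moves it far less than it would move a difference of means, which is the robustness property the article exploits.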
Schulz, Marcus; Neumann, Daniel; Fleet, David M; Matthies, Michael
2013-12-01
During the last decades, marine pollution with anthropogenic litter has become a worldwide major environmental concern. Standardized monitoring of litter since 2001 on 78 beaches selected within the framework of the Convention for the Protection of the Marine Environment of the North-East Atlantic (OSPAR) has been used to identify temporal trends of marine litter. Based on statistical analyses of this dataset, a two-part multi-criteria evaluation system for beach litter pollution of the North-East Atlantic and the North Sea is proposed. Canonical correlation analyses, linear regression analyses, and non-parametric analyses of variance were used to identify different temporal trends. A classification of beaches was derived from cluster analyses and served to define different states of beach quality according to abundances of 17 input variables. The evaluation system is easily applicable and relies on the above-mentioned classification and on significant temporal trends implied by significant rank correlations. Copyright © 2013 Elsevier Ltd. All rights reserved.
Monitoring the Level of Students' GPAs over Time
ERIC Educational Resources Information Center
Bakir, Saad T.; McNeal, Bob
2010-01-01
A nonparametric (or distribution-free) statistical quality control chart is used to monitor the cumulative grade point averages (GPAs) of students over time. The chart is designed to detect any statistically significant positive or negative shifts in student GPAs from a desired target level. This nonparametric control chart is based on the…
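A simple distribution-free monitoring statistic in the same spirit is a cumulative sum of the signs of deviations from the target level. This is only a sketch of the general idea; the published chart's exact statistic and control limits differ, and the reference value and data below are illustrative.

```python
import numpy as np

def sign_cusum(values, target, k=0.0):
    """Cumulative sum of signs of deviations from a target level.
    A sustained drift below target drives the path steadily downward,
    regardless of the data's distribution; k is an optional reference drift."""
    signs = np.sign(np.asarray(values, float) - target)
    return np.cumsum(signs - k)

# Hypothetical GPA averages drifting below a target of 3.0
path = sign_cusum([3.1, 3.0, 2.9, 2.8, 2.7, 2.6], target=3.0)
```

A shift would be signaled when the path crosses a control limit derived from the null (sign-flipping) distribution rather than from a normality assumption.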
Statistical methods used in articles published by the Journal of Periodontal and Implant Science.
Choi, Eunsil; Lyu, Jiyoung; Park, Jinyoung; Kim, Hae-Young
2014-12-01
The purposes of this study were to assess the trend of use of statistical methods including parametric and nonparametric methods and to evaluate the use of complex statistical methodology in recent periodontal studies. This study analyzed 123 articles published in the Journal of Periodontal & Implant Science (JPIS) between 2010 and 2014. Frequencies and percentages were calculated according to the number of statistical methods used, the type of statistical method applied, and the type of statistical software used. Most of the published articles considered (64.4%) used statistical methods. Since 2011, the percentage of JPIS articles using statistics has increased. On the basis of multiple counting, we found that the percentage of studies in JPIS using parametric methods was 61.1%. Further, complex statistical methods were applied in only 6 of the published studies (5.0%), and nonparametric statistical methods were applied in 77 of the published studies (38.9% of a total of 198 studies considered). We found an increasing trend towards the application of statistical methods and nonparametric methods in recent periodontal studies and thus concluded that increased use of complex statistical methodology might be preferred by the researchers in the fields of study covered by JPIS.
Pataky, Todd C; Vanrenterghem, Jos; Robinson, Mark A
2015-05-01
Biomechanical processes are often manifested as one-dimensional (1D) trajectories. It has been shown that 1D confidence intervals (CIs) are biased when based on 0D statistical procedures, and the non-parametric 1D bootstrap CI has emerged in the Biomechanics literature as a viable solution. The primary purpose of this paper was to clarify that, for 1D biomechanics datasets, the distinction between 0D and 1D methods is much more important than the distinction between parametric and non-parametric procedures. A secondary purpose was to demonstrate that a parametric equivalent to the 1D bootstrap exists in the form of a random field theory (RFT) correction for multiple comparisons. To emphasize these points we analyzed six datasets consisting of force and kinematic trajectories in one-sample, paired, two-sample and regression designs. Results showed, first, that the 1D bootstrap and other 1D non-parametric CIs were qualitatively identical to RFT CIs, and all were very different from 0D CIs. Second, 1D parametric and 1D non-parametric hypothesis testing results were qualitatively identical for all six datasets. Last, we highlight the limitations of 1D CIs by demonstrating that they are complex, design-dependent, and thus non-generalizable. These results suggest that (i) analyses of 1D data based on 0D models of randomness are generally biased unless one explicitly identifies 0D variables before the experiment, and (ii) parametric and non-parametric 1D hypothesis testing provide an unambiguous framework for analysis when one's hypothesis explicitly or implicitly pertains to whole 1D trajectories. Copyright © 2015 Elsevier Ltd. All rights reserved.
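A pointwise percentile bootstrap for a mean trajectory can be sketched as below. Note that this is exactly the kind of 0D (pointwise) interval the paper cautions against when the hypothesis concerns the whole trajectory; a whole-trajectory band would instead bootstrap a maximum statistic or use an RFT correction. The data shape and resampling count are illustrative assumptions.

```python
import numpy as np

def bootstrap_ci_1d(trajectories, n_boot=1000, alpha=0.05, seed=0):
    """Pointwise percentile bootstrap CI for the mean of a set of 1D
    trajectories (rows = subjects, columns = time nodes)."""
    rng = np.random.default_rng(seed)
    trajectories = np.asarray(trajectories, float)
    n = trajectories.shape[0]
    # Resample subjects (rows) with replacement; recompute the mean curve
    boot_means = np.array([trajectories[rng.integers(0, n, n)].mean(axis=0)
                           for _ in range(n_boot)])
    lo = np.percentile(boot_means, 100 * alpha / 2, axis=0)
    hi = np.percentile(boot_means, 100 * (1 - alpha / 2), axis=0)
    return lo, hi

rng = np.random.default_rng(1)
# 20 noisy trajectories around a sine-shaped mean, sampled at 101 time nodes
data = rng.standard_normal((20, 101)) + np.sin(np.linspace(0, np.pi, 101))
lo, hi = bootstrap_ci_1d(data)
```

Resampling whole subjects, rather than individual time points, preserves the within-trajectory correlation structure that makes 1D data different from 0D data.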
ERIC Educational Resources Information Center
Kantabutra, Sangchan
2009-01-01
This paper examines urban-rural effects on public upper-secondary school efficiency in northern Thailand. In the study, efficiency was measured by a nonparametric technique, data envelopment analysis (DEA). Urban-rural effects were examined through a Mann-Whitney nonparametric statistical test. Results indicate that urban schools appear to have…
Rediscovery of Good-Turing estimators via Bayesian nonparametrics.
Favaro, Stefano; Nipoti, Bernardo; Teh, Yee Whye
2016-03-01
The problem of estimating discovery probabilities originated in the context of statistical ecology, and in recent years it has become popular due to its frequent appearance in challenging applications arising in genetics, bioinformatics, linguistics, designs of experiments, machine learning, etc. A full range of statistical approaches, parametric and nonparametric as well as frequentist and Bayesian, has been proposed for estimating discovery probabilities. In this article, we investigate the relationships between the celebrated Good-Turing approach, which is a frequentist nonparametric approach developed in the 1940s, and a Bayesian nonparametric approach recently introduced in the literature. Specifically, under the assumption of a two-parameter Poisson-Dirichlet prior, we show that Bayesian nonparametric estimators of discovery probabilities are asymptotically equivalent, for a large sample size, to suitably smoothed Good-Turing estimators. As a by-product of this result, we introduce and investigate a methodology for deriving exact and asymptotic credible intervals to be associated with the Bayesian nonparametric estimators of discovery probabilities. The proposed methodology is illustrated through a comprehensive simulation study and the analysis of Expressed Sequence Tags data generated by sequencing a benchmark complementary DNA library. © 2015, The International Biometric Society.
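The classical Good-Turing estimator referenced above is short enough to state directly: the probability that the next observation is a previously unseen species is estimated by the fraction of singletons in the sample.

```python
from collections import Counter

def good_turing_new_species_prob(sample):
    """Good-Turing estimate of the probability that the next observation
    is a new species: n1 / n, where n1 counts species seen exactly once."""
    counts = Counter(sample)
    n1 = sum(1 for c in counts.values() if c == 1)  # number of singletons
    return n1 / len(sample)

# Toy sample: species a seen twice, b three times, c-f once each
p0 = good_turing_new_species_prob(list("aabbbcdef"))
```

The article's result is that Bayesian nonparametric estimators under a two-parameter Poisson-Dirichlet prior converge to smoothed versions of this simple count-based quantity.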
Torres-Carvajal, Omar; Schulte, James A; Cadle, John E
2006-04-01
The South American iguanian lizard genus Stenocercus includes 54 species occurring mostly in the Andes and adjacent lowland areas from northern Venezuela and Colombia to central Argentina at elevations of 0-4000 m. Small taxon or character sampling has characterized all phylogenetic analyses of Stenocercus, which has long been recognized as sister taxon to the Tropidurus Group. In this study, we use mtDNA sequence data to perform phylogenetic analyses that include 32 species of Stenocercus and 12 outgroup taxa. Monophyly of this genus is strongly supported by maximum parsimony and Bayesian analyses. Evolutionary relationships within Stenocercus are further analyzed with a Bayesian implementation of a general mixture model, which accommodates variability in the pattern of evolution across sites. These analyses indicate a basal split of Stenocercus into two clades, one of which receives very strong statistical support. In addition, we test previous hypotheses using non-parametric and parametric statistical methods, and provide a phylogenetic classification for Stenocercus.
ERIC Educational Resources Information Center
St-Onge, Christina; Valois, Pierre; Abdous, Belkacem; Germain, Stephane
2009-01-01
To date, there have been no studies comparing parametric and nonparametric Item Characteristic Curve (ICC) estimation methods on the effectiveness of Person-Fit Statistics (PFS). The primary aim of this study was to determine if the use of ICCs estimated by nonparametric methods would increase the accuracy of item response theory-based PFS for…
Parametric vs. non-parametric statistics of low resolution electromagnetic tomography (LORETA).
Thatcher, R W; North, D; Biver, C
2005-01-01
This study compared the relative statistical sensitivity of non-parametric and parametric statistics of 3-dimensional current sources as estimated by the EEG inverse solution Low Resolution Electromagnetic Tomography (LORETA). One would expect approximately 5% false positives (classification of a normal as abnormal) at the P < .025 level of probability (two-tailed test) and approximately 1% false positives at the P < .005 level. EEG digital samples (2-second intervals sampled at 128 Hz, 1 to 2 minutes eyes closed) from 43 normal adult subjects were imported into the Key Institute's LORETA program. We then used the Key Institute's cross-spectrum and the Key Institute's LORETA output files (*.lor) as the 2,394 gray matter pixel representation of 3-dimensional currents at different frequencies. The mean and standard deviation *.lor files were computed for each of the 2,394 gray matter pixels for each of the 43 subjects. Tests of Gaussianity and different transforms were computed in order to best approximate a normal distribution for each frequency and gray matter pixel. The relative sensitivity of parametric vs. non-parametric statistics was compared using a "leave-one-out" cross-validation method in which individual normal subjects were withdrawn and then statistically classified as being either normal or abnormal based on the remaining subjects. Log10 transforms approximated a Gaussian distribution in the range of 95% to 99% accuracy. Parametric Z score tests at P < .05 cross-validation demonstrated an average misclassification rate of approximately 4.25%, and the range over the 2,394 gray matter pixels was 27.66% to 0.11%. At P < .01, parametric Z score cross-validation false positives averaged 0.26% and ranged from 6.65% to 0% false positives. The non-parametric Key Institute's t-max statistic at P < .05 had an average misclassification error rate of 7.64% and ranged from 43.37% to 0.04% false positives.
The nonparametric t-max at P < .01 had an average misclassification rate of 6.67% and ranged from 41.34% to 0% false positives of the 2,394 gray matter pixels for any cross-validated normal subject. In conclusion, adequate approximation to Gaussian distribution and high cross-validation can be achieved by the Key Institute's LORETA programs by using a log10 transform and parametric statistics, and parametric normative comparisons had lower false positive rates than the non-parametric tests.
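The leave-one-out parametric classification described above can be sketched in a few lines. This is a simplified, single-variable illustration of the general procedure, not the Key Institute's pipeline: it assumes one (log-transformed) value per subject and a two-tailed Z threshold, and all data are simulated.

```python
import numpy as np

def loo_z_misclassification(values, z_crit=1.96):
    """Leave-one-out cross-validation: withdraw each normal subject,
    compute a Z score against the mean/SD of the remaining subjects,
    and count how often a normal subject is (falsely) flagged abnormal."""
    values = np.asarray(values, dtype=float)
    n = len(values)
    false_pos = 0
    for i in range(n):
        rest = np.delete(values, i)  # remaining subjects form the norm
        z = (values[i] - rest.mean()) / rest.std(ddof=1)
        if abs(z) > z_crit:
            false_pos += 1
    return false_pos / n

rng = np.random.default_rng(0)
normals = rng.normal(size=200)       # simulated normative values
rate = loo_z_misclassification(normals)
print(rate)  # close to the nominal ~5% two-tailed rate
```

When the transformed data are approximately Gaussian, as the study found after a log10 transform, the empirical misclassification rate should sit near the nominal level.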
2016-05-31
Final Report: Technical Topic 3.2.2.d, Bayesian and Non-Parametric Statistics: Integration of Neural… (reporting period 15-Apr-2014 to 14-Jan-2015; distribution unlimited). The materials tested included explosives such as TATP, HMTD, RDX, ammonium nitrate, potassium perchlorate, potassium nitrate, sugar, and TNT. The approach…
Scarpazza, Cristina; Nichols, Thomas E; Seramondi, Donato; Maumet, Camille; Sartori, Giuseppe; Mechelli, Andrea
2016-01-01
In recent years, an increasing number of studies have used Voxel Based Morphometry (VBM) to compare a single patient with a psychiatric or neurological condition of interest against a group of healthy controls. However, the validity of this approach critically relies on the assumption that the single patient is drawn from a hypothetical population with a normal distribution and variance equal to that of the control group. In a previous investigation, we demonstrated that the family-wise false positive error rate (i.e., the proportion of statistical comparisons yielding at least one false positive) in single case VBM is much higher than expected (Scarpazza et al., 2013). Here, we examine whether the use of non-parametric statistics, which does not rely on the assumptions of normal distribution and equal variance, would enable the investigation of single subjects with good control of false positive risk. We empirically estimated false positive rates (FPRs) in single case non-parametric VBM, by performing 400 statistical comparisons between a single disease-free individual and a group of 100 disease-free controls. The impact of smoothing (4, 8, and 12 mm) and type of pre-processing (Modulated, Unmodulated) was also examined, as these factors have been found to influence FPRs in previous investigations using parametric statistics. The 400 statistical comparisons were repeated using two independent, freely available data sets in order to maximize the generalizability of the results. We found that the family-wise error rate was 5% for increases and 3.6% for decreases in one data set; and 5.6% for increases and 6.3% for decreases in the other data set (5% nominal). Further, these results were not dependent on the level of smoothing and modulation. Therefore, the present study provides empirical evidence that single case VBM studies with non-parametric statistics are not susceptible to high false positive rates.
The critical implication of this finding is that VBM can be used to characterize neuroanatomical alterations in individual subjects as long as non-parametric statistics are employed.
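The logic of estimating an empirical FPR for single-case comparisons can be sketched with a permutation-style test. This is an illustration of the general idea only, not the VBM pipeline used in the study: a single simulated value per "case", a rank-based p-value that is exact under exchangeability, and 400 repetitions mirroring the study's design.

```python
import numpy as np

def single_case_perm_p(case, controls):
    """One-sided permutation-style p-value for a single case vs. a
    control group: the proportion of the combined sample at least as
    large as the case (the case's rank under exchangeability)."""
    combined = np.append(controls, case)
    return np.mean(combined >= case)

rng = np.random.default_rng(7)
controls = rng.normal(size=100)      # 100 disease-free controls

# Empirical false positive rate over 400 disease-free "cases"
fpr = np.mean([single_case_perm_p(rng.normal(), controls) <= 0.05
               for _ in range(400)])
print(fpr)  # should sit near the 5% nominal rate
```

Because the p-value is the case's rank among n + 1 exchangeable values, the test is calibrated by construction, which is the property the study verified empirically for non-parametric VBM.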
O'Sullivan, Finbarr; Muzi, Mark; Spence, Alexander M; Mankoff, David M; O'Sullivan, Janet N; Fitzgerald, Niall; Newman, George C; Krohn, Kenneth A
2009-06-01
Kinetic analysis is used to extract metabolic information from dynamic positron emission tomography (PET) uptake data. The theory of indicator dilutions, developed in the seminal work of Meier and Zierler (1954), provides a probabilistic framework for representation of PET tracer uptake data in terms of a convolution between an arterial input function and a tissue residue. The residue is a scaled survival function associated with tracer residence in the tissue. Nonparametric inference for the residue, a deconvolution problem, provides a novel approach to kinetic analysis, critically one that is not reliant on specific compartmental modeling assumptions. A practical computational technique based on regularized cubic B-spline approximation of the residence time distribution is proposed. Nonparametric residue analysis allows formal statistical evaluation of specific parametric models to be considered. This analysis needs to properly account for the increased flexibility of the nonparametric estimator. The methodology is illustrated using data from a series of cerebral studies with PET and fluorodeoxyglucose (FDG) in normal subjects. Comparisons are made between key functionals of the residue, tracer flux, flow, etc., resulting from a parametric (the standard two-compartment model of Phelps et al. 1979) and a nonparametric analysis. Strong statistical evidence against the compartment model is found. Primarily these differences relate to the representation of the early temporal structure of the tracer residence, largely a function of the vascular supply network. There are convincing physiological arguments against the representations implied by the compartmental approach but this is the first time that a rigorous statistical confirmation using PET data has been reported. The compartmental analysis produces suspect values for flow but, notably, the impact on the metabolic flux, though statistically significant, is limited to deviations on the order of 3%-4%.
The general advantage of the nonparametric residue analysis is the ability to provide a valid kinetic quantitation in the context of studies where there may be heterogeneity or other uncertainty about the accuracy of a compartmental model approximation of the tissue residue.
Analysis of Parasite and Other Skewed Counts
Alexander, Neal
2012-01-01
Objective To review methods for the statistical analysis of parasite and other skewed count data. Methods Statistical methods for skewed count data are described and compared, with reference to those used over a ten-year period in Tropical Medicine and International Health. Two parasitological datasets are used for illustration. Results Ninety papers were identified, 89 with descriptive and 60 with inferential analysis. A lack of clarity is noted in identifying measures of location, in particular the Williams and geometric mean. The different measures are compared, emphasizing the legitimacy of the arithmetic mean for skewed data. In the published papers, the t test and related methods were often used on untransformed data, which is likely to be invalid. Several approaches to inferential analysis are described, emphasizing 1) non-parametric methods, while noting that they are not simply comparisons of medians, and 2) generalized linear modelling, in particular with the negative binomial distribution. Additional methods, such as the bootstrap, with potential for greater use are described. Conclusions Clarity is recommended when describing transformations and measures of location. It is suggested that non-parametric methods and generalized linear models are likely to be sufficient for most analyses. PMID:22943299
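As an illustration of the bootstrap approach recommended above for skewed counts, here is a percentile-bootstrap confidence interval for the arithmetic mean; the count data are hypothetical, not from the reviewed papers.

```python
import numpy as np

def bootstrap_mean_ci(x, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the arithmetic mean: resample with
    replacement, compute the mean of each resample, and take quantiles."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))  # resample indices
    means = x[idx].mean(axis=1)
    lo, hi = np.quantile(means, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Illustrative right-skewed count data (e.g. parasite egg counts)
counts = np.array([0, 0, 1, 1, 2, 3, 5, 8, 13, 40])
lo, hi = bootstrap_mean_ci(counts)
print(lo, hi)  # interval bracketing the sample mean of 7.3
```

Unlike a t interval on untransformed data, this makes no normality assumption, yet it still targets the arithmetic mean, whose legitimacy for skewed data the review emphasizes.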
Howard, Réka; Carriquiry, Alicia L.; Beavis, William D.
2014-01-01
Parametric and nonparametric methods have been developed for purposes of predicting phenotypes. These methods are based on retrospective analyses of empirical data consisting of genotypic and phenotypic scores. Recent reports have indicated that parametric methods are unable to predict phenotypes of traits with known epistatic genetic architectures. Herein, we review parametric methods including least squares regression, ridge regression, Bayesian ridge regression, least absolute shrinkage and selection operator (LASSO), Bayesian LASSO, best linear unbiased prediction (BLUP), Bayes A, Bayes B, Bayes C, and Bayes Cπ. We also review nonparametric methods including Nadaraya-Watson estimator, reproducing kernel Hilbert space, support vector machine regression, and neural networks. We assess the relative merits of these 14 methods in terms of accuracy and mean squared error (MSE) using simulated genetic architectures consisting of completely additive or two-way epistatic interactions in an F2 population derived from crosses of inbred lines. Each simulated genetic architecture explained either 30% or 70% of the phenotypic variability. The greatest impact on estimates of accuracy and MSE was due to genetic architecture. Parametric methods were unable to predict phenotypic values when the underlying genetic architecture was based entirely on epistasis. Parametric methods were slightly better than nonparametric methods for additive genetic architectures. Distinctions among parametric methods for additive genetic architectures were incremental. Heritability, i.e., proportion of phenotypic variability, had the second greatest impact on estimates of accuracy and MSE. PMID:24727289
TRAN-STAT: statistics for environmental transuranic studies, July 1978, Number 5
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
This issue is concerned with nonparametric procedures for (1) estimating the central tendency of a population, (2) describing data sets through estimating percentiles, (3) estimating confidence limits for the median and other percentiles, (4) estimating tolerance limits and associated numbers of samples, and (5) tests of significance and associated procedures for a variety of testing situations (counterparts to t-tests and analysis of variance). Some characteristics of several nonparametric tests are illustrated using the NAEG ²⁴¹Am aliquot data presented and discussed in the April issue of TRAN-STAT. Some of the statistical terms used here are defined in a glossary. The reference list also includes short descriptions of nonparametric books. 31 references, 3 figures, 1 table.
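Item (3) above, distribution-free confidence limits for the median, can be sketched from order statistics. This sketch uses the common normal approximation to the exact binomial ranks; the data are illustrative.

```python
import numpy as np

def median_ci(x, z=1.96):
    """Distribution-free CI for the median: take the order statistics at
    ranks r and n + 1 - r, with r ~ (n - z*sqrt(n))/2 (a normal
    approximation to the exact binomial ranks for ~95% coverage)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    r = max(int(np.floor((n - z * np.sqrt(n)) / 2)), 1)  # lower rank, 1-based
    return x[r - 1], x[n - r]

data = np.arange(1, 101)   # illustrative sample; median is 50.5
lo, hi = median_ci(data)
print(lo, hi)  # 40.0 61.0
```

No distributional assumption enters: the interval's coverage follows from the binomial distribution of the number of observations below the true median.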
Nixon, Richard M; Wonderling, David; Grieve, Richard D
2010-03-01
Cost-effectiveness analyses (CEA) alongside randomised controlled trials commonly estimate incremental net benefits (INB), with 95% confidence intervals, and compute cost-effectiveness acceptability curves and confidence ellipses. Two alternative non-parametric methods for estimating INB are to apply the central limit theorem (CLT) or to use the non-parametric bootstrap method, although it is unclear which method is preferable. This paper describes the statistical rationale underlying each of these methods and illustrates their application with a trial-based CEA. It compares the sampling uncertainty from using either technique in a Monte Carlo simulation. The experiments are repeated varying the sample size and the skewness of costs in the population. The results showed that, even when data were highly skewed, both methods accurately estimated the true standard errors (SEs) when sample sizes were moderate to large (n>50), and also gave good estimates for small data sets with low skewness. However, when sample sizes were relatively small and the data highly skewed, using the CLT rather than the bootstrap led to slightly more accurate SEs. We conclude that while in general using either method is appropriate, the CLT is easier to implement, and provides SEs that are at least as accurate as the bootstrap. (c) 2009 John Wiley & Sons, Ltd.
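The two competing SE estimates compared in the paper can themselves be contrasted in a few lines. This sketch assumes lognormal (skewed) per-patient incremental net benefits purely for illustration; it is not the trial data from the study.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical per-patient incremental net benefits; lognormal -> skewed
inb = rng.lognormal(mean=3.0, sigma=1.0, size=80)

# CLT-based standard error of the mean INB: s / sqrt(n)
se_clt = inb.std(ddof=1) / np.sqrt(len(inb))

# Non-parametric bootstrap standard error: SD of resampled means
idx = rng.integers(0, len(inb), size=(5000, len(inb)))
se_boot = inb[idx].mean(axis=1).std(ddof=1)

print(se_clt, se_boot)  # the two SEs agree closely at this sample size
```

Consistent with the paper's simulation results, at moderate sample sizes the two approaches give very similar standard errors even for skewed data, and the CLT version is trivially simpler to implement.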
Detection of semi-volatile organic compounds in permeable ...
Abstract The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprising three different permeable systems: permeable asphalt, porous concrete and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January 2010. This paper describes a subset of the water quality analysis, analysis of semivolatile organic compounds (SVOCs), to determine if hydrocarbons were in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected from 11 dates over a 3-year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical tests) were detected at a rate of less than 10%; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5% detections). Fundamental and exploratory statistical analyses were performed on these latter results by grouping results by surface type. The statistical analyses were limited due to the low frequency of detections and dilutions of samples, which impacted detection limits. The infiltrate data through three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware comparison hypothesis test. Additionally, Spearman rank-order non-parametric…
EEG Correlates of Fluctuation in Cognitive Performance in an Air Traffic Control Task
2014-11-01
using non-parametric statistical analysis to identify neurophysiological patterns due to the time-on-task effect. Significant changes in EEG power… Keywords: EEG, Cognitive Performance, Power Spectral Analysis, Non-Parametric Analysis. Document is available to the public through the Internet.
USDA-ARS?s Scientific Manuscript database
Parametric non-linear regression (PNR) techniques commonly are used to develop weed seedling emergence models. Such techniques, however, require statistical assumptions that are difficult to meet. To examine and overcome these limitations, we compared PNR with a nonparametric estimation technique. F...
Chaibub Neto, Elias
2015-01-01
In this paper we propose a vectorized implementation of the non-parametric bootstrap for statistics based on sample moments. Basically, we adopt the multinomial sampling formulation of the non-parametric bootstrap, and compute bootstrap replications of sample moment statistics by simply weighting the observed data according to multinomial counts instead of evaluating the statistic on a resampled version of the observed data. Using this formulation we can generate a matrix of bootstrap weights and compute the entire vector of bootstrap replications with a few matrix multiplications. Vectorization is particularly important for matrix-oriented programming languages such as R, where matrix/vector calculations tend to be faster than scalar operations implemented in a loop. We illustrate the application of the vectorized implementation in real and simulated data sets, when bootstrapping Pearson's sample correlation coefficient, and compare its performance against two state-of-the-art R implementations of the non-parametric bootstrap, as well as a straightforward one based on a for loop. Our investigations spanned varying sample sizes and numbers of bootstrap replications. The vectorized bootstrap compared favorably against the state-of-the-art implementations in all cases tested, and was remarkably faster for small sample sizes and considerably faster for moderate ones. The same results were observed in the comparison with the straightforward implementation, except for large sample sizes, where the vectorized bootstrap was slightly slower than the straightforward implementation due to increased time expenditures in the generation of weight matrices via multinomial sampling. PMID:26125965
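The multinomial-weighting trick described above is compact enough to show directly. A sketch for the sample mean, the simplest sample-moment statistic (the paper's own examples use R; this Python/numpy version, with illustrative data and sizes, carries the same idea):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.exponential(size=100)   # illustrative skewed sample
B, n = 2000, len(x)

# Multinomial sampling formulation: each bootstrap replication is a
# weighted mean of the ORIGINAL data, with weights = multinomial counts / n
W = rng.multinomial(n, np.full(n, 1.0 / n), size=B) / n  # (B, n) weights
boot_means = W @ x              # all B replications in one matrix product

# Equivalent loop-based resampling, for comparison
loop_means = np.array([x[rng.integers(0, n, n)].mean() for _ in range(B)])

print(boot_means.std(), loop_means.std())  # similar bootstrap SEs
```

Each row of W sums to one, so `W @ x` reproduces exactly the distribution of resampled means that the loop generates, but as a single vectorized operation.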
Quintela-del-Río, Alejandro; Francisco-Fernández, Mario
2011-02-01
The study of extreme values and prediction of ozone data is an important topic of research when dealing with environmental problems. Classical extreme value theory is usually used in air-pollution studies. It consists in fitting a parametric generalised extreme value (GEV) distribution to a data set of extreme values, and using the estimated distribution to compute return levels and other quantities of interest. Here, we propose to estimate these values using nonparametric functional data methods. Functional data analysis is a relatively new statistical methodology that generally deals with data consisting of curves or multi-dimensional variables. In this paper, we use this technique, jointly with nonparametric curve estimation, to provide alternatives to the usual parametric statistical tools. The nonparametric estimators are applied to real samples of maximum ozone values obtained from several monitoring stations belonging to the Automatic Urban and Rural Network (AURN) in the UK. The results show that nonparametric estimators work satisfactorily, outperforming the behaviour of classical parametric estimators. Functional data analysis is also used to predict stratospheric ozone concentrations. We show an application, using the data set of mean monthly ozone concentrations in Arosa, Switzerland, and the results are compared with those obtained by classical time series (ARIMA) analysis. Copyright © 2010 Elsevier Ltd. All rights reserved.
A Nonparametric Framework for Comparing Trends and Gaps across Tests
ERIC Educational Resources Information Center
Ho, Andrew Dean
2009-01-01
Problems of scale typically arise when comparing test score trends, gaps, and gap trends across different tests. To overcome some of these difficulties, test score distributions on the same score scale can be represented by nonparametric graphs or statistics that are invariant under monotone scale transformations. This article motivates and then…
A Nonparametric K-Sample Test for Equality of Slopes.
ERIC Educational Resources Information Center
Penfield, Douglas A.; Koffler, Stephen L.
1986-01-01
The development of a nonparametric K-sample test for equality of slopes using Puri's generalized L statistic is presented. The test is recommended when the assumptions underlying the parametric model are violated. This procedure replaces original data with either ranks (for data with heavy tails) or normal scores (for data with light tails).…
A Note on the Assumption of Identical Distributions for Nonparametric Tests of Location
ERIC Educational Resources Information Center
Nordstokke, David W.; Colp, S. Mitchell
2018-01-01
Often, when testing for shift in location, researchers will utilize nonparametric statistical tests in place of their parametric counterparts when there is evidence or belief that the assumptions of the parametric test are not met (i.e., normally distributed dependent variables). An underlying and often unattended to assumption of nonparametric…
2007-01-01
Background The US Food and Drug Administration approved the Charité artificial disc on October 26, 2004. This approval was based on an extensive analysis and review process; 20 years of disc usage worldwide; and the results of a prospective, randomized, controlled clinical trial that compared lumbar artificial disc replacement to fusion. The results of the investigational device exemption (IDE) study led to a conclusion that clinical outcomes following lumbar arthroplasty were at least as good as outcomes from fusion. Methods The author performed a new analysis of the Visual Analog Scale pain scores and the Oswestry Disability Index scores from the Charité artificial disc IDE study and used a nonparametric statistical test, because observed data distributions were not normal. The analysis included all of the enrolled subjects in both the nonrandomized and randomized phases of the study. Results Subjects from both the treatment and control groups improved from the baseline situation (P < .001) at all follow-up times (6 weeks to 24 months). Additionally, these pain and disability levels with artificial disc replacement were superior (P < .05) to the fusion treatment at all follow-up times including 2 years. Conclusions The a priori statistical plan for an IDE study may not adequately address the final distribution of the data. Therefore, statistical analyses more appropriate to the distribution may be necessary to develop meaningful statistical conclusions from the study. A nonparametric statistical analysis of the Charité artificial disc IDE outcomes scores demonstrates superiority for lumbar arthroplasty versus fusion at all follow-up time points to 24 months. PMID:25802574
Nonparametric method for failures diagnosis in the actuating subsystem of aircraft control system
NASA Astrophysics Data System (ADS)
Terentev, M. N.; Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.
2018-02-01
In this paper we design a nonparametric method for failures diagnosis in the aircraft control system that uses the measurements of the control signals and the aircraft states only. It doesn’t require a priori information of the aircraft model parameters, training or statistical calculations, and is based on analytical nonparametric one-step-ahead state prediction approach. This makes it possible to predict the behavior of unidentified and failure dynamic systems, to weaken the requirements to control signals, and to reduce the diagnostic time and problem complexity.
Lee, L.; Helsel, D.
2007-01-01
Analysis of low concentrations of trace contaminants in environmental media often results in left-censored data that are below some limit of analytical precision. Interpretation of values becomes complicated when there are multiple detection limits in the data, perhaps as a result of changing analytical precision over time. Parametric and semi-parametric methods, such as maximum likelihood estimation and robust regression on order statistics, can be employed to model distributions of multiply censored data and provide estimates of summary statistics. However, these methods are based on assumptions about the underlying distribution of data. Nonparametric methods provide an alternative that does not require such assumptions. A standard nonparametric method for estimating summary statistics of multiply censored data is the Kaplan-Meier (K-M) method. This method has seen widespread usage in the medical sciences within a general framework termed "survival analysis" where it is employed with right-censored time-to-failure data. However, K-M methods are equally valid for the left-censored data common in the geosciences. Our S-language software provides an analytical framework based on K-M methods that is tailored to the needs of the earth and environmental sciences community. This includes routines for the generation of empirical cumulative distribution functions, prediction or exceedance probabilities, and related confidence limits computation. Additionally, our software contains K-M-based routines for nonparametric hypothesis testing among an unlimited number of grouping variables. A primary characteristic of K-M methods is that they do not perform extrapolation and interpolation. Thus, these routines cannot be used to model statistics beyond the observed data range or when linear interpolation is desired. For such applications, the aforementioned parametric and semi-parametric methods must be used.
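A common way to apply K-M to left-censored concentrations is to flip the data about a constant larger than the maximum, turning "below detection limit" values into right-censored observations. The following is a minimal sketch with hypothetical data, not the authors' S-language routines; the flipping constant M and the tiny K-M implementation are illustrative.

```python
import numpy as np

def km_survival(times, event):
    """Minimal Kaplan-Meier estimator for right-censored data.
    Returns (unique event times, survival probabilities)."""
    order = np.argsort(times)
    t, e = np.asarray(times)[order], np.asarray(event)[order]
    surv, s = [], 1.0
    for u in np.unique(t[e == 1]):      # step down at each event time
        at_risk = np.sum(t >= u)
        deaths = np.sum((t == u) & (e == 1))
        s *= 1 - deaths / at_risk
        surv.append(s)
    return np.unique(t[e == 1]), np.array(surv)

# Left-censored concentrations: flip about a constant M > max(x), so a
# "< detection limit" value becomes a right-censored flipped value.
conc = np.array([0.5, 1.2, 2.0, 3.1, 4.0])  # hypothetical concentrations
detected = np.array([0, 1, 1, 0, 1])        # 0 = below detection limit
M = 10.0
t_flip, s = km_survival(M - conc, detected)
print(list(zip(M - t_flip, s)))  # K-M estimates on the original scale
```

Survival on the flipped scale corresponds to the cumulative distribution on the original concentration scale, which is why the flip yields a nonparametric empirical CDF for left-censored data.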
Statistics Anxiety and Business Statistics: The International Student
ERIC Educational Resources Information Center
Bell, James A.
2008-01-01
Does the international student suffer from statistics anxiety? To investigate this, the Statistics Anxiety Rating Scale (STARS) was administered to sixty-six beginning statistics students, including twelve international students and fifty-four domestic students. Due to the small number of international students, nonparametric methods were used to…
Investigation of a Nonparametric Procedure for Assessing Goodness-of-Fit in Item Response Theory
ERIC Educational Resources Information Center
Wells, Craig S.; Bolt, Daniel M.
2008-01-01
Tests of model misfit are often performed to validate the use of a particular model in item response theory. Douglas and Cohen (2001) introduced a general nonparametric approach for detecting misfit under the two-parameter logistic model. However, the statistical properties of their approach, and empirical comparisons to other methods, have not…
A Powerful Test for Comparing Multiple Regression Functions.
Maity, Arnab
2012-09-01
In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test for other nonparametric regression setups, e.g., nonparametric logistic regression, where the loglikelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
Estimating population diversity with CatchAll
Bunge, John; Woodard, Linda; Böhning, Dankmar; Foster, James A.; Connolly, Sean; Allen, Heather K.
2012-01-01
Motivation: The massive data produced by next-generation sequencing require advanced statistical tools. We address estimating the total diversity or species richness in a population. To date, only relatively simple methods have been implemented in available software. There is a need for software employing modern, computationally intensive statistical analyses including error, goodness-of-fit and robustness assessments. Results: We present CatchAll, a fast, easy-to-use, platform-independent program that computes maximum likelihood estimates for finite-mixture models, weighted linear regression-based analyses and coverage-based non-parametric methods, along with outlier diagnostics. Given sample ‘frequency count’ data, CatchAll computes 12 different diversity estimates and applies a model-selection algorithm. CatchAll also derives discounted diversity estimates to adjust for possibly uncertain low-frequency counts. It is accompanied by an Excel-based graphics program. Availability: Free executable downloads for Linux, Windows and Mac OS, with manual and source code, at www.northeastern.edu/catchall. Contact: jab18@cornell.edu PMID:22333246
Parasites as valuable stock markers for fisheries in Australasia, East Asia and the Pacific Islands.
Lester, R J G; Moore, B R
2015-01-01
Over 30 studies in Australasia, East Asia and the Pacific Islands region have collected and analysed parasite data to determine the ranges of individual fish, many leading to conclusions about stock delineation. Parasites used as biological tags have included both those known to have long residence times in the fish and those thought to be relatively transient. In many cases the parasitological conclusions have been supported by other methods especially analysis of the chemical constituents of otoliths, and to a lesser extent, genetic data. In analysing parasite data, authors have applied multiple different statistical methodologies, including summary statistics, and univariate and multivariate approaches. Recently, a growing number of researchers have found non-parametric methods, such as analysis of similarities and cluster analysis, to be valuable. Future studies into the residence times, life cycles and geographical distributions of parasites together with more robust analytical methods will yield much important information to clarify stock structures in the area.
Lucyshyn, Joseph M; Fossett, Brenda; Bakeman, Roger; Cheremshynski, Christy; Miller, Lynn; Lohrmann, Sharon; Binnendyk, Lauren; Khan, Sophia; Chinn, Stephen; Kwon, Samantha; Irvin, Larry K
2015-12-01
The efficacy and consequential validity of an ecological approach to behavioral intervention with families of children with developmental disabilities was examined. The approach aimed to transform coercive into constructive parent-child interaction in family routines. Ten families participated, including 10 mothers and fathers and 10 children 3-8 years old with developmental disabilities. Thirty-six family routines were selected (2 to 4 per family). Dependent measures included child problem behavior, routine steps completed, and coercive and constructive parent-child interaction. For each family, a single case, multiple baseline design was employed with three phases: baseline, intervention, and follow-up. Visual analysis evaluated the functional relation between intervention and improvements in child behavior and routine participation. Nonparametric tests across families evaluated the statistical significance of these improvements. Sequential analyses within families and univariate analyses across families examined changes from baseline to intervention in the percentage and odds ratio of coercive and constructive parent-child interaction. Multiple baseline results documented functional or basic effects for 8 of 10 families. Nonparametric tests showed these changes to be significant. Follow-up showed durability at 11 to 24 months postintervention. Sequential analyses documented the transformation of coercive into constructive processes for 9 of 10 families. Univariate analyses across families showed significant improvements in 2- and 4-step coercive and constructive processes but not in odds ratio. Results offer evidence of the efficacy of the approach and consequential validity of the ecological unit of analysis, parent-child interaction in family routines. Future studies should improve efficiency, and outcomes for families experiencing family systems challenges.
Nonparametric Statistics Test Software Package.
1983-09-01
statistics because of their acceptance in the academic world, the availability of computer support, and flexibility in model building. Nonparametric…
John Hof; Curtis Flather; Tony Baltic; Rudy King
2006-01-01
The 2005 Forest and Rangeland Condition Indicator Model is a set of classification trees for forest and rangeland condition indicators at the national scale. This report documents the development of the database and the nonparametric statistical estimation for this analytical structure, with emphasis on three special characteristics of condition indicator production...
ERIC Educational Resources Information Center
Bakir, Saad T.
2010-01-01
We propose a nonparametric (or distribution-free) procedure for testing the equality of several population variances (or scale parameters). The proposed test is a modification of Bakir's (1989, Commun. Statist., Simul-Comp., 18, 757-775) analysis of means by ranks (ANOMR) procedure for testing the equality of several population means. A proof is…
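The ANOMR-based procedure described in this abstract is not part of standard statistical libraries. As a hedged illustration of the same question (a rank-based, distribution-free test for equality of several population scale parameters), the sketch below uses the standard Fligner-Killeen test instead; the data and seed are invented for illustration only.

```python
import numpy as np
from scipy.stats import fligner

rng = np.random.default_rng(5)
# Three synthetic samples with equal means but unequal spread.
a = rng.normal(0, 1.0, 80)
b = rng.normal(0, 1.0, 80)
c = rng.normal(0, 3.0, 80)

# Fligner-Killeen: a rank-based (distribution-free) test of equal scale.
stat, p = fligner(a, b, c)
print(f"Fligner-Killeen statistic = {stat:.2f}, p = {p:.4g}")
```

With one group three times as dispersed as the others, the test rejects equality of scale at conventional significance levels.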
Bayesian Unimodal Density Regression for Causal Inference
ERIC Educational Resources Information Center
Karabatsos, George; Walker, Stephen G.
2011-01-01
Karabatsos and Walker (2011) introduced a new Bayesian nonparametric (BNP) regression model. Through analyses of real and simulated data, they showed that the BNP regression model outperforms other parametric and nonparametric regression models of common use, in terms of predictive accuracy of the outcome (dependent) variable. The other,…
An ANOVA approach for statistical comparisons of brain networks.
Fraiman, Daniel; Fraiman, Ricardo
2018-03-16
The study of brain networks has developed extensively over the last couple of decades. By contrast, techniques for the statistical analysis of these networks are less developed. In this paper, we focus on the statistical comparison of brain networks in a nonparametric framework and discuss the associated detection and identification problems. We tested network differences between groups with an analysis of variance (ANOVA) test we developed specifically for networks. We also propose and analyse the behaviour of a new statistical procedure designed to identify different subnetworks. As an example, we show the application of this tool in resting-state fMRI data obtained from the Human Connectome Project. We identify, among other variables, that the amount of sleep the days before the scan is a relevant variable that must be controlled. Finally, we discuss the potential bias in neuroimaging findings that is generated by some behavioural and brain structure variables. Our method can also be applied to other kinds of networks such as protein interaction networks, gene networks or social networks.
Estimating trends in the global mean temperature record
NASA Astrophysics Data System (ADS)
Poppick, Andrew; Moyer, Elisabeth J.; Stein, Michael L.
2017-06-01
Given uncertainties in physical theory and numerical climate simulations, the historical temperature record is often used as a source of empirical information about climate change. Many historical trend analyses appear to de-emphasize physical and statistical assumptions: examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for internal variability in nonparametric rather than parametric ways. However, given a limited data record and the presence of internal variability, estimating radiatively forced temperature trends in the historical record necessarily requires some assumptions. Ostensibly empirical methods can also involve an inherent conflict in assumptions: they require data records that are short enough for naive trend models to be applicable, but long enough for long-timescale internal variability to be accounted for. In the context of global mean temperatures, empirical methods that appear to de-emphasize assumptions can therefore produce misleading inferences, because the trend over the twentieth century is complex and the scale of temporal correlation is long relative to the length of the data record. We illustrate here how a simple but physically motivated trend model can provide better-fitting and more broadly applicable trend estimates and can allow for a wider array of questions to be addressed. In particular, the model allows one to distinguish, within a single statistical framework, between uncertainties in the shorter-term vs. longer-term response to radiative forcing, with implications not only on historical trends but also on uncertainties in future projections. We also investigate the consequence on inferred uncertainties of the choice of a statistical description of internal variability. 
While nonparametric methods may seem to avoid making explicit assumptions, we demonstrate how even misspecified parametric statistical methods, if attuned to the important characteristics of internal variability, can result in more accurate uncertainty statements about trends.
STATISTICAL ESTIMATION AND VISUALIZATION OF GROUND-WATER CONTAMINATION DATA
This work presents methods of visualizing and animating statistical estimates of ground water and/or soil contamination over a region from observations of the contaminant for that region. The primary statistical methods used to produce the regional estimates are nonparametric re...
Barbie, Dana L.; Wehmeyer, Loren L.
2012-01-01
Trends in selected streamflow statistics during 1922-2009 were evaluated at 19 long-term streamflow-gaging stations considered indicative of outflows from Texas to Arkansas, Louisiana, Galveston Bay, and the Gulf of Mexico. The U.S. Geological Survey, in cooperation with the Texas Water Development Board, evaluated streamflow data from streamflow-gaging stations with more than 50 years of record that were active as of 2009. The outflows into Arkansas and Louisiana were represented by 3 streamflow-gaging stations, and outflows into the Gulf of Mexico, including Galveston Bay, were represented by 16 streamflow-gaging stations. Monotonic trend analyses were done using the following three streamflow statistics generated from daily mean values of streamflow: (1) annual mean daily discharge, (2) annual maximum daily discharge, and (3) annual minimum daily discharge. The trend analyses were based on the nonparametric Kendall's Tau test, which is useful for the detection of monotonic upward or downward trends with time. A total of 69 trend analyses by Kendall's Tau were computed - 19 periods of streamflow multiplied by the 3 streamflow statistics plus 12 additional trend analyses because the periods of record for 2 streamflow-gaging stations were divided into periods representing pre- and post-reservoir impoundment. Unless otherwise described, each trend analysis used the entire period of record for each streamflow-gaging station. The monotonic trend analysis detected 11 statistically significant downward trends, 37 instances of no trend, and 21 statistically significant upward trends. One general region studied, which seemingly has relatively more upward trends for many of the streamflow statistics analyzed, includes the rivers and associated creeks and bayous to Galveston Bay in the Houston metropolitan area. 
Lastly, the most western river basins considered (the Nueces and Rio Grande) had statistically significant downward trends for many of the streamflow statistics analyzed.
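The trend analyses described above rest on Kendall's Tau, which is available in standard libraries. The sketch below illustrates the general procedure on a synthetic annual-discharge series (the data are invented, not the study's gaging-station records):

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)
years = np.arange(1922, 2010)
# Synthetic annual mean daily discharge with a mild upward trend plus noise
# (illustrative only; a real analysis would use the observed record).
flow = 100 + 0.3 * (years - years[0]) + rng.normal(0, 10, years.size)

tau, p_value = kendalltau(years, flow)
print(f"Kendall's tau = {tau:.3f}, p = {p_value:.4f}")
if p_value < 0.05:
    direction = "upward" if tau > 0 else "downward"
    print(f"Statistically significant {direction} monotonic trend")
```

Because the test uses only the ranks of the observations, it detects monotonic trends without assuming normality or linearity, which is why it suits skewed streamflow statistics.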
Goudriaan, Marije; Van den Hauwe, Marleen; Simon-Martinez, Cristina; Huenaerts, Catherine; Molenaers, Guy; Goemans, Nathalie; Desloovere, Kaat
2018-04-30
Prolonged ambulation is considered important in children with Duchenne muscular dystrophy (DMD). However, previous studies analyzing DMD gait were sensitive to false positive outcomes, caused by uncorrected multiple comparisons, regional focus bias, and inter-component covariance bias. Also, while muscle weakness is often suggested to be the main cause for the altered gait pattern in DMD, this was never verified. Our research question was twofold: 1) are we able to confirm the sagittal kinematic and kinetic gait alterations described in a previous review with statistical non-parametric mapping (SnPM)? And 2) are these gait deviations related to lower limb weakness? We compared gait kinematics and kinetics of 15 children with DMD and 15 typically developing (TD) children (5-17 years), with a two-sample Hotelling's T2 test and post-hoc two-tailed, two-sample t-test. We used canonical correlation analyses to study the relationship between weakness and altered gait parameters. For all analyses, α-level was corrected for multiple comparisons, resulting in α = 0.005. We only found one of the previously reported kinematic deviations: the children with DMD had an increased knee flexion angle during swing (p = 0.0006). Observed gait deviations that were not reported in the review were an increased hip flexion angle during stance (p = 0.0009) and swing (p = 0.0001), altered combined knee and ankle torques (p = 0.0002), and decreased power absorption during stance (p = 0.0001). No relationships between weakness and these gait deviations were found. We were not able to replicate the gait deviations in DMD previously reported in literature, thus DMD gait remains undefined. Further, weakness does not seem to be linearly related to altered gait features. The progressive nature of the disease requires larger study populations and longitudinal analyses to gain more insight into DMD gait and its underlying causes. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Fernández-Llamazares, Álvaro; Belmonte, Jordina; Delgado, Rosario; De Linares, Concepción
2014-04-01
Airborne pollen records are a suitable indicator for the study of climate change. The present work focuses on the role of annual pollen indices for the detection of bioclimatic trends through the analysis of the aerobiological spectra of 11 taxa of great biogeographical relevance in Catalonia over an 18-year period (1994-2011), by means of different parametric and non-parametric statistical methods. Among others, two non-parametric rank-based statistical tests were performed for detecting monotonic trends in time series data of the selected airborne pollen types, and we have observed that they have similar power in detecting trends. Except for those cases in which the pollen data can be well-modeled by a normal distribution, it is better to apply non-parametric statistical methods to aerobiological studies. Our results provide a reliable representation of the pollen trends in the region and suggest that greater pollen quantities are being released into the atmosphere in recent years, especially by Mediterranean taxa such as Pinus, Total Quercus and Evergreen Quercus, although the trends may differ geographically. Longer aerobiological monitoring periods are required to corroborate these results and survey the increasing levels of certain pollen types that could exert an impact in terms of public health.
A nonparametric analysis of plot basal area growth using tree based models
G. L. Gadbury; H. K. lyer; H. T. Schreuder; C. Y. Ueng
1997-01-01
Tree based statistical models can be used to investigate data structure and predict future observations. We used nonparametric and nonlinear models to reexamine the data sets on tree growth used by Bechtold et al. (1991) and Ruark et al. (1991). The growth data were collected by Forest Inventory and Analysis (FIA) teams from 1962 to 1972 (4th cycle) and 1972 to 1982 (...
ERIC Educational Resources Information Center
Sinharay, Sandip
2017-01-01
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
A weighted U-statistic for genetic association analyses of sequencing data.
Wei, Changshuai; Li, Ming; He, Zihuai; Vsevolozhskaya, Olga; Schaid, Daniel J; Lu, Qing
2014-12-01
With advancements in next-generation sequencing technology, a massive amount of sequencing data is generated, which offers a great opportunity to comprehensively investigate the role of rare variants in the genetic etiology of complex diseases. Nevertheless, the high-dimensional sequencing data poses a great challenge for statistical analysis. The association analyses based on traditional statistical methods suffer substantial power loss because of the low frequency of genetic variants and the extremely high dimensionality of the data. We developed a Weighted U Sequencing test, referred to as WU-SEQ, for the high-dimensional association analysis of sequencing data. Based on a nonparametric U-statistic, WU-SEQ makes no assumption of the underlying disease model and phenotype distribution, and can be applied to a variety of phenotypes. Through simulation studies and an empirical study, we showed that WU-SEQ outperformed a commonly used sequence kernel association test (SKAT) method when the underlying assumptions were violated (e.g., the phenotype followed a heavy-tailed distribution). Even when the assumptions were satisfied, WU-SEQ still attained comparable performance to SKAT. Finally, we applied WU-SEQ to sequencing data from the Dallas Heart Study (DHS), and detected an association between ANGPTL4 and very low-density lipoprotein cholesterol. © 2014 WILEY PERIODICALS, INC.
Cluster mass inference via random field theory.
Zhang, Hui; Nichols, Thomas E; Johnson, Timothy D
2009-01-01
Cluster extent and voxel intensity are two widely used statistics in neuroimaging inference. Cluster extent is sensitive to spatially extended signals while voxel intensity is better for intense but focal signals. In order to leverage strength from both statistics, several nonparametric permutation methods have been proposed to combine the two methods. Simulation studies have shown that of the different cluster permutation methods, the cluster mass statistic is generally the best. However, to date, there is no parametric cluster mass inference available. In this paper, we propose a cluster mass inference method based on random field theory (RFT). We develop this method for Gaussian images, evaluate it on Gaussian and Gaussianized t-statistic images and investigate its statistical properties via simulation studies and real data. Simulation results show that the method is valid under the null hypothesis and demonstrate that it can be more powerful than the cluster extent inference method. Further, analyses with a single subject and a group fMRI dataset demonstrate better power than traditional cluster size inference, and good accuracy relative to a gold-standard permutation test.
Associations between host characteristics and antimicrobial resistance of Salmonella typhimurium.
Ruddat, I; Tietze, E; Ziehm, D; Kreienbrock, L
2014-10-01
A collection of Salmonella Typhimurium isolates obtained from sporadic salmonellosis cases in humans from Lower Saxony, Germany between June 2008 and May 2010 was used to perform an exploratory risk-factor analysis on antimicrobial resistance (AMR) using comprehensive host information on sociodemographic attributes, medical history, food habits and animal contact. Multivariate resistance profiles of minimum inhibitory concentrations for 13 antimicrobial agents were analysed using a non-parametric approach with multifactorial models adjusted for phage types. Statistically significant associations were observed for consumption of antimicrobial agents, region type and three factors on egg-purchasing behaviour, indicating that besides antimicrobial use the proximity to other community members, health consciousness and other lifestyle-related attributes may play a role in the dissemination of resistances. Furthermore, a statistically significant increase in AMR from the first study year to the second year was observed.
Lie, Stein Atle; Eriksen, Hege R; Ursin, Holger; Hagen, Eli Molde
2008-05-01
Analysing and presenting data on different outcomes after sick-leave is challenging. The use of extended statistical methods supplies additional information and allows further exploitation of data. Four hundred and fifty-seven patients, sick-listed for 8-12 weeks for low back pain, were randomized to intervention (n=237) or control (n=220). Outcome was measured as: "sick-listed", "returned to work", or "disability pension". The individuals shifted between the three states between one and 22 times (mean 6.4 times). In a multi-state model, shifting between the states was set up in a transition intensity matrix. The probability of being in any of the states was calculated as a transition probability matrix. The effects of the intervention were modelled using a non-parametric model. There was an effect of the intervention for leaving the state sick-listed and shifting to returned to work (relative risk (RR)=1.27, 95% confidence interval (CI) 1.09-1.47). The nonparametric estimates showed an effect of the intervention for leaving sick-listed and shifting to returned to work in the first 6 months. We found a protective effect of the intervention for shifting back to sick-listed between 6 and 18 months. The analyses showed that the probability of staying in the state returned to work was not different between the intervention and control groups at the end of the follow-up (3 years). We demonstrate that these alternative analyses give additional results and increase the strength of the analyses. The simple intervention did not decrease the probability of being on sick-leave in the long term; however, it decreased the time that individuals were on sick-leave.
The Probability of Exceedance as a Nonparametric Person-Fit Statistic for Tests of Moderate Length
ERIC Educational Resources Information Center
Tendeiro, Jorge N.; Meijer, Rob R.
2013-01-01
To classify an item score pattern as not fitting a nonparametric item response theory (NIRT) model, the probability of exceedance (PE) of an observed response vector x can be determined as the sum of the probabilities of all response vectors that are, at most, as likely as x, conditional on the test's total score. Vector x is to be considered…
ERIC Educational Resources Information Center
Sengul Avsar, Asiye; Tavsancil, Ezel
2017-01-01
This study analysed polytomous items' psychometric properties according to nonparametric item response theory (NIRT) models. Thus, simulated datasets--three different test lengths (10, 20 and 30 items), three sample distributions (normal, right and left skewed) and three sample sizes (100, 250 and 500)--were generated by conducting 20…
Sarkar, Rajarshi
2013-07-01
The validity of the entire renal function tests as a diagnostic tool depends substantially on the Biological Reference Interval (BRI) of urea. Establishment of BRI of urea is difficult partly because exclusion criteria for selection of reference data are quite rigid and partly due to the compartmentalization considerations regarding age and sex of the reference individuals. Moreover, construction of Biological Reference Curve (BRC) of urea is imperative to highlight the partitioning requirements. This a priori study examines the data collected by measuring serum urea of 3202 age and sex matched individuals, aged between 1 and 80 years, by a kinetic UV Urease/GLDH method on a Roche Cobas 6000 auto-analyzer. Mann-Whitney U test of the reference data confirmed the partitioning requirement by both age and sex. Further statistical analysis revealed the incompatibility of the data for a proposed parametric model. Hence the data was non-parametrically analysed. BRI was found to be identical for both sexes till the 2nd decade, and the BRI for males increased progressively 6th decade onwards. Four non-parametric models were postulated for construction of BRC: Gaussian kernel, double kernel, local mean and local constant, of which the last one generated the best-fitting curves. Clinical decision making should become easier and diagnostic implications of renal function tests should become more meaningful if this BRI is followed and the BRC is used as a desktop tool in conjunction with similar data for serum creatinine.
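As a minimal sketch of the two nonparametric steps described above (a Mann-Whitney U test to confirm the need for partitioning by sex, followed by a percentile-based reference interval), the example below uses synthetic lognormal urea values; the distributions and sample sizes are assumptions for illustration, not the study's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(1)
# Synthetic serum urea values (mg/dL); illustrative only.
females = rng.lognormal(mean=3.2, sigma=0.25, size=800)
males = rng.lognormal(mean=3.35, sigma=0.25, size=800)

# Mann-Whitney U: do the sexes need separate (partitioned) intervals?
stat, p = mannwhitneyu(females, males)
print(f"Mann-Whitney U: p = {p:.3g}")

# Nonparametric reference interval: central 95% of each empirical distribution.
for label, sample in [("female", females), ("male", males)]:
    lo, hi = np.percentile(sample, [2.5, 97.5])
    print(f"{label} BRI: {lo:.1f} - {hi:.1f} mg/dL")
```

Because the interval comes straight from the empirical 2.5th and 97.5th percentiles, no distributional assumption is needed, which is why the nonparametric route is preferred when the data fail a parametric fit.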
The chi-square test of independence.
McHugh, Mary L
2013-01-01
The Chi-square statistic is a non-parametric (distribution free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation of both dichotomous independent variables, and of multiple group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others. The Chi-square is a significance statistic, and should be followed with a strength statistic. The Cramer's V is the most common strength test used to test the data when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two group and multiple group studies. Limitations include its sample size requirements, difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and tendency of the Cramer's V to produce relatively low correlation measures, even for highly significant results.
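The procedure described in this abstract (a Chi-square test of independence followed by Cramér's V as a strength statistic) can be sketched as follows, using a hypothetical contingency table:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table: rows = groups, columns = nominal outcome.
table = np.array([[30, 15, 5],
                  [10, 25, 15]])

chi2, p, dof, expected = chi2_contingency(table)

# Cramér's V as a follow-up strength statistic.
n = table.sum()
k = min(table.shape) - 1  # smaller dimension minus one
cramers_v = np.sqrt(chi2 / (n * k))
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}, Cramér's V = {cramers_v:.3f}")
```

Note that the significance test and the strength statistic answer different questions: a tiny p-value with a modest V is common in large samples, which is exactly the pairing the abstract recommends reporting together.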
Donnelly, Aoife; Misstear, Bruce; Broderick, Brian
2011-02-15
Background concentrations of nitrogen dioxide (NO2) are not constant but vary temporally and spatially. The current paper presents a powerful tool for the quantification of the effects of wind direction and wind speed on background NO2 concentrations, particularly in cases where monitoring data are limited. In contrast to previous studies which applied similar methods to sites directly affected by local pollution sources, the current study focuses on background sites with the aim of improving methods for predicting background concentrations adopted in air quality modelling studies. The relationship between measured NO2 concentration in air at three such sites in Ireland and locally measured wind direction has been quantified using nonparametric regression methods. The major aim was to analyse a method for quantifying the effects of local wind direction on background levels of NO2 in Ireland. The method was expanded to include wind speed as an added predictor variable. A Gaussian kernel function is used in the analysis and circular statistics employed for the wind direction variable. Wind direction and wind speed were both found to have a statistically significant effect on background levels of NO2 at all three sites. Frequently environmental impact assessments are based on short term baseline monitoring producing a limited dataset. The presented non-parametric regression methods, in contrast to the frequently used methods such as binning of the data, allow concentrations for missing data pairs to be estimated and distinction between spurious and true peaks in concentrations to be made. The methods were found to provide a realistic estimation of long term concentration variation with wind direction and speed, even for cases where the data set is limited.
Accurate identification of the actual variation at each location and causative factors could be made, thus supporting the improved definition of background concentrations for use in air quality modelling studies. Copyright © 2010 Elsevier B.V. All rights reserved.
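A minimal sketch of the approach described above, Nadaraya-Watson regression with a Gaussian kernel applied to an angular (circular) predictor, is shown below on synthetic NO2-versus-wind-direction data; the bandwidth and the data-generating model are assumptions for illustration, not the study's method or measurements.

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic NO2 concentration vs wind direction (degrees); illustrative only.
theta = rng.uniform(0, 360, 500)
no2 = 20 + 10 * np.cos(np.radians(theta - 45)) + rng.normal(0, 3, 500)

def circular_kernel_regression(theta_obs, y, theta_grid, bandwidth_deg=20.0):
    """Nadaraya-Watson estimate with a Gaussian kernel on angular distance."""
    # Shortest angular distance between each grid point and each observation.
    diff = np.abs(theta_grid[:, None] - theta_obs[None, :])
    diff = np.minimum(diff, 360.0 - diff)
    w = np.exp(-0.5 * (diff / bandwidth_deg) ** 2)
    return (w * y).sum(axis=1) / w.sum(axis=1)

grid = np.arange(0, 360, 10.0)
est = circular_kernel_regression(theta, no2, grid)
print(f"Estimated peak near {grid[est.argmax()]:.0f} degrees")
```

Wrapping the angular distance at 360 degrees is the "circular statistics" ingredient: unlike binning, the smoother treats 355 and 5 degrees as neighbours and yields an estimate at every direction, including directions with few observations.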
Karabatsos, George
2017-02-01
Most applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options.
This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
NASA Astrophysics Data System (ADS)
Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar
2015-06-01
In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed by considering the univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], analysis has been further extended to the multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density function for all flood variables, the concept of copula has been introduced. Although the advancement from univariate to multivariate analyses drew formidable attention from the FFA research community, the basic limitation was that the analyses were performed using only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; however, the nonparametric distribution may not always be a good fit or capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, the potential of obtaining the best fit using nonparametric distributions might be improved because such distributions reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of combining the multivariate nonparametric approach with multivariate parametric and copula-based approaches, thereby resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, this approach can also be applied to regional FFA because regional estimations ideally include at-site estimations.
The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution with a comprehensive set of parametric and nonparametric distributions for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from the Allegheny River at Salamanca, New York, USA. The results show that for both the univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over the continental USA, which shows that for seven rivers all the flood variables followed the nonparametric Gaussian kernel, whereas for the other rivers parametric distributions provided the best fit for one or two flood variables. Thus the summary of results shows that the nonparametric method cannot substitute for the parametric and copula-based approaches, but should be considered during any at-site FFA to provide the broadest choice for the best estimation of the flood return periods.
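Step (ii), fitting a nonparametric Gaussian-kernel distribution to a single flood variable and reading off an exceedance probability and return period, can be sketched as follows; the peak-flow series, threshold, and distributional parameters are invented for illustration, not the Allegheny River record.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
# Synthetic annual peak flows for a 110-year record (illustrative only).
peaks = rng.gamma(shape=3.0, scale=500.0, size=110)

# Gaussian kernel density estimate (bandwidth chosen by Scott's rule).
kde = gaussian_kde(peaks)

# Exceedance probability and return period for a given flood magnitude.
q = 3000.0
p_exceed = kde.integrate_box_1d(q, np.inf)
return_period = 1.0 / p_exceed
print(f"P(peak > {q:.0f}) = {p_exceed:.3f}; return period about {return_period:.1f} years")
```

Because the kernel estimate reproduces the sample's own shape, no parametric family needs to be assumed, which is the property the abstract highlights; the multivariate version extends the same idea to joint densities of peak, volume, and duration.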
Tempo-spatial analysis of Fennoscandian intraplate seismicity
NASA Astrophysics Data System (ADS)
Roberts, Roland; Lund, Björn
2017-04-01
Coupled spatial-temporal patterns of the occurrence of earthquakes in Fennoscandia are analysed using non-parametric methods. The occurrence of larger events is unambiguously and very strongly temporally clustered, with major implications for the assessment of seismic hazard in areas such as Fennoscandia. In addition, there is a clear pattern of geographical migration of activity. Data from the Swedish National Seismic Network and a collated international catalogue are analysed. Results show consistent patterns on different spatial and temporal scales. We are currently investigating these patterns in order to assess the statistical significance of the tempo-spatial patterns, and to what extent these may be consistent with stress transfer mechanisms such as Coulomb stress and pore fluid migration. Indications are that some further mechanism is necessary in order to explain the data, perhaps related to post-glacial uplift, which is up to 1 cm/year.
NASA Astrophysics Data System (ADS)
Feng, Jinchao; Lansford, Joshua; Mironenko, Alexander; Pourkargar, Davood Babaei; Vlachos, Dionisios G.; Katsoulakis, Markos A.
2018-03-01
We propose non-parametric methods for both local and global sensitivity analysis of chemical reaction models with correlated parameter dependencies. The developed mathematical and statistical tools are applied to a benchmark Langmuir competitive adsorption model on a close packed platinum surface, whose parameters, estimated from quantum-scale computations, are correlated and are limited in size (small data). The proposed mathematical methodology employs gradient-based methods to compute sensitivity indices. We observe that ranking influential parameters depends critically on whether or not correlations between parameters are taken into account. The impact of uncertainty in the correlation and the necessity of the proposed non-parametric perspective are demonstrated.
NASA Astrophysics Data System (ADS)
Sumantari, Y. D.; Slamet, I.; Sugiyanto
2017-06-01
Semiparametric regression is a statistical analysis method that consists of parametric and nonparametric regression. There are various approach techniques in nonparametric regression; one of these is the spline. Central Java is one of the most densely populated provinces in Indonesia. Population density in this province can be modeled by semiparametric regression because it consists of parametric and nonparametric components. Therefore, the purpose of this paper is to determine the factors that influence population density in Central Java using the semiparametric spline regression model. The results show that the factors which influence population density in Central Java are the number of active Family Planning (FP) participants and the district minimum wage.
Emura, Takeshi; Konno, Yoshihiko; Michimae, Hirofumi
2015-07-01
Doubly truncated data consist of samples whose observed values fall between the right- and left-truncation limits. With such samples, the distribution function of interest is estimated using the nonparametric maximum likelihood estimator (NPMLE), which is obtained through a self-consistency algorithm. Owing to the complicated asymptotic distribution of the NPMLE, the bootstrap method has been suggested for statistical inference. This paper proposes a closed-form estimator for the asymptotic covariance function of the NPMLE, which is a computationally attractive alternative to bootstrapping. Furthermore, we develop various statistical inference procedures, such as confidence intervals, goodness-of-fit tests, and confidence bands, to demonstrate the usefulness of the proposed covariance estimator. Simulations are performed to compare the proposed method with both the bootstrap and jackknife methods. The methods are illustrated using the childhood cancer dataset.
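The self-consistency algorithm mentioned above can be sketched in a few lines. This is a minimal Efron-Petrosian-style iteration for illustration only; the function name and the toy data are hypothetical, and the paper's closed-form covariance estimator is not reproduced here:

```python
import numpy as np

def npmle_double_truncation(x, u, v, tol=1e-10, max_iter=5000):
    """Self-consistency iteration for the NPMLE of P(X = x_j) when each x_i
    is observed only if u_i <= x_i <= v_i (double truncation)."""
    x, u, v = map(np.asarray, (x, u, v))
    n = len(x)
    # J[i, j] = 1 if x_j falls inside subject i's truncation window
    J = (u[:, None] <= x[None, :]) & (x[None, :] <= v[:, None])
    f = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        P = J @ f                      # P_i = mass inside window i
        f_new = 1.0 / (J.T @ (1.0 / P))
        f_new /= f_new.sum()
        if np.max(np.abs(f_new - f)) < tol:
            f = f_new
            break
        f = f_new
    return f

# With no effective truncation the NPMLE reduces to the empirical distribution
x = np.array([1.0, 2.0, 3.0, 4.0])
f = npmle_double_truncation(x, u=np.full(4, -np.inf), v=np.full(4, np.inf))
print(f)  # -> [0.25 0.25 0.25 0.25]
```

With genuine truncation windows the same iteration reweights mass toward values that fewer subjects could have observed.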
Schaid, Daniel J
2010-01-01
Measures of genomic similarity are the basis of many statistical analytic methods. We review the mathematical and statistical basis of similarity methods, particularly based on kernel methods. A kernel function converts information for a pair of subjects to a quantitative value representing either similarity (larger values meaning more similar) or distance (smaller values meaning more similar), with the requirement that it must create a positive semidefinite matrix when applied to all pairs of subjects. This review emphasizes the wide range of statistical methods and software that can be used when similarity is based on kernel methods, such as nonparametric regression, linear mixed models and generalized linear mixed models, hierarchical models, score statistics, and support vector machines. The mathematical rigor for these methods is summarized, as is the mathematical framework for making kernels. This review provides a framework to move from intuitive and heuristic approaches to define genomic similarities to more rigorous methods that can take advantage of powerful statistical modeling and existing software. A companion paper reviews novel approaches to creating kernels that might be useful for genomic analyses, providing insights with examples [1]. Copyright © 2010 S. Karger AG, Basel.
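The positive semidefiniteness requirement stated above is easy to verify numerically for a concrete kernel. A minimal sketch, assuming a Gaussian (RBF) similarity kernel on synthetic feature vectors (the data and bandwidth are illustrative, not from the review):

```python
import numpy as np

def rbf_kernel(X, gamma=0.5):
    """Gaussian (RBF) similarity kernel: K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.clip(d2, 0.0, None))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))  # 20 subjects, 5 genomic features
K = rbf_kernel(X)
# A valid kernel must yield a positive semidefinite matrix over all subject pairs
min_eig = np.linalg.eigvalsh(K).min()
print(min_eig >= -1e-10)  # -> True
```

Such a matrix can then be passed directly to kernel machines or used as a covariance structure in mixed models, as the review discusses.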
Dickie, David Alexander; Job, Dominic E.; Gonzalez, David Rodriguez; Shenkin, Susan D.; Wardlaw, Joanna M.
2015-01-01
Introduction Neurodegenerative disease diagnoses may be supported by the comparison of an individual patient’s brain magnetic resonance image (MRI) with a voxel-based atlas of normal brain MRI. Most current brain MRI atlases are of young to middle-aged adults and parametric, e.g., mean ±standard deviation (SD); these atlases require data to be Gaussian. Brain MRI data, e.g., grey matter (GM) proportion images, from normal older subjects are apparently not Gaussian. We created a nonparametric and a parametric atlas of the normal limits of GM proportions in older subjects and compared their classifications of GM proportions in Alzheimer’s disease (AD) patients. Methods Using publicly available brain MRI from 138 normal subjects and 138 subjects diagnosed with AD (all 55–90 years), we created: a mean ±SD atlas to estimate parametrically the percentile ranks and limits of normal ageing GM; and, separately, a nonparametric, rank order-based GM atlas from the same normal ageing subjects. GM images from AD patients were then classified with respect to each atlas to determine the effect statistical distributions had on classifications of proportions of GM in AD patients. Results The parametric atlas often defined the lower normal limit of the proportion of GM to be negative (which does not make sense physiologically as the lowest possible proportion is zero). Because of this, for approximately half of the AD subjects, 25–45% of voxels were classified as normal when compared to the parametric atlas; but were classified as abnormal when compared to the nonparametric atlas. These voxels were mainly concentrated in the frontal and occipital lobes. Discussion To our knowledge, we have presented the first nonparametric brain MRI atlas. In conditions where there is increasing variability in brain structure, such as in old age, nonparametric brain MRI atlases may represent the limits of normal brain structure more accurately than parametric approaches. 
Therefore, we conclude that the statistical method used for construction of brain MRI atlases should be selected taking into account the population and aim under study. Parametric methods are generally robust for defining central tendencies, e.g., means, of brain structure. Nonparametric methods are advisable when studying the limits of brain structure in ageing and neurodegenerative disease. PMID:26023913
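The core contrast between the parametric and rank-based atlases can be reproduced with a toy skewed sample. This is a hedged illustration, not the authors' atlas pipeline; the beta-distributed values merely stand in for non-Gaussian grey-matter proportions at one voxel:

```python
import numpy as np

rng = np.random.default_rng(42)
# Skewed stand-in for grey-matter proportion values at one voxel (support [0, 1])
gm = rng.beta(0.5, 5.0, size=1000)

# Parametric lower limit of "normal": mean - 1.96 SD (assumes Gaussian data)
parametric_lower = gm.mean() - 1.96 * gm.std(ddof=1)
# Nonparametric lower limit: the empirical 2.5th percentile (rank-based)
nonparametric_lower = np.percentile(gm, 2.5)

print(parametric_lower)     # negative: impossible for a proportion
print(nonparametric_lower)  # >= 0 by construction
```

This reproduces the paper's key observation: for skewed proportion data the Gaussian limit can be physiologically impossible, while the rank-based limit cannot.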
Wang, Yunpeng; Thompson, Wesley K.; Schork, Andrew J.; Holland, Dominic; Chen, Chi-Hua; Bettella, Francesco; Desikan, Rahul S.; Li, Wen; Witoelar, Aree; Zuber, Verena; Devor, Anna; Nöthen, Markus M.; Rietschel, Marcella; Chen, Qiang; Werge, Thomas; Cichon, Sven; Weinberger, Daniel R.; Djurovic, Srdjan; O’Donovan, Michael; Visscher, Peter M.; Andreassen, Ole A.; Dale, Anders M.
2016-01-01
Most of the genetic architecture of schizophrenia (SCZ) has not yet been identified. Here, we apply a novel statistical algorithm called Covariate-Modulated Mixture Modeling (CM3), which incorporates auxiliary information (heterozygosity, total linkage disequilibrium, genomic annotations, pleiotropy) for each single nucleotide polymorphism (SNP) to enable more accurate estimation of replication probabilities, conditional on the observed test statistic (“z-score”) of the SNP. We use a multiple logistic regression on z-scores to combine information from auxiliary information to derive a “relative enrichment score” for each SNP. For each stratum of these relative enrichment scores, we obtain nonparametric estimates of posterior expected test statistics and replication probabilities as a function of discovery z-scores, using a resampling-based approach that repeatedly and randomly partitions meta-analysis sub-studies into training and replication samples. We fit a scale mixture of two Gaussians model to each stratum, obtaining parameter estimates that minimize the sum of squared differences of the scale-mixture model with the stratified nonparametric estimates. We apply this approach to the recent genome-wide association study (GWAS) of SCZ (n = 82,315), obtaining a good fit between the model-based and observed effect sizes and replication probabilities. We observed that SNPs with low enrichment scores replicate with a lower probability than SNPs with high enrichment scores even when both they are genome-wide significant (p < 5x10-8). There were 693 and 219 independent loci with model-based replication rates ≥80% and ≥90%, respectively. Compared to analyses not incorporating relative enrichment scores, CM3 increased out-of-sample yield for SNPs that replicate at a given rate. This demonstrates that replication probabilities can be more accurately estimated using prior enrichment information with CM3. PMID:26808560
Publication Bias in Meta-Analysis: Confidence Intervals for Rosenthal's Fail-Safe Number.
Fragkos, Konstantinos C; Tsagris, Michail; Frangos, Christos C
2014-01-01
The purpose of the present paper is to assess the efficacy of confidence intervals for Rosenthal's fail-safe number. Although Rosenthal's estimator is widely used by researchers, its statistical properties are largely unexplored. First, we developed statistical theory which allowed us to produce confidence intervals for Rosenthal's fail-safe number. This was achieved by discerning whether the number of studies analysed in a meta-analysis is fixed or random; each case produces different variance estimators. For a given number of studies and a given distribution, we provided five variance estimators. Confidence intervals are examined with a normal approximation and a nonparametric bootstrap. The accuracy of the different confidence interval estimates was then tested by simulation under different distributional assumptions. The half-normal distribution variance estimator has the best probability coverage. Finally, we provide a table of lower confidence intervals for Rosenthal's estimator.
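The nonparametric bootstrap used above as a comparator can be sketched generically. The helper below is a hypothetical percentile-bootstrap routine applied to invented effect sizes; it is not the paper's fail-safe-number variance estimators:

```python
import numpy as np

def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary statistic."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    reps = np.array([stat(rng.choice(data, size=len(data), replace=True))
                     for _ in range(n_boot)])
    return np.percentile(reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# Illustrative study effect sizes from a small meta-analysis
effects = np.array([0.12, 0.35, 0.20, 0.41, 0.08, 0.27, 0.33, 0.15])
lo, hi = bootstrap_ci(effects, np.mean)
print(lo, hi)  # interval bracketing the sample mean effect
```

The same machinery applies to any estimator, including a fail-safe number, by swapping in a different `stat` function.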
The binned bispectrum estimator: template-based and non-parametric CMB non-Gaussianity searches
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bucher, Martin; Racine, Benjamin; Tent, Bartjan van, E-mail: bucher@apc.univ-paris7.fr, E-mail: benjar@uio.no, E-mail: vantent@th.u-psud.fr
2016-05-01
We describe the details of the binned bispectrum estimator as used for the official 2013 and 2015 analyses of the temperature and polarization CMB maps from the ESA Planck satellite. The defining aspect of this estimator is the determination of a map bispectrum (3-point correlation function) that has been binned in harmonic space. For a parametric determination of the non-Gaussianity in the map (the so-called f_NL parameters), one takes the inner product of this binned bispectrum with theoretically motivated templates. However, as a complementary approach one can also smooth the binned bispectrum using a variable smoothing scale in order to suppress noise and make coherent features stand out above the noise. This allows one to look in a model-independent way for any statistically significant bispectral signal. This approach is useful for characterizing the bispectral shape of the galactic foreground emission, for which a theoretical prediction of the bispectral anisotropy is lacking, and for detecting a serendipitous primordial signal, for which a theoretical template has not yet been put forth. Both the template-based and the non-parametric approaches are described in this paper.
NASA Astrophysics Data System (ADS)
Oesterle, Jonathan; Lionel, Amodeo
2018-06-01
The current competitive situation increases the importance of realistically estimating product costs during the early phases of product and assembly line planning projects. In this article, several multi-objective algorithms using different dominance rules are proposed to solve the problem of selecting the most effective combination of products and assembly lines. The list of developed algorithms includes variants of ant colony algorithms, evolutionary algorithms, and imperialist competitive algorithms. The performance of each algorithm and dominance rule is assessed using five multi-objective quality indicators on fifty problem instances. The algorithms and dominance rules are ranked using a non-parametric statistical test.
CASPASE-12 and rheumatoid arthritis in African-Americans
Marshall, Laura; Obaidullah, Mohammad; Fuchs, Trista; Fineberg, Naomi S.; Brinkley, Garland; Mikuls, Ted R.; Bridges, S. Louis; Hermel, Evan
2014-01-01
CASPASE-12 (CASP12) has a down-regulatory function during infection, and thus may protect against inflammatory disease. We investigated the distribution of CASP12 alleles (#rs497116) in African-Americans (AA) with rheumatoid arthritis (RA). CASP12 alleles were genotyped in 953 RA patients and 342 controls. Statistical analyses comparing genotype groups were performed using Kruskal-Wallis non-parametric ANOVA with Mann-Whitney U tests and chi-square tests. There was no significant difference in the overall distribution of CASP12 genotypes within AA with RA, but CASP12 homozygous patients had lower baseline joint narrowing scores. CASP12 homozygosity appears to be a subtle protective factor for some aspects of RA in AA patients. PMID:24515649
Goodness-Of-Fit Test for Nonparametric Regression Models: Smoothing Spline ANOVA Models as Example.
Teran Hidalgo, Sebastian J; Wu, Michael C; Engel, Stephanie M; Kosorok, Michael R
2018-06-01
Nonparametric regression models do not require the specification of the functional form between the outcome and the covariates. Despite their popularity, the number of available diagnostic statistics is small in comparison to their parametric counterparts. We propose a goodness-of-fit test for nonparametric regression models with a linear smoother form. In particular, we apply this testing framework to smoothing spline ANOVA models. The test can consider two sources of lack-of-fit: whether covariates that are not currently in the model need to be included, and whether the current model fits the data well. The proposed method derives estimated residuals from the model. Then, statistical dependence is assessed between the estimated residuals and the covariates using the Hilbert-Schmidt independence criterion (HSIC). If dependence exists, the model does not capture all the variability in the outcome associated with the covariates; otherwise the model fits the data well. The bootstrap is used to obtain p-values. Application of the method is demonstrated with a neonatal mental development data analysis. We demonstrate correct type I error as well as power performance through simulations.
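The residual-covariate dependence check via HSIC can be sketched with a permutation p-value. This is a minimal biased-HSIC implementation with a median-heuristic bandwidth, assumed for illustration; it is not the authors' smoothing spline ANOVA test, and the simulated "residuals" are invented:

```python
import numpy as np

def _center(K):
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def _rbf(x):
    x = np.asarray(x, float).reshape(-1, 1)
    d2 = (x - x.T) ** 2
    sigma2 = np.median(d2[d2 > 0])  # median-heuristic bandwidth
    return np.exp(-d2 / sigma2)

def hsic_pvalue(x, y, n_perm=500, seed=0):
    """Permutation p-value for the (biased) HSIC statistic between x and y."""
    rng = np.random.default_rng(seed)
    Kc = _center(_rbf(x))
    L = _rbf(y)
    n = len(x)
    def stat(Lm):
        return np.sum(Kc * _center(Lm)) / n**2
    obs = stat(L)
    exceed = sum(stat(L[np.ix_(p, p)]) >= obs
                 for p in (rng.permutation(n) for _ in range(n_perm)))
    return (1 + exceed) / (1 + n_perm)

rng = np.random.default_rng(1)
x = rng.normal(size=60)
resid_dep = x**2 + 0.1 * rng.normal(size=60)   # residuals that depend on x
resid_ind = rng.normal(size=60)                # residuals independent of x
p_dep = hsic_pvalue(x, resid_dep)
p_ind = hsic_pvalue(x, resid_ind)
print(p_dep, p_ind)  # small p signals lack of fit; large p suggests a good fit
```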
An entropy-based nonparametric test for the validation of surrogate endpoints.
Miao, Xiaopeng; Wang, Yong-Cheng; Gangopadhyay, Ashis
2012-06-30
We present a nonparametric test to validate surrogate endpoints based on a measure of divergence and random permutation. This test is a proposal to directly verify the Prentice statistical definition of surrogacy. The test does not impose distributional assumptions on the endpoints, and it is robust to model misspecification. Our simulation study shows that the proposed nonparametric test outperforms the practical test of the Prentice criterion in terms of both robustness of size and power. We also evaluate the performance of three leading methods that attempt to quantify the effect of surrogate endpoints. The proposed method is applied to validate magnetic resonance imaging lesions as the surrogate endpoint for clinical relapses in a multiple sclerosis trial. Copyright © 2012 John Wiley & Sons, Ltd.
Nonparametric estimation of plant density by the distance method
Patil, S.A.; Burnham, K.P.; Kovner, J.L.
1979-01-01
A relation between the plant density and the probability density function of the nearest neighbor distance (squared) from a random point is established under fairly broad conditions. Based upon this relationship, a nonparametric estimator for the plant density is developed and presented in terms of order statistics. Consistency and asymptotic normality of the estimator are discussed. An interval estimator for the density is obtained. The modifications of this estimator and its variance are given when the distribution is truncated. Simulation results are presented for regular, random and aggregated populations to illustrate the nonparametric estimator and its variance. A numerical example from field data is given. Merits and deficiencies of the estimator are discussed with regard to its robustness and variance.
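In the same spirit, a simple distance-based density estimator can be sketched. This is a classical Pollard-type estimator under complete spatial randomness, stated here as an assumption; it is not the order-statistic estimator of the paper:

```python
import numpy as np

def density_from_distances(r):
    """Estimate plant density from point-to-nearest-plant distances r, assuming
    complete spatial randomness: under a homogeneous Poisson process with
    intensity lam, pi * r^2 is Exponential with mean 1/lam."""
    r = np.asarray(r, float)
    n = len(r)
    return (n - 1) / (np.pi * np.sum(r**2))  # (n-1) makes it unbiased under CSR

# Simulate distances consistent with a true density of 2 plants per unit area
rng = np.random.default_rng(3)
lam = 2.0
r = np.sqrt(rng.exponential(1.0 / (np.pi * lam), size=5000))
est = density_from_distances(r)
print(est)  # close to 2.0
```

Under aggregation or regularity this CSR assumption fails, which is precisely the robustness issue the abstract raises.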
ERIC Educational Resources Information Center
Zheng, Yinggan; Gierl, Mark J.; Cui, Ying
2010-01-01
This study combined the kernel smoothing procedure and a nonparametric differential item functioning statistic--Cochran's Z--to statistically test the difference between the kernel-smoothed item response functions for reference and focal groups. Simulation studies were conducted to investigate the Type I error and power of the proposed…
Liu, Yuewei; Chen, Weihong
2012-02-01
As a nonparametric method, the Kruskal-Wallis test is widely used to compare three or more independent groups when an ordinal or interval level of data is available, especially when the assumptions of analysis of variance (ANOVA) are not met. If the Kruskal-Wallis statistic is statistically significant, the Nemenyi test is an alternative method for further pairwise multiple comparisons to locate the source of significance. Unfortunately, most popular statistical packages do not integrate the Nemenyi test, which is not easy to calculate by hand. We describe the theory and applications of the Kruskal-Wallis and Nemenyi tests, and present a flexible SAS macro to implement the two tests. The SAS macro is demonstrated by two examples from our cohort study in occupational epidemiology. It provides a useful tool for SAS users to test the differences among three or more independent groups using a nonparametric method.
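Although the paper provides a SAS macro, the same two-step procedure can be sketched in Python: a Kruskal-Wallis omnibus test followed by Nemenyi pairwise comparisons using the chi-square approximation. The data below are illustrative:

```python
import numpy as np
from scipy import stats

def nemenyi_pairwise(groups, alpha=0.05):
    """Nemenyi post-hoc comparisons after a significant Kruskal-Wallis test,
    using the chi-square approximation: groups i, j differ if
    |Rbar_i - Rbar_j| > sqrt(chi2_{alpha,k-1} * N(N+1)/12 * (1/n_i + 1/n_j))."""
    k = len(groups)
    sizes = np.array([len(g) for g in groups])
    N = sizes.sum()
    ranks = stats.rankdata(np.concatenate(groups))
    means, start = [], 0
    for n in sizes:
        means.append(ranks[start:start + n].mean())
        start += n
    crit = stats.chi2.ppf(1 - alpha, k - 1)
    out = {}
    for i in range(k):
        for j in range(i + 1, k):
            thresh = np.sqrt(crit * N * (N + 1) / 12.0
                             * (1.0 / sizes[i] + 1.0 / sizes[j]))
            out[(i, j)] = abs(means[i] - means[j]) > thresh
    return out

a = [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8]
b = [1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1]
c = [4.1, 4.5, 4.2, 4.8, 4.4, 4.6, 4.3, 4.7]  # clearly shifted group
H, p = stats.kruskal(a, b, c)
res = nemenyi_pairwise([a, b, c])
print(p < 0.05)  # -> True
print(res)       # c differs from a and from b; a vs b does not
```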
Progressive statistics for studies in sports medicine and exercise science.
Hopkins, William G; Marshall, Stephen W; Batterham, Alan M; Hanin, Juri
2009-01-01
Statistical guidelines and expert statements are now available to assist in the analysis and reporting of studies in some biomedical disciplines. We present here a more progressive resource for sample-based studies, meta-analyses, and case studies in sports medicine and exercise science. We offer forthright advice on the following controversial or novel issues: using precision of estimation for inferences about population effects in preference to null-hypothesis testing, which is inadequate for assessing clinical or practical importance; justifying sample size via acceptable precision or confidence for clinical decisions rather than via adequate power for statistical significance; showing SD rather than SEM, to better communicate the magnitude of differences in means and nonuniformity of error; avoiding purely nonparametric analyses, which cannot provide inferences about magnitude and are unnecessary; using regression statistics in validity studies, in preference to the impractical and biased limits of agreement; making greater use of qualitative methods to enrich sample-based quantitative projects; and seeking ethics approval for public access to the depersonalized raw data of a study, to address the need for more scrutiny of research and better meta-analyses. Advice on less contentious issues includes the following: using covariates in linear models to adjust for confounders, to account for individual differences, and to identify potential mechanisms of an effect; using log transformation to deal with nonuniformity of effects and error; identifying and deleting outliers; presenting descriptive, effect, and inferential statistics in appropriate formats; and contending with bias arising from problems with sampling, assignment, blinding, measurement error, and researchers' prejudices. This article should advance the field by stimulating debate, promoting innovative approaches, and serving as a useful checklist for authors, reviewers, and editors.
Dissecting effects of complex mixtures: who's afraid of informative priors?
Thomas, Duncan C; Witte, John S; Greenland, Sander
2007-03-01
Epidemiologic studies commonly investigate multiple correlated exposures, which are difficult to analyze appropriately. Hierarchical modeling provides a promising approach for analyzing such data by adding a higher-level structure or prior model for the exposure effects. This prior model can incorporate additional information on similarities among the correlated exposures and can be parametric, semiparametric, or nonparametric. We discuss the implications of applying these models and argue for their expanded use in epidemiology. While a prior model adds assumptions to the conventional (first-stage) model, all statistical methods (including conventional methods) make strong intrinsic assumptions about the processes that generated the data. One should thus balance prior modeling assumptions against assumptions of validity, and use sensitivity analyses to understand their implications. In doing so - and by directly incorporating into our analyses information from other studies or allied fields - we can improve our ability to distinguish true causes of disease from noise and bias.
Gundogdu, Ahmet Gokhan; Onder, Sevgen; Firat, Pinar; Dogan, Riza
2014-06-01
The impacts of epidermal growth factor receptor (EGFR) immunoexpression and RAS immunoexpression on the survival and prognosis of lung adenocarcinoma patients are debated in the literature. Twenty-six patients who underwent pulmonary resections between 2002 and 2007 in our clinic, and whose pathologic examinations yielded adenocarcinoma, were included in the study. EGFR and RAS expression levels were examined by immunohistochemical methods. The results were compared with survival, stage of the disease, nodal involvement, lymphovascular invasion, and pleural invasion. Nonparametric bivariate analyses were used for statistical analyses. A significant association between EGFR immunoexpression and survival was identified, whereas no association was found between RAS immunoexpression and survival. Neither EGFR nor RAS displayed a significant link with the stage of the disease, nodal involvement, lymphovascular invasion, or pleural invasion. Positive EGFR immunoexpression affects survival negatively, while RAS immunoexpression has no effect on survival in lung adenocarcinoma patients.
NASA Astrophysics Data System (ADS)
Liao, Meng; To, Quy-Dong; Léonard, Céline; Monchiet, Vincent
2018-03-01
In this paper, we use the molecular dynamics simulation method to study gas-wall boundary conditions. Discrete scattering information of gas molecules at the wall surface is obtained from collision simulations. The collision data can be used to identify the accommodation coefficients for parametric wall models such as the Maxwell and Cercignani-Lampis scattering kernels. Since these scattering kernels are based on a limited number of accommodation coefficients, we adopt non-parametric statistical methods to construct the kernel and overcome this limitation. Different from parametric kernels, the non-parametric kernels require no parameters (i.e., accommodation coefficients) and no predefined distribution. We also propose approaches to derive directly the Navier friction and Kapitza thermal resistance coefficients, as well as other interface coefficients associated with moment equations, from the non-parametric kernels. The methods are applied successfully to systems composed of CH4 or CO2 and graphite, which are of interest to the petroleum industry.
Local kernel nonparametric discriminant analysis for adaptive extraction of complex structures
NASA Astrophysics Data System (ADS)
Li, Quanbao; Wei, Fajie; Zhou, Shenghan
2017-05-01
Linear discriminant analysis (LDA) is one of the most popular methods for linear feature extraction. It usually performs well when the global data structure is consistent with the local data structure. Other frequently used approaches to feature extraction usually require linearity, independence, or large-sample conditions. However, in real-world applications, these assumptions are not always satisfied or cannot be tested. In this paper, we introduce an adaptive method, local kernel nonparametric discriminant analysis (LKNDA), which integrates conventional discriminant analysis with nonparametric statistics. LKNDA is adept at identifying both complex nonlinear structures and ad hoc rules. Six simulation cases demonstrate that LKNDA has the advantages of both parametric and nonparametric algorithms and higher classification accuracy. A quartic unilateral kernel function may provide better robustness of prediction than other functions. LKNDA gives an alternative solution for discriminant cases of complex nonlinear feature extraction or unknown feature extraction. Finally, the application of LKNDA to the complex feature extraction of financial market activities is proposed.
Headache in acute ischaemic stroke: a lesion mapping study.
Seifert, Christian L; Schönbach, Etienne M; Magon, Stefano; Gross, Elena; Zimmer, Claus; Förschler, Anette; Tölle, Thomas R; Mühlau, Mark; Sprenger, Till; Poppert, Holger
2016-01-01
Headache is a common symptom in acute ischaemic stroke, but the underlying mechanisms are incompletely understood. The aim of this lesion mapping study was to identify brain regions which are related to the development of headache in acute ischaemic stroke. Patients with acute ischaemic stroke (n = 100) were assessed by brain MRI at 3 T including diffusion weighted imaging. We included 50 patients with stroke and headache as well as 50 patients with stroke but no headache symptoms. Infarcts were manually outlined and images were transformed into standard stereotaxic space using non-linear warping. Voxel-wise overlap and subtraction analyses of lesions as well as non-parametric statistics were conducted. The same analyses were carried out after flipping of left-sided lesions, so that all strokes were transformed to the same hemisphere. There was no difference between the headache and non-headache groups in infarct volumes, in the distribution of affected vascular beds, or in the clinical severity of strokes. The headache phenotype was tension-type like in most cases. Subtraction analysis revealed that in headache sufferers infarctions were more often distributed in two well-known areas of the central pain matrix: the insula and the somatosensory cortex. This result was confirmed in the flipped analysis and by non-parametric statistical testing (whole brain corrected P-value < 0.01). To the best of our knowledge, this is the first lesion mapping study investigating potential lesional patterns associated with headache in acute ischaemic stroke. Insular strokes turned out to be strongly associated with headache. As the insular cortex is a well-established region in pain processing, our results suggest that, at least in a subgroup of patients, acute stroke-related headache might be centrally driven. © The Author (2015). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
ERIC Educational Resources Information Center
Bellera, Carine A.; Julien, Marilyse; Hanley, James A.
2010-01-01
The Wilcoxon statistics are usually taught as nonparametric alternatives for the 1- and 2-sample Student-"t" statistics in situations where the data appear to arise from non-normal distributions, or where sample sizes are so small that we cannot check whether they do. In the past, critical values, based on exact tail areas, were…
Statistical Package User’s Guide.
1980-08-01
C. STACH — Nonparametric Descriptive Statistics. D. CHIRA — Coefficient of Concordance. Test Data: the programs were tested using data from John Neter and William Wasserman, Applied Linear Statistical Models: Regression. Comments: ranked data are used for program CHIRA.
Can Percentiles Replace Raw Scores in the Statistical Analysis of Test Data?
ERIC Educational Resources Information Center
Zimmerman, Donald W.; Zumbo, Bruno D.
2005-01-01
Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…
Modelling the Effects of Land-Use Changes on Climate: a Case Study on Yamula DAM
NASA Astrophysics Data System (ADS)
Köylü, Ü.; Geymen, A.
2016-10-01
Dams block the flow of rivers and create artificial water reservoirs, which affect the climate and land use characteristics of the river basin. In this research, the effect of the large water body created by the Yamula Dam in the Kızılırmak Basin is analysed with respect to the surrounding land use and climate change. The Mann-Kendall non-parametric statistical test, the Theil-Sen slope method, Inverse Distance Weighting (IDW), and Soil Conservation Service-Curve Number (SCS-CN) methods are integrated for spatial and temporal analysis of the research area. For this research, humidity, temperature, wind speed, and precipitation observations collected at 16 weather stations near the Kızılırmak Basin are analysed. This statistical information is then combined with GIS data over the years. An application was developed for GIS analysis in the Python programming language and integrated with ArcGIS software; the statistical analyses were calculated in the R Project for Statistical Computing and integrated with the developed application. According to the statistical analysis of the extracted time series of meteorological parameters, statistically significant spatiotemporal trends are observed for climate change and land use characteristics. In this study, we demonstrate the effect of large dams on the local climate of the semi-arid Yamula Dam region.
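The trend-analysis step (Mann-Kendall plus Theil-Sen) can be sketched on a synthetic climate series. The Mann-Kendall test is applied here as Kendall's tau against time, which is an equivalent formulation; the series and trend values are invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
years = np.arange(1990, 2016)
# Synthetic annual mean temperature with a 0.03 deg/yr trend plus noise
temp = 12.0 + 0.03 * (years - years[0]) + rng.normal(0.0, 0.15, len(years))

# Mann-Kendall trend test, expressed as Kendall's tau of the series vs. time
tau, p_value = stats.kendalltau(years, temp)
# Theil-Sen slope: median of pairwise slopes, robust to outliers
slope, intercept, lo_slope, hi_slope = stats.theilslopes(temp, years)

print(tau > 0 and p_value < 0.05)  # -> True: significant upward trend
print(slope)                        # near the true 0.03 deg/yr
```

The same two calls, applied station by station, give the kind of spatiotemporal trend maps the abstract describes once the results are joined back to GIS layers.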
A nonparametric smoothing method for assessing GEE models with longitudinal binary data.
Lin, Kuo-Chin; Chen, Yi-Ju; Shyr, Yu
2008-09-30
Studies involving longitudinal binary responses are widely applied in health and biomedical sciences research and are frequently analyzed by the generalized estimating equations (GEE) method. This article proposes an alternative goodness-of-fit test based on the nonparametric smoothing approach for assessing the adequacy of GEE fitted models, which can be regarded as an extension of the goodness-of-fit test of le Cessie and van Houwelingen (Biometrics 1991; 47:1267-1282). The expectation and approximate variance of the proposed test statistic are derived. The asymptotic distribution of the proposed test statistic in terms of a scaled chi-squared distribution and the power performance of the proposed test are discussed via simulation studies. The testing procedure is demonstrated on two real data sets. Copyright (c) 2008 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Cernesson, Flavie; Tournoud, Marie-George; Lalande, Nathalie
2018-06-01
Among the various parameters monitored in river monitoring networks, bioindicators provide very informative data. Analysing time variations in bioindicator data is tricky for water managers because the data sets are often short, irregular, and non-normally distributed. It is thus a challenging methodological issue for scientists, as it is in the Saône basin (30 000 km2, France), where, between 1998 and 2010, only 71 of the 812 IBGN (French macroinvertebrate bioindicator) monitoring station time series had more than 10 data values and were studied here. Combining various analytical tools (three parametric and non-parametric statistical tests plus a graphical analysis), 45 IBGN time series were classified as stationary and 26 as non-stationary (only one of which showed degradation). Series from sampling stations located within the same hydroecoregion showed similar trends, while river size classes seemed non-significant in explaining temporal trends. So, from a methodological point of view, combining statistical tests and graphical analysis is a relevant option when striving to improve trend detection. Moreover, it was possible to propose a way to summarise series in order to analyse links between ecological river quality indicators and land use stressors.
Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science
ERIC Educational Resources Information Center
Ju, Boryung; Jin, Tao
2013-01-01
Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…
Nonparametric Bayesian predictive distributions for future order statistics
Richard A. Johnson; James W. Evans; David W. Green
1999-01-01
We derive the predictive distribution for a specified order statistic, determined from a future random sample, under a Dirichlet process prior. Two variants of the approach are treated and some limiting cases studied. A practical application to monitoring the strength of lumber is discussed including choices of prior expectation and comparisons made to a Bayesian...
Spectral analysis method for detecting an element
Blackwood, Larry G [Idaho Falls, ID; Edwards, Andrew J [Idaho Falls, ID; Jewell, James K [Idaho Falls, ID; Reber, Edward L [Idaho Falls, ID; Seabury, Edward H [Idaho Falls, ID
2008-02-12
A method for detecting an element is described and which includes the steps of providing a gamma-ray spectrum which has a region of interest which corresponds with a small amount of an element to be detected; providing nonparametric assumptions about a shape of the gamma-ray spectrum in the region of interest, and which would indicate the presence of the element to be detected; and applying a statistical test to the shape of the gamma-ray spectrum based upon the nonparametric assumptions to detect the small amount of the element to be detected.
NASA Astrophysics Data System (ADS)
Karpenko, S. S.; Zybin, E. Yu; Kosyanchuk, V. V.
2018-02-01
In this paper we design a nonparametric method for failure detection and localization in the aircraft control system that uses measurements of the control signals and the aircraft states only. It does not require a priori information about the aircraft model parameters, training, or statistical calculations, and is based on algebraic solvability conditions for the aircraft model identification problem. This makes it possible to significantly increase the efficiency of the detection and localization solution by completely eliminating errors associated with aircraft model uncertainties.
A semi-nonparametric Poisson regression model for analyzing motor vehicle crash data.
Ye, Xin; Wang, Ke; Zou, Yajie; Lord, Dominique
2018-01-01
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highway is a topic of interest in the area of transportation safety due to higher driving speeds and the resultant severity level. Unlike the traditional Negative Binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate an unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments are conducted to demonstrate that the SNP distribution can well mimic a large family of distributions, including normal distributions, log-gamma distributions, bimodal and trimodal distributions. Empirical estimation results show that such flexibility offered by the SNP distribution can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When estimated coefficients in empirical models are compared, SNP and NB models are found to have a substantially different coefficient for the dummy variable indicating the lane width. The SNP model with better statistical performance suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
Comparing nonparametric Bayesian tree priors for clonal reconstruction of tumors.
Deshwar, Amit G; Vembu, Shankar; Morris, Quaid
2015-01-01
Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick-breaking (TSSB) prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor…
Rank-based permutation approaches for non-parametric factorial designs.
Umlauft, Maria; Konietschke, Frank; Pauly, Markus
2017-11-01
Inference methods for null hypotheses formulated in terms of distribution functions in general non-parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set-up Wald-type statistics and ANOVA-type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample size, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal-Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies foster these theoretical findings. A real data set exemplifies its applicability. © 2017 The British Psychological Society.
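The permutation principle described above can be sketched in a few lines. The following is a minimal illustrative implementation (function name and details are my own, not the authors' code): a Kruskal-Wallis-type rank statistic is computed on the observed group labels and then recomputed under random relabelings to obtain a permutation p-value.

```python
import random

def rank_permutation_test(groups, n_perm=2000, seed=0):
    """Permutation p-value for a Kruskal-Wallis-type rank statistic (sketch).

    groups: list of lists of observations; group labels are permuted uniformly.
    """
    rng = random.Random(seed)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    n = len(pooled)
    grand = (n + 1) / 2.0

    def rank_stat(values):
        # assign mid-ranks, then sum n_g * (mean group rank - grand mean rank)^2
        order = sorted(range(n), key=lambda i: values[i])
        ranks = [0.0] * n
        i = 0
        while i < n:
            j = i
            while j + 1 < n and values[order[j + 1]] == values[order[i]]:
                j += 1
            mid = (i + j) / 2.0 + 1.0
            for k in range(i, j + 1):
                ranks[order[k]] = mid
            i = j + 1
        stat, start = 0.0, 0
        for m in sizes:
            mean_r = sum(ranks[start:start + m]) / m
            stat += m * (mean_r - grand) ** 2
            start += m
        return stat

    observed = rank_stat(pooled)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # permuting values == permuting group labels
        if rank_stat(pooled) >= observed - 1e-12:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

For two well-separated groups the p-value is small; for interleaved groups it is large. The full method in the paper additionally handles general factorial designs and studentized statistics, which this sketch omits.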
Practical statistics in pain research.
Kim, Tae Kyun
2017-10-01
Pain is subjective, while statistics related to pain research are objective. This review was written to help researchers involved in pain research make statistical decisions. The main issues are related to the levels of scales that are often used in pain research, the choice between parametric and nonparametric statistical methods, and problems which arise from repeated measurements. In the field of pain research, parametric statistics have often been applied in an erroneous way. This is closely related to the scales of data and to repeated measurements. The levels of scales include nominal, ordinal, interval, and ratio scales. The level of scale affects the choice between parametric and non-parametric methods. In the field of pain research, the most frequently used pain assessment scale is the ordinal scale, which would include the visual analogue scale (VAS). There is another view, however, which considers the VAS to be an interval or ratio scale, so that the use of parametric statistics would be accepted practically in some cases. Repeated measurements of the same subjects always complicate statistics: the measurements are inevitably correlated with each other, which precludes the application of one-way ANOVA, in which independence between the measurements is necessary. Repeated-measures ANOVA (RM-ANOVA), however, permits comparison between correlated measurements as long as the sphericity assumption is satisfied. In conclusion, parametric statistical methods should be used only when the assumptions of parametric statistics, such as normality and sphericity, are established.
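As a concrete illustration of the nonparametric route for repeated ordinal measurements discussed in this review, the Friedman test (a rank-based alternative to repeated-measures ANOVA) can be applied to within-subject VAS scores. The data below are hypothetical:

```python
from scipy.stats import friedmanchisquare

# Hypothetical VAS pain scores (0-10) for 8 patients at three time points;
# each patient contributes one score per time point, so measurements are
# correlated within subjects and a rank-based within-subject test applies.
baseline = [7, 8, 6, 9, 7, 8, 7, 9]
week_1 = [5, 6, 5, 7, 6, 6, 5, 7]
week_4 = [3, 4, 2, 5, 4, 3, 3, 4]

# Friedman ranks the three measurements within each patient and tests
# whether the mean ranks differ across time points.
stat, p = friedmanchisquare(baseline, week_1, week_4)
```

Because every hypothetical patient improves monotonically, the test rejects the null of no change; no normality or sphericity assumption is needed.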
Nonparametric tests for equality of psychometric functions.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2017-12-07
Many empirical studies measure psychometric functions (curves describing how observers' performance varies with stimulus magnitude) because these functions capture the effects of experimental conditions. To assess these effects, parametric curves are often fitted to the data and comparisons are carried out by testing for equality of mean parameter estimates across conditions. This approach is parametric and, thus, vulnerable to violations of the implied assumptions. Furthermore, testing for equality of means of parameters may be misleading: Psychometric functions may vary meaningfully across conditions on an observer-by-observer basis with no effect on the mean values of the estimated parameters. Alternative approaches to assess equality of psychometric functions per se are thus needed. This paper compares three nonparametric tests that are applicable in all situations of interest: The existing generalized Mantel-Haenszel test, a generalization of the Berry-Mielke test that was developed here, and a split variant of the generalized Mantel-Haenszel test also developed here. Their statistical properties (accuracy and power) are studied via simulation and the results show that all tests are indistinguishable as to accuracy but they differ non-uniformly as to power. Empirical use of the tests is illustrated via analyses of published data sets and practical recommendations are given. The computer code in MATLAB and R to conduct these tests is available as Electronic Supplemental Material.
Nikita, Efthymia
2014-03-01
The current article explores whether the application of generalized linear models (GLM) and generalized estimating equations (GEE) can be used in place of conventional statistical analyses in the study of ordinal data that code an underlying continuous variable, like entheseal changes. The analysis of artificial data and ordinal data expressing entheseal changes in archaeological North African populations gave the following results. Parametric and nonparametric tests give convergent results particularly for P values <0.1, irrespective of whether the underlying variable is normally distributed or not under the condition that the samples involved in the tests exhibit approximately equal sizes. If this prerequisite is valid and provided that the samples are of equal variances, analysis of covariance may be adopted. GLM are not subject to constraints and give results that converge to those obtained from all nonparametric tests. Therefore, they can be used instead of traditional tests as they give the same amount of information as them, but with the advantage of allowing the study of the simultaneous impact of multiple predictors and their interactions and the modeling of the experimental data. However, GLM should be replaced by GEE for the study of bilateral asymmetry and in general when paired samples are tested, because GEE are appropriate for correlated data. Copyright © 2013 Wiley Periodicals, Inc.
Randomization Procedures Applied to Analysis of Ballistic Data
1991-06-01
Malcolm S. Taylor; Barry A. Bodt. Technical Report BRL-TR-3245 (AD-A238 389), U.S. Army Ballistic Research Laboratory. Keywords: data analysis; computationally intensive statistics; randomization tests; permutation tests; nonparametric statistics. From the report: "Any reasonable statistical procedure would fail to support the notion of improvement of dynamic over standard indexing based on this data."
Key statistical and analytical issues for evaluating treatment effects in periodontal research.
Tu, Yu-Kang; Gilthorpe, Mark S
2012-06-01
Statistics is an indispensible tool for evaluating treatment effects in clinical research. Due to the complexities of periodontal disease progression and data collection, statistical analyses for periodontal research have been a great challenge for both clinicians and statisticians. The aim of this article is to provide an overview of several basic, but important, statistical issues related to the evaluation of treatment effects and to clarify some common statistical misconceptions. Some of these issues are general, concerning many disciplines, and some are unique to periodontal research. We first discuss several statistical concepts that have sometimes been overlooked or misunderstood by periodontal researchers. For instance, decisions about whether to use the t-test or analysis of covariance, or whether to use parametric tests such as the t-test or its non-parametric counterpart, the Mann-Whitney U-test, have perplexed many periodontal researchers. We also describe more advanced methodological issues that have sometimes been overlooked by researchers. For instance, the phenomenon of regression to the mean is a fundamental issue to be considered when evaluating treatment effects, and collinearity amongst covariates is a conundrum that must be resolved when explaining and predicting treatment effects. Quick and easy solutions to these methodological and analytical issues are not always available in the literature, and careful statistical thinking is paramount when conducting useful and meaningful research. © 2012 John Wiley & Sons A/S.
Review of Statistical Methods for Analysing Healthcare Resources and Costs
Mihaylova, Borislava; Briggs, Andrew; O'Hagan, Anthony; Thompson, Simon G
2011-01-01
We review statistical methods for analysing healthcare resource use and costs, their ability to address skewness, excess zeros, multimodality and heavy right tails, and their ease for general use. We aim to provide guidance on analysing resource use and costs focusing on randomised trials, although methods often have wider applicability. Twelve broad categories of methods were identified: (I) methods based on the normal distribution, (II) methods following transformation of data, (III) single-distribution generalized linear models (GLMs), (IV) parametric models based on skewed distributions outside the GLM family, (V) models based on mixtures of parametric distributions, (VI) two (or multi)-part and Tobit models, (VII) survival methods, (VIII) non-parametric methods, (IX) methods based on truncation or trimming of data, (X) data components models, (XI) methods based on averaging across models, and (XII) Markov chain methods. Based on this review, our recommendations are that, first, simple methods are preferred in large samples where the near-normality of sample means is assured. Second, in somewhat smaller samples, relatively simple methods, able to deal with one or two of above data characteristics, may be preferable but checking sensitivity to assumptions is necessary. Finally, some more complex methods hold promise, but are relatively untried; their implementation requires substantial expertise and they are not currently recommended for wider applied work. Copyright © 2010 John Wiley & Sons, Ltd. PMID:20799344
Effect of censoring trace-level water-quality data on trend-detection capability
Gilliom, R.J.; Hirsch, R.M.; Gilroy, E.J.
1984-01-01
Monte Carlo experiments were used to evaluate whether trace-level water-quality data that are routinely censored (not reported) contain valuable information for trend detection. Measurements are commonly censored if they fall below a level associated with some minimum acceptable level of reliability (detection limit). Trace-level organic data were simulated with best- and worst-case estimates of measurement uncertainty, various concentrations and degrees of linear trend, and different censoring rules. The resulting classes of data were subjected to a nonparametric statistical test for trend. For all classes of data evaluated, trends were most effectively detected in uncensored data as compared to censored data even when the data censored were highly unreliable. Thus, censoring data at any concentration level may eliminate valuable information. Whether or not valuable information for trend analysis is, in fact, eliminated by censoring of actual rather than simulated data depends on whether the analytical process is in statistical control and bias is predictable for a particular type of chemical analyses.
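The abstract does not name the nonparametric trend test used; the Mann-Kendall test is the standard choice for water-quality trend detection, and a minimal sketch (normal approximation, no tie correction) looks like this:

```python
import math

def mann_kendall(y):
    """Mann-Kendall trend test, two-sided normal approximation (sketch).

    Counts concordant minus discordant pairs over time order; no tie
    correction is applied, so it is illustrative rather than production-grade.
    """
    n = len(y)
    # S statistic: +1 for each later value above an earlier one, -1 below
    s = sum((y[j] > y[i]) - (y[j] < y[i])
            for i in range(n) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)   # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # two-sided p-value from the standard normal
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return s, z, p
```

On a censored series, values reported only as "below detection limit" force ties or substitutions into the pairwise comparisons, which is one mechanism by which censoring can degrade the trend-detection power studied here.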
Stark, J.R.; Busch, J.P.; Deters, M.H.
1991-01-01
The Kruskal-Wallis test, a nonparametric statistical technique, indicated that for 12 of the 21 constituents sampled in the unconfined-drift aquifer, the relation between the concentration of these constituents and land-use type was statistically significant.
Antweiler, Ronald C.; Taylor, Howard E.
2008-01-01
The main classes of statistical treatment of below-detection limit (left-censored) environmental data for the determination of basic statistics that have been used in the literature are substitution methods, maximum likelihood, regression on order statistics (ROS), and nonparametric techniques. These treatments, along with using all instrument-generated data (even those below detection), were evaluated by examining data sets in which the true values of the censored data were known. It was found that for data sets with less than 70% censored data, the best technique overall for determination of summary statistics was the nonparametric Kaplan-Meier technique. ROS and the two substitution methods of assigning one-half the detection limit value to censored data or assigning a random number between zero and the detection limit to censored data were adequate alternatives. The use of these two substitution methods, however, requires a thorough understanding of how the laboratory censored the data. The technique of employing all instrument-generated data - including numbers below the detection limit - was found to be less adequate than the above techniques. At high degrees of censoring (greater than 70% censored data), no technique provided good estimates of summary statistics. Maximum likelihood techniques were found to be far inferior to all other treatments except substituting zero or the detection limit value to censored data.
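The two substitution methods evaluated above are simple enough to sketch directly. The function below (name and interface are my own) replaces left-censored values either by one-half the detection limit or by a uniform random draw between zero and the detection limit, then computes the summary mean:

```python
import random
import statistics

def censored_mean(values, detection_limit, method="half", seed=0):
    """Mean estimate after left-censoring at a detection limit (sketch).

    Values below the limit are replaced by DL/2 ('half') or by a uniform
    draw on [0, DL] ('random') -- the two substitution methods the study
    evaluates. Real laboratory data require knowing how the lab censored.
    """
    rng = random.Random(seed)
    out = []
    for v in values:
        if v < detection_limit:
            out.append(detection_limit / 2.0 if method == "half"
                       else rng.uniform(0.0, detection_limit))
        else:
            out.append(v)
    return statistics.mean(out)
```

When the true below-limit values are roughly uniform, DL/2 substitution is nearly unbiased for the mean, which is consistent with its adequacy (below ~70% censoring) reported above; the Kaplan-Meier approach the authors recommend avoids the distributional guess entirely.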
Applications of non-parametric statistics and analysis of variance on sample variances
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Nonparametric methods that are available for NASA-type applications are discussed. An attempt is made here to survey what can be used, to offer recommendations as to when each would be applicable, and to compare the methods, when possible, with the usual normal-theory procedures that are available for the Gaussian analog. It is important here to point out the hypotheses that are being tested, the assumptions that are being made, and the limitations of the nonparametric procedures. The appropriateness of performing analysis of variance on sample variances is also discussed and studied. This procedure is followed in several NASA simulation projects, and on the surface it would appear to be a reasonably sound procedure. However, the difficulties involved center around the normality problem and the basic homogeneous-variance assumption that is made in the usual analysis of variance problems. These difficulties are discussed and guidelines are given for using the methods.
Empirically Estimable Classification Bounds Based on a Nonparametric Divergence Measure
Berisha, Visar; Wisler, Alan; Hero, Alfred O.; Spanias, Andreas
2015-01-01
Information divergence functions play a critical role in statistics and information theory. In this paper we show that a non-parametric f-divergence measure can be used to provide improved bounds on the minimum binary classification probability of error for the case when the training and test data are drawn from the same distribution and for the case where there exists some mismatch between training and test distributions. We confirm the theoretical results by designing feature selection algorithms using the criteria from these bounds and by evaluating the algorithms on a series of pathological speech classification tasks. PMID:26807014
Onder, Sevgen; Firat, Pinar; Dogan, Riza
2014-01-01
Background The impacts of epidermal growth factor receptor (EGFR) immunoexpression and RAS immunoexpression on the survival and prognosis of lung adenocarcinoma patients are debated in the literature. Methods Twenty-six patients, who underwent pulmonary resections between 2002 and 2007 in our clinic, and whose pathologic examinations yielded adenocarcinoma, were included in the study. EGFR and RAS expression levels were examined by immunohistochemical methods. The results were compared with the survival, stage of the disease, nodal involvement, lymphovascular invasion, and pleural invasion. Nonparametric bivariate analyses were used for statistical analyses. Results A significant link between EGFR immunoexpression and survival has been identified while RAS immunoexpression and survival have been proven to be irrelevant. Neither EGFR, nor RAS has displayed a significant link with the stage of the disease, nodal involvement, lymphovascular invasion, or pleural invasion. Conclusions Positive EGFR immunoexpression affects survival negatively, while RAS immunoexpression has no effect on survival in lung adenocarcinoma patients. PMID:24977003
Weber, H M; Rücker, S; Büttner, P; Petermann, F; Daseking, M
2015-10-01
General cognitive abilities are still considered as the most important predictor of school achievement and success. Whether the high correlation (r=0.50) can be explained by other variables has not yet been studied. Learning behavior can be discussed as one factor that influences the relationship between general cognitive abilities and school achievement. This study examined the relationship between intelligence, school achievement and learning behavior. Mediator analyses were conducted to check whether learning behavior would mediate the relationship between general cognitive abilities and school grades in mathematics and German. Statistical analyses confirmed that the relationship between general cognitive abilities and school achievement was fully mediated by learning behavior for German, whereas intelligence seemed to be the only predictor for achievement in mathematics. These results could be confirmed by non-parametric bootstrapping procedures. Results indicate that special training of learning behavior may have a positive impact on school success, even for children and adolescents with low IQ. © Georg Thieme Verlag KG Stuttgart · New York.
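The abstract reports that the mediation result was confirmed by nonparametric bootstrapping. A generic percentile-bootstrap for the indirect effect a·b in a simple mediation model (x → m → y) might look like the sketch below; this is a standard construction, not the authors' exact procedure, and all names are illustrative:

```python
import random

def _slope(x, y):
    # simple OLS slope of y on x
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def _coef_of_m(x, m, y):
    # coefficient of m in the OLS of y on (m, x), via the 2x2 normal equations
    n = len(x)
    mm, mx, my = sum(m) / n, sum(x) / n, sum(y) / n
    smm = sum((a - mm) ** 2 for a in m)
    sxx = sum((a - mx) ** 2 for a in x)
    smx = sum((a - mm) * (b - mx) for a, b in zip(m, x))
    smy = sum((a - mm) * (b - my) for a, b in zip(m, y))
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    det = smm * sxx - smx ** 2
    return (smy * sxx - smx * sxy) / det

def bootstrap_indirect_effect(x, m, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for the indirect effect a*b (sketch).

    a = slope of m on x; b = coefficient of m controlling for x.
    Cases are resampled with replacement and a*b re-estimated each time.
    """
    rng = random.Random(seed)
    n = len(x)
    ests = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        ests.append(_slope(xs, ms) * _coef_of_m(xs, ms, ys))
    ests.sort()
    return ests[int(alpha / 2 * n_boot)], ests[int((1 - alpha / 2) * n_boot) - 1]
```

A CI for a·b that excludes zero supports mediation without assuming normality of the product term, which is why the bootstrap is preferred here over the parametric Sobel test.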
Confidence intervals for single-case effect size measures based on randomization test inversion.
Michiels, Bart; Heyvaert, Mieke; Meulders, Ann; Onghena, Patrick
2017-02-01
In the current paper, we present a method to construct nonparametric confidence intervals (CIs) for single-case effect size measures in the context of various single-case designs. We use the relationship between a two-sided statistical hypothesis test at significance level α and a 100 (1 - α) % two-sided CI to construct CIs for any effect size measure θ that contain all point null hypothesis θ values that cannot be rejected by the hypothesis test at significance level α. This method of hypothesis test inversion (HTI) can be employed using a randomization test as the statistical hypothesis test in order to construct a nonparametric CI for θ. We will refer to this procedure as randomization test inversion (RTI). We illustrate RTI in a situation in which θ is the unstandardized and the standardized difference in means between two treatments in a completely randomized single-case design. Additionally, we demonstrate how RTI can be extended to other types of single-case designs. Finally, we discuss a few challenges for RTI as well as possibilities when using the method with other effect size measures, such as rank-based nonoverlap indices. Supplementary to this paper, we provide easy-to-use R code, which allows the user to construct nonparametric CIs according to the proposed method.
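For a two-group completely randomized design, the randomization-test-inversion idea above can be sketched concretely: each candidate effect size θ is subtracted from one group, an exact two-sided randomization test of "no difference" is run, and θ is kept in the CI when the test does not reject. This grid-based sketch is my own simplification, not the authors' R code:

```python
import itertools
import statistics

def rti_confidence_interval(a, b, alpha=0.05, step=0.1):
    """Approximate nonparametric CI for mean(a) - mean(b) by inverting an
    exact randomization test in a two-group completely randomized design.

    Candidate effects are scanned on a coarse grid around the observed
    difference, so the endpoints are only step-accurate (illustrative sketch).
    """
    d = statistics.mean(a) - statistics.mean(b)
    grid = [d + k * step for k in range(-100, 101)]
    n_a, pooled_n = len(a), len(a) + len(b)
    kept = []
    for theta in grid:
        a0 = [x - theta for x in a]          # shift so H0 becomes "no difference"
        pooled = a0 + b
        obs = abs(statistics.mean(a0) - statistics.mean(b))
        exceed = total = 0
        # enumerate every possible assignment of n_a units to "group a"
        for idx in itertools.combinations(range(pooled_n), n_a):
            sel = set(idx)
            ga = [pooled[i] for i in sel]
            gb = [pooled[i] for i in range(pooled_n) if i not in sel]
            if abs(statistics.mean(ga) - statistics.mean(gb)) >= obs - 1e-12:
                exceed += 1
            total += 1
        if exceed / total > alpha:           # theta not rejected -> inside CI
            kept.append(theta)
    return min(kept), max(kept)
```

The interval necessarily contains the observed difference, and its exactness is inherited from the randomization test; with very small samples the attainable α levels are coarse, a limitation the paper discusses.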
Muñoz–Negrete, Francisco J.; Oblanca, Noelia; Rebolleda, Gema
2018-01-01
Purpose To study the structure-function relationship in glaucoma and healthy patients assessed with Spectralis OCT and Humphrey perimetry using new statistical approaches. Materials and Methods Eighty-five eyes were prospectively selected and divided into 2 groups: glaucoma (44) and healthy patients (41). Three different statistical approaches were carried out: (1) factor analysis of the threshold sensitivities (dB) (automated perimetry) and the macular thickness (μm) (Spectralis OCT), subsequently applying Pearson's correlation to the obtained regions, (2) nonparametric regression analysis relating the values in each pair of regions that showed significant correlation, and (3) nonparametric spatial regressions using three models designed for the purpose of this study. Results In the glaucoma group, a map that relates structural and functional damage was drawn. The strongest correlation with visual fields was observed in the peripheral nasal region of both superior and inferior hemigrids (r = 0.602 and r = 0.458, resp.). The estimated functions obtained with the nonparametric regressions provided the mean sensitivity that corresponds to each given macular thickness. These functions allowed for accurate characterization of the structure-function relationship. Conclusions Both maps and point-to-point functions obtained linking structure and function damage contribute to a better understanding of this relationship and may help in the future to improve glaucoma diagnosis. PMID:29850196
kruX: matrix-based non-parametric eQTL discovery.
Qi, Jianlong; Asl, Hassan Foroughi; Björkegren, Johan; Michoel, Tom
2014-01-14
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com.
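The matrix-multiplication trick behind kruX can be illustrated in a simplified form: rank each trait, build a one-hot genotype indicator matrix, and obtain all per-group rank sums for every trait with a single product. This sketch (no tie correction, single marker) is not the kruX code itself:

```python
import numpy as np
from scipy.stats import rankdata, chi2

def kruskal_wallis_matrix(expr, genotype):
    """Vectorized Kruskal-Wallis H for many traits against one marker (sketch).

    expr: (T, N) array, one row per expression trait; genotype: (N,) int labels.
    Ties are not corrected for, so this is illustrative rather than exact.
    """
    T, N = expr.shape
    ranks = np.apply_along_axis(rankdata, 1, expr)            # (T, N) mid-ranks
    groups = np.unique(genotype)
    G = (genotype[:, None] == groups[None, :]).astype(float)  # (N, k) one-hot
    n_g = G.sum(axis=0)                                       # group sizes
    S = ranks @ G                                             # (T, k) rank sums
    H = 12.0 / (N * (N + 1)) * ((S ** 2) / n_g).sum(axis=1) - 3.0 * (N + 1)
    p = chi2.sf(H, df=len(groups) - 1)
    return H, p
```

Because the rank-sum step is one dense matrix product, all T traits are tested at once instead of looping trait by trait, which is the source of the speed-up the abstract reports.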
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kalsi, G.; Read, T.; Butler, R.
A possible linkage to a genetic subtype of schizophrenia and related disorders has been reported on the long arm of chromosome 22 at q12-13. However, formal statistical tests in a combined sample could not reject homogeneity and prove that there was a linked subgroup of families. We have studied 23 schizophrenia pedigrees to test whether some multiplex schizophrenia families may be linked to the microsatellite markers D22S274 and D22S283, which span the 22q12-13 region. Two-point followed by multipoint lod and non-parametric linkage analyses under the assumption of heterogeneity provided no evidence for linkage over the relevant region. 16 refs., 4 tabs.
Finke, Erinn H; Wilkinson, Krista M; Hickerson, Benjamin D
2017-02-01
The purpose of this study was to understand the social referencing behaviors of children with and without autism spectrum disorder (ASD) while visually attending to a videogame stimulus depicting both the face of the videogame player and the videogame play action. Videogames appear to offer a uniquely well-suited environment for the emergence of friendships, but it is not known if children with and without ASD attend to and play videogames similarly. Eyetracking technology was used to investigate visual attention of participants matched based on chronological age. Parametric and nonparametric statistical analyses were used and results indicated the groups did not differ on percentage of time spent visually attending to any of the areas of interest, with one possible exception.
Halliday, David M; Senik, Mohd Harizal; Stevenson, Carl W; Mason, Rob
2016-08-01
The ability to infer network structure from multivariate neuronal signals is central to computational neuroscience. Directed network analyses typically use parametric approaches based on auto-regressive (AR) models, where networks are constructed from estimates of AR model parameters. However, the validity of using low-order AR models for neurophysiological signals has been questioned. A recent article introduced a non-parametric approach to estimate directionality in bivariate data; non-parametric approaches are free from concerns over model validity. We extend the non-parametric framework to include measures of directed conditional independence, using scalar measures that decompose the overall partial correlation coefficient summatively by direction, and a set of functions that decompose the partial coherence summatively by direction. A time domain partial correlation function allows both time and frequency views of the data to be constructed. The conditional independence estimates are conditioned on a single predictor. The framework is applied to simulated cortical neuron networks, to mixtures of Gaussian time series data with known interactions, and to experimental data consisting of local field potential recordings from bilateral hippocampus in anaesthetised rats. The framework offers a novel non-parametric alternative for estimating directed interactions in multivariate neuronal recordings, with increased flexibility in dealing with both spike train and time series data. Copyright © 2016 Elsevier B.V. All rights reserved.
Astigmatism and early academic readiness in preschool children.
Orlansky, Gale; Wilmer, Jeremy; Taub, Marc B; Rutner, Daniella; Ciner, Elise; Gryczynski, Jan
2015-03-01
This study investigated the relationship between uncorrected astigmatism and early academic readiness in at-risk preschool-aged children. A vision screening and academic records review were performed on 122 three- to five-year-old children enrolled in the Philadelphia Head Start program. Vision screening results were related to two measures of early academic readiness, the teacher-reported Work Sampling System (WSS) and the parent-reported Ages and Stages Questionnaire (ASQ). Both measures assess multiple developmental and skill domains thought to be related to academic readiness. Children with astigmatism (defined as >|-0.25| in either eye) were compared with children who had no astigmatism. Associations between astigmatism and specific subscales of the WSS and ASQ were examined using parametric and nonparametric bivariate statistics and regression analyses controlling for age and spherical refractive error. Presence of astigmatism was negatively associated with multiple domains of academic readiness. Children with astigmatism had significantly lower mean scores on Personal and Social Development, Language and Literacy, and Physical Development domains of the WSS, and on Personal/Social, Communication, and Fine Motor domains of the ASQ. These differences between children with astigmatism and children with no astigmatism persisted after statistically adjusting for age and magnitude of spherical refractive error. Nonparametric tests corroborated these findings for the Language and Literacy and Physical Health and Development domains of the WSS and the Communication domain of the ASQ. The presence of astigmatism detected in a screening setting was associated with a pattern of reduced academic readiness in multiple developmental and educational domains among at-risk preschool-aged children. 
This study may help to establish the role of early vision screenings, comprehensive vision examinations, and the need for refractive correction to improve academic success in preschool children.
Mitra, Rajib; Jordan, Michael I.; Dunbrack, Roland L.
2010-01-01
Distributions of the backbone dihedral angles of proteins have been studied for over 40 years. While many statistical analyses have been presented, only a handful of probability densities are publicly available for use in structure validation and structure prediction methods. The available distributions differ in a number of important ways, which determine their usefulness for various purposes. These include: 1) input data size and criteria for structure inclusion (resolution, R-factor, etc.); 2) filtering of suspect conformations and outliers using B-factors or other features; 3) secondary structure of input data (e.g., whether helix and sheet are included; whether beta turns are included); 4) the method used for determining probability densities ranging from simple histograms to modern nonparametric density estimation; and 5) whether they include nearest neighbor effects on the distribution of conformations in different regions of the Ramachandran map. In this work, Ramachandran probability distributions are presented for residues in protein loops from a high-resolution data set with filtering based on calculated electron densities. Distributions for all 20 amino acids (with cis and trans proline treated separately) have been determined, as well as 420 left-neighbor and 420 right-neighbor dependent distributions. The neighbor-independent and neighbor-dependent probability densities have been accurately estimated using Bayesian nonparametric statistical analysis based on the Dirichlet process. In particular, we used hierarchical Dirichlet process priors, which allow sharing of information between densities for a particular residue type and different neighbor residue types. The resulting distributions are tested in a loop modeling benchmark with the program Rosetta, and are shown to improve protein loop conformation prediction significantly. The distributions are available at http://dunbrack.fccc.edu/hdp. PMID:20442867
Marko, Nicholas F.; Weil, Robert J.
2012-01-01
Introduction Gene expression data is often assumed to be normally-distributed, but this assumption has not been tested rigorously. We investigate the distribution of expression data in human cancer genomes and study the implications of deviations from the normal distribution for translational molecular oncology research. Methods We conducted a central moments analysis of five cancer genomes and performed empiric distribution fitting to examine the true distribution of expression data both on the complete-experiment and on the individual-gene levels. We used a variety of parametric and nonparametric methods to test the effects of deviations from normality on gene calling, functional annotation, and prospective molecular classification using a sixth cancer genome. Results Central moments analyses reveal statistically-significant deviations from normality in all of the analyzed cancer genomes. We observe as much as 37% variability in gene calling, 39% variability in functional annotation, and 30% variability in prospective, molecular tumor subclassification associated with this effect. Conclusions Cancer gene expression profiles are not normally-distributed, either on the complete-experiment or on the individual-gene level. Instead, they exhibit complex, heavy-tailed distributions characterized by statistically-significant skewness and kurtosis. The non-Gaussian distribution of this data affects identification of differentially-expressed genes, functional annotation, and prospective molecular classification. These effects may be reduced in some circumstances, although not completely eliminated, by using nonparametric analytics. This analysis highlights two unreliable assumptions of translational cancer gene expression analysis: that “small” departures from normality in the expression data distributions are analytically-insignificant and that “robust” gene-calling algorithms can fully compensate for these effects. PMID:23118863
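A central-moments normality check of the kind described can be sketched with SciPy; D'Agostino's normaltest combines the skewness and kurtosis tests into a single statistic. The heavy-tailed Student-t sample below is a synthetic stand-in for expression values, not the study's cancer genomes:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
normal_like = rng.normal(size=5000)
heavy_tailed = rng.standard_t(df=3, size=5000)  # heavy-tailed stand-in for expression data

# normaltest combines skewness and kurtosis into one chi-square statistic
_, p_norm = stats.normaltest(normal_like)
_, p_heavy = stats.normaltest(heavy_tailed)

print(f"skewness={stats.skew(heavy_tailed):.2f}, "
      f"excess kurtosis={stats.kurtosis(heavy_tailed):.2f}")
```

A tiny p-value for the heavy-tailed sample illustrates how "small" departures in skewness and kurtosis are readily detectable at genome-scale sample sizes.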
Statistical modelling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1991-01-01
During the six-month period from 1 April 1991 to 30 September 1991 the following research papers in statistical modeling of software reliability appeared: (1) A Nonparametric Software Reliability Growth Model; (2) On the Use and the Performance of Software Reliability Growth Models; (3) Research and Development Issues in Software Reliability Engineering; (4) Special Issues on Software; and (5) Software Reliability and Safety.
ERIC Educational Resources Information Center
Schochet, Peter Z.
2015-01-01
This report presents the statistical theory underlying the "RCT-YES" software that estimates and reports impacts for RCTs for a wide range of designs used in social policy research. The report discusses a unified, non-parametric design-based approach for impact estimation using the building blocks of the Neyman-Rubin-Holland causal…
Feder, Paul I; Ma, Zhenxu J; Bull, Richard J; Teuschler, Linda K; Rice, Glenn
2009-01-01
In chemical mixtures risk assessment, the use of dose-response data developed for one mixture to estimate risk posed by a second mixture depends on whether the two mixtures are sufficiently similar. While evaluations of similarity may be made using qualitative judgments, this article uses nonparametric statistical methods based on the "bootstrap" resampling technique to address the question of similarity among mixtures of chemical disinfectant by-products (DBP) in drinking water. The bootstrap resampling technique is a general-purpose, computer-intensive approach to statistical inference that substitutes empirical sampling for theoretically based parametric mathematical modeling. Nonparametric, bootstrap-based inference involves fewer assumptions than parametric normal theory based inference. The bootstrap procedure is appropriate, at least in an asymptotic sense, whether or not the parametric, distributional assumptions hold, even approximately. The statistical analysis procedures in this article are initially illustrated with data from 5 water treatment plants (Schenck et al., 2009), and then extended using data developed from a study of 35 drinking-water utilities (U.S. EPA/AMWA, 1989), which permits inclusion of a greater number of water constituents and increased structure in the statistical models.
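The bootstrap comparison can be sketched generically: resample each group with replacement and form a percentile interval for the difference in a summary statistic, with no distributional assumptions. The DBP concentration values below are hypothetical, and this is not the authors' analysis code:

```python
import numpy as np

def bootstrap_diff_ci(a, b, stat=np.median, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for stat(a) - stat(b)."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # resample each group with replacement, recompute the statistic
        diffs[i] = stat(rng.choice(a, size=a.size)) - stat(rng.choice(b, size=b.size))
    return np.quantile(diffs, [alpha / 2, 1 - alpha / 2])

# hypothetical DBP concentrations (ug/L) from two treatment plants
plant_a = np.array([41.0, 38.5, 44.2, 40.1, 39.9, 43.0, 42.5, 37.8])
plant_b = np.array([30.2, 28.9, 33.1, 31.5, 29.7, 32.4, 30.8, 31.1])

lo, hi = bootstrap_diff_ci(plant_a, plant_b)
# an interval excluding 0 argues against treating the two mixtures as similar
```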
On an additive partial correlation operator and nonparametric estimation of graphical models.
Lee, Kuang-Yao; Li, Bing; Zhao, Hongyu
2016-09-01
We introduce an additive partial correlation operator as an extension of partial correlation to the nonlinear setting, and use it to develop a new estimator for nonparametric graphical models. Our graphical models are based on additive conditional independence, a statistical relation that captures the spirit of conditional independence without having to resort to high-dimensional kernels for its estimation. The additive partial correlation operator completely characterizes additive conditional independence, and has the additional advantage of putting marginal variation on appropriate scales when evaluating interdependence, which leads to more accurate statistical inference. We establish the consistency of the proposed estimator. Through simulation experiments and analysis of the DREAM4 Challenge dataset, we demonstrate that our method performs better than existing methods in cases where the Gaussian or copula Gaussian assumption does not hold, and that a more appropriate scaling for our method further enhances its performance.
Bayesian Nonparametric Ordination for the Analysis of Microbial Communities.
Ren, Boyu; Bacallado, Sergio; Favaro, Stefano; Holmes, Susan; Trippa, Lorenzo
2017-01-01
Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters that capture and describe variations of OTU counts across biological samples. It remains important to evaluate how uncertainty in estimates of each biological sample's microbial distribution propagates to ordination analyses, including visualization of clusters and projections of biological samples on low dimensional spaces. We propose a Bayesian analysis for dependent distributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior, the dependence and similarity between microbial distributions is represented by latent factors that concentrate in a low dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariate data analysis techniques we can visualize credible regions in ecological ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets.
Ceppi, Marcello; Gallo, Fabio; Bonassi, Stefano
2011-01-01
The most common study design in population studies based on the micronucleus (MN) assay is the cross-sectional study, largely performed to evaluate the DNA-damaging effects of exposure to genotoxic agents in the workplace, in the environment, and from diet or lifestyle factors. Sample size is still a critical issue in the design of MN studies, since most recent studies considering gene-environment interaction often require a sample size of several hundred subjects, which is in many cases difficult to achieve. The control of confounding is another major threat to the validity of causal inference. The most popular confounders considered in population studies using MN are age, gender and smoking habit. Extensive attention is given to the assessment of effect modification, given the increasing inclusion of biomarkers of genetic susceptibility in the study design. Selected issues concerning the statistical treatment of data are addressed in this mini-review, starting from data description, a critical step of statistical analysis since it allows possible errors in the dataset to be detected and the validity of assumptions required for more complex analyses to be checked. Basic issues in the statistical analysis of biomarkers are extensively evaluated, including methods to explore the dose-response relationship between two continuous variables and inferential analysis. A critical approach to the use of parametric and non-parametric methods is presented before addressing the issue of the most suitable multivariate models to fit MN data. In the last decade, the quality of statistical analysis of MN data has certainly evolved, although even nowadays only a small number of studies apply the Poisson model, which is the most suitable method for the analysis of MN data.
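The Poisson model recommended above for MN counts can be sketched as a log-linear regression fit by iteratively reweighted least squares (IRLS). The exposure data below are hypothetical, and the routine is a generic sketch rather than any software used in MN studies:

```python
import numpy as np

def poisson_irls(X, y, n_iter=25):
    """Poisson log-linear regression via iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        z = X @ beta + (y - mu) / mu   # working response
        XtW = X.T * mu                 # weights are diag(mu)
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# hypothetical MN counts: the rate doubles for exposed subjects
rng = np.random.default_rng(2)
exposed = np.repeat([0.0, 1.0], 200)
X = np.column_stack([np.ones(400), exposed])
y = rng.poisson(np.exp(0.5 + np.log(2.0) * exposed))

beta = poisson_irls(X, y)
rate_ratio = np.exp(beta[1])   # estimated exposure effect; true value is 2.0
```

Exponentiating the coefficient gives the MN rate ratio, the natural effect measure for count outcomes.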
Does History Repeat Itself? Wavelets and the Phylodynamics of Influenza A
Tom, Jennifer A.; Sinsheimer, Janet S.; Suchard, Marc A.
2012-01-01
Unprecedented global surveillance of viruses will result in massive sequence data sets that require new statistical methods. These data sets press the limits of Bayesian phylogenetics as the high-dimensional parameters that comprise a phylogenetic tree increase the already sizable computational burden of these techniques. This burden often results in partitioning the data set, for example, by gene, and inferring the evolutionary dynamics of each partition independently, a compromise that results in stratified analyses that depend only on data within a given partition. However, parameter estimates inferred from these stratified models are likely strongly correlated, considering they rely on data from a single data set. To overcome this shortfall, we exploit the existing Monte Carlo realizations from stratified Bayesian analyses to efficiently estimate a nonparametric hierarchical wavelet-based model and learn about the time-varying parameters of effective population size that reflect levels of genetic diversity across all partitions simultaneously. Our methods are applied to complete genome influenza A sequences that span 13 years. We find that broad peaks and trends, as opposed to seasonal spikes, in the effective population size history distinguish individual segments from the complete genome. We also address hypotheses regarding intersegment dynamics within a formal statistical framework that accounts for correlation between segment-specific parameters. PMID:22160768
Rasova, Kamila; Prochazkova, Marie; Tintera, Jaroslav; Ibrahim, Ibrahim; Zimova, Denisa; Stetkarova, Ivana
2015-03-01
There is still little scientific evidence for the efficacy of neurofacilitation approaches and their possible influence on brain plasticity and adaptability. In this study, the outcome of a new kind of neurofacilitation approach, motor programme activating therapy (MPAT), was evaluated on the basis of a set of clinical functions and with MRI. Eighteen patients were examined four times with standardized clinical tests and diffusion tensor imaging to monitor changes without therapy, immediately after therapy and 1 month after therapy. Moreover, the strength of effective connectivity was analysed before and after therapy. Patients underwent a 1-h session of MPAT twice a week for 2 months. The data were statistically evaluated with nonparametric tests of association. The therapy led to significant improvement in clinical functions, a significant increase in fractional anisotropy, a significant decrease in mean diffusivity, and a decrease in effective connectivity at supplementary motor areas immediately after the therapy. Changes in clinical functions and diffusion tensor images persisted 1 month after completing the programme. No statistically significant changes in clinical functions and no differences in MRI diffusion tensor images were observed without physiotherapy. Positive immediate and long-term effects of MPAT on clinical and brain functions, as well as brain microstructure, were confirmed.
Impact of Business Cycles on US Suicide Rates, 1928–2007
Florence, Curtis S.; Quispe-Agnoli, Myriam; Ouyang, Lijing; Crosby, Alexander E.
2011-01-01
Objectives. We examined the associations of overall and age-specific suicide rates with business cycles from 1928 to 2007 in the United States. Methods. We conducted a graphical analysis of changes in suicide rates during business cycles, used nonparametric analyses to test associations between business cycles and suicide rates, and calculated correlations between the national unemployment rate and suicide rates. Results. Graphical analyses showed that the overall suicide rate generally rose during recessions and fell during expansions. Age-specific suicide rates responded differently to recessions and expansions. Nonparametric tests indicated that the overall suicide rate and the suicide rates of the groups aged 25 to 34 years, 35 to 44 years, 45 to 54 years, and 55 to 64 years rose during contractions and fell during expansions. Suicide rates of the groups aged 15 to 24 years, 65 to 74 years, and 75 years and older did not exhibit this behavior. Correlation results were concordant with all nonparametric results except for the group aged 65 to 74 years. Conclusions. Business cycles may affect suicide rates, although different age groups responded differently. Our findings suggest that public health responses are a necessary component of suicide prevention during recessions. PMID:21493938
Sample Skewness as a Statistical Measurement of Neuronal Tuning Sharpness
Samonds, Jason M.; Potetz, Brian R.; Lee, Tai Sing
2014-01-01
We propose using the statistical measurement of the sample skewness of the distribution of mean firing rates of a tuning curve to quantify sharpness of tuning. For some features, like binocular disparity, tuning curves are best described by relatively complex and sometimes diverse functions, making it difficult to quantify sharpness with a single function and parameter. Skewness provides a robust nonparametric measure of tuning curve sharpness that is invariant with respect to the mean and variance of the tuning curve and is straightforward to apply to a wide range of tuning, including simple orientation tuning curves and complex object tuning curves that often cannot even be described parametrically. Because skewness does not depend on a specific model or function of tuning, it is especially appealing to cases of sharpening where recurrent interactions among neurons produce sharper tuning curves that deviate in a complex manner from the feedforward function of tuning. Since tuning curves for all neurons are not typically well described by a single parametric function, this model independence additionally allows skewness to be applied to all recorded neurons, maximizing the statistical power of a set of data. We also compare skewness with other nonparametric measures of tuning curve sharpness and selectivity. Compared to these other nonparametric measures tested, skewness is best used for capturing the sharpness of multimodal tuning curves defined by narrow peaks (maximum) and broad valleys (minima). Finally, we provide a more formal definition of sharpness using a shape-based information gain measure and derive and show that skewness is correlated with this definition. PMID:24555451
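The idea of skewness as a sharpness index can be illustrated on two hypothetical von Mises-style tuning curves: a narrow peak over a broad valley yields a strongly right-skewed distribution of mean firing rates, while a broad peak does not. The curves and rates below are synthetic, not recorded data:

```python
import numpy as np
from scipy.stats import skew

# two hypothetical tuning curves over 16 stimulus values (mean firing rates, Hz)
theta = np.linspace(-np.pi, np.pi, 16, endpoint=False)
sharp = 2.0 + 30.0 * np.exp((np.cos(theta) - 1) / 0.1)   # narrow peak, broad valley
broad = 2.0 + 30.0 * np.exp((np.cos(theta) - 1) / 2.0)   # wide peak

# sample skewness of the firing-rate distribution: larger => sharper tuning
print(skew(sharp), skew(broad))
```

Because the measure depends only on the distribution of rates, the same line of code applies unchanged to multimodal object tuning curves with no parametric form.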
Comparing Pixel- and Object-Based Approaches in Effectively Classifying Wetland-Dominated Landscapes
Berhane, Tedros M.; Lane, Charles R.; Wu, Qiusheng; Anenkhonov, Oleg A.; Chepinoga, Victor V.; Autrey, Bradley C.; Liu, Hongxing
2018-01-01
Wetland ecosystems straddle both terrestrial and aquatic habitats, performing many ecological functions directly and indirectly benefitting humans. However, global wetland losses are substantial. Satellite remote sensing and classification informs wise wetland management and monitoring. Both pixel- and object-based classification approaches using parametric and non-parametric algorithms may be effectively used in describing wetland structure and habitat, but which approach should one select? We conducted both pixel- and object-based image analyses (OBIA) using parametric (Iterative Self-Organizing Data Analysis Technique, ISODATA, and maximum likelihood, ML) and non-parametric (random forest, RF) approaches in the Barguzin Valley, a large wetland (~500 km2) in the Lake Baikal, Russia, drainage basin. Four Quickbird multispectral bands plus various spatial and spectral metrics (e.g., texture, Non-Differentiated Vegetation Index, slope, aspect, etc.) were analyzed using field-based regions of interest sampled to characterize an initial 18 ISODATA-based classes. Parsimoniously using a three-layer stack (Quickbird band 3, water ratio index (WRI), and mean texture) in the analyses resulted in the highest accuracy, 87.9% with pixel-based RF, followed by OBIA RF (segmentation scale 5, 84.6% overall accuracy), followed by pixel-based ML (83.9% overall accuracy). Increasing the predictors from three to five by adding Quickbird bands 2 and 4 decreased the pixel-based overall accuracy while increasing the OBIA RF accuracy to 90.4%. However, McNemar’s chi-square test confirmed no statistically significant difference in overall accuracy among the classifiers (pixel-based ML, RF, or object-based RF) for either the three- or five-layer analyses. Although potentially useful in some circumstances, the OBIA approach requires substantial resources and user input (such as segmentation scale selection—which was found to substantially affect overall accuracy). 
Hence, we conclude that pixel-based RF approaches are likely satisfactory for classifying wetland-dominated landscapes. PMID:29707381
Shete, Sanjay; Lau, Ching C; Houlston, Richard S; Claus, Elizabeth B; Barnholtz-Sloan, Jill; Lai, Rose; Il’yasova, Dora; Schildkraut, Joellen; Sadetzki, Siegal; Johansen, Christoffer; Bernstein, Jonine L; Olson, Sara H; Jenkins, Robert B; Yang, Ping; Vick, Nicholas A; Wrensch, Margaret; Davis, Faith G; McCarthy, Bridget J; Leung, Eastwood Hon-chiu; Davis, Caleb; Cheng, Rita; Hosking, Fay J; Armstrong, Georgina N; Liu, Yanhong; Yu, Robert K; Henriksson, Roger; Consortium, The Gliogene; Melin, Beatrice S; Bondy, Melissa L
2011-01-01
Gliomas, which generally have a poor prognosis, are the most common primary malignant brain tumors in adults. Recent genome-wide association studies have demonstrated that inherited susceptibility plays a role in the development of glioma. Although first-degree relatives of patients exhibit a two-fold increased risk of glioma, the search for susceptibility loci in familial forms of the disease has been challenging because the disease is relatively rare, fatal, and heterogeneous, making it difficult to collect sufficient biosamples from families for statistical power. To address this challenge, the Genetic Epidemiology of Glioma International Consortium (Gliogene) was formed to collect DNA samples from families with two or more cases of histologically confirmed glioma. In this study, we present results obtained from 46 U.S. families in which multipoint linkage analyses were undertaken using nonparametric (model-free) methods. After removal of high linkage disequilibrium SNPs, we obtained a maximum nonparametric linkage score (NPL) of 3.39 (P=0.0005) at 17q12–21.32 and the Z-score of 4.20 (P=0.000007). To replicate our findings, we genotyped 29 independent U.S. families and obtained a maximum NPL score of 1.26 (P=0.008) and the Z-score of 1.47 (P=0.035). Accounting for the genetic heterogeneity using the ordered subset analysis approach, the combined analyses of 75 families resulted in a maximum NPL score of 3.81 (P=0.00001). The genomic regions we have implicated in this study may offer novel insights into glioma susceptibility, focusing future work to identify genes that cause familial glioma. PMID:22037877
Tips and Tricks for Successful Application of Statistical Methods to Biological Data.
Schlenker, Evelyn
2016-01-01
This chapter discusses experimental design and use of statistics to describe characteristics of data (descriptive statistics) and inferential statistics that test the hypothesis posed by the investigator. Inferential statistics, based on probability distributions, depend upon the type and distribution of the data. For data that are continuous, randomly and independently selected, and normally distributed, more powerful parametric tests such as Student's t test and analysis of variance (ANOVA) can be used. For non-normally distributed or skewed data, transformation of the data (using logarithms) may normalize the data, allowing use of parametric tests. Alternatively, with skewed data nonparametric tests can be utilized, some of which rely on data that are ranked prior to statistical analysis. Experimental designs and analyses need to balance between committing type 1 errors (false positives) and type 2 errors (false negatives). For a variety of clinical studies that determine risk or benefit, relative risk ratios (random clinical trials and cohort studies) or odds ratios (case-control studies) are utilized. Although both use 2 × 2 tables, their premise and calculations differ. Finally, special statistical methods are applied to microarray and proteomics data, since the large number of genes or proteins evaluated increases the likelihood of false discoveries. Additional studies in separate samples are used to verify microarray and proteomic data. Examples in this chapter and references are available to help continued investigation of experimental designs and appropriate data analysis.
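The log-transform strategy described above can be sketched as follows; the lognormal sample is hypothetical, standing in for a right-skewed biological measurement:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
raw = rng.lognormal(mean=1.0, sigma=0.8, size=500)  # skewed biological measurements

logged = np.log(raw)   # log transform often normalizes right-skewed data

print(stats.skew(raw), stats.skew(logged))
# if the transform fails to normalize, fall back to a rank-based test, e.g.
# stats.mannwhitneyu(group_a, group_b)
```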
The Importance of Practice in the Development of Statistics.
1983-01-01
NRC Technical Summary Report #2471: The Importance of Practice in the Development of Statistics. Topics include component analysis, bioassay, limits for a ratio, quality control, sampling inspection, non-parametric tests, transformation theory, ARIMA time series models, sequential tests, cumulative sum charts, data analysis plotting techniques, and a resolution of the Bayes-frequentist controversy.
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2010-07-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root- n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided.
Kang, Le; Chen, Weijie; Petrick, Nicholas A.; Gallas, Brandon D.
2014-01-01
The area under the receiver operating characteristic (ROC) curve (AUC) is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of AUC, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, of which they could either be two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. PMID:25399736
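Harrell's C index itself has a simple direct form: among all usable pairs (where the subject with the shorter time has an observed event), it is the fraction in which the higher-risk subject fails earlier. A minimal O(n²) sketch with hypothetical data follows; it does not reproduce the authors' U-statistics variance estimator or z-score test:

```python
import numpy as np

def harrell_c(risk, time, event):
    """Harrell's C index: fraction of usable pairs in which the higher-risk
    subject fails earlier. event=1 means observed failure, 0 means censored."""
    concordant, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        for j in range(n):
            # pair is usable if subject i is observed to fail before time[j]
            if event[i] == 1 and time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5   # ties in risk count half
    return concordant / usable

# hypothetical biomarker: higher risk scores should track shorter survival
time  = np.array([2.0, 5.0, 6.0, 8.0, 11.0, 14.0])
event = np.array([1,   1,   0,   1,   1,    0])
risk  = np.array([0.9, 0.7, 0.6, 0.5, 0.3,  0.1])
# perfectly concordant predictions give C = 1.0
```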
kruX: matrix-based non-parametric eQTL discovery
2014-01-01
Background The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. Results We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. Conclusion kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure and is freely available from http://krux.googlecode.com. PMID:24423115
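The matrix trick at the core of kruX can be sketched in NumPy: with per-trait ranks R and a group-indicator matrix, a single product yields the per-group rank sums for every trait at once, from which the Kruskal-Wallis H follows (ties correction omitted). This is a simplified single-marker sketch, not the kruX implementation:

```python
import numpy as np
from scipy.stats import rankdata

def kw_matrix(E, g):
    """Kruskal-Wallis H for every row (trait) of E against grouping g,
    computed with one matrix product (no ties correction)."""
    n = E.shape[1]
    R = np.apply_along_axis(rankdata, 1, E)             # ranks within each trait
    groups = np.unique(g)
    ind = (g[None, :] == groups[:, None]).astype(float)  # groups x samples indicator
    counts = ind.sum(axis=1)
    S = R @ ind.T                                        # traits x groups rank sums
    return 12.0 / (n * (n + 1)) * (S**2 / counts).sum(axis=1) - 3.0 * (n + 1)

rng = np.random.default_rng(4)
g = rng.integers(0, 3, size=60)        # genotype codes 0/1/2 for one marker
E = rng.normal(size=(100, 60))         # 100 expression traits x 60 samples
E[0] += g * 1.5                        # plant one true eQTL association

H = kw_matrix(E, g)                    # H[0] stands out against the null traits
```

For continuous data with no ties this matches the per-trait statistic from scipy.stats.kruskal, while replacing millions of individual calls with a few matrix products.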
Parametric and nonparametric Granger causality testing: Linkages between international stock markets
NASA Astrophysics Data System (ADS)
De Gooijer, Jan G.; Sivarajasingham, Selliah
2008-04-01
This study investigates long-term linear and nonlinear causal linkages among eleven stock markets: six industrialized markets and five emerging markets of South-East Asia. We cover the period 1987-2006, taking into account the onset of the Asian financial crisis of 1997. We first apply a test for the presence of general nonlinearity in vector time series. Substantial differences exist between the pre- and post-crisis periods in terms of the total number of significant nonlinear relationships. We then examine both periods, using a new nonparametric test for Granger noncausality and the conventional parametric Granger noncausality test. One major finding is that the Asian stock markets have become more internationally integrated after the Asian financial crisis. An exception is the Sri Lankan market, with almost no significant long-term linear and nonlinear causal linkages with other markets. To ensure that any causality is strictly nonlinear in nature, we also examine the nonlinear causal relationships of VAR-filtered residuals and VAR-filtered squared residuals for the post-crisis sample. We find quite a few remaining significant bi- and uni-directional causal nonlinear relationships in these series. Finally, after filtering the VAR residuals with GARCH-BEKK models, we show that the nonparametric test statistics are substantially smaller in both magnitude and statistical significance than those before filtering. This indicates that nonlinear causality can, to a large extent, be explained by simple volatility effects.
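The conventional parametric Granger noncausality test used as the baseline above is a nested-OLS F test: does adding lagged x improve the prediction of y beyond y's own lags? A single-lag toy version in pure Python (the authors' nonparametric test is not reproduced; all names and the simulated series are illustrative):

```python
import random

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def rss(X, y):
    """Residual sum of squares of an OLS fit via the normal equations."""
    k = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    b = solve(XtX, Xty)
    return sum((yi - sum(bi * xi for bi, xi in zip(b, r))) ** 2
               for r, yi in zip(X, y))

def granger_f(x, y):
    """F statistic for 'x Granger-causes y' with a single lag:
    compare y[t] ~ y[t-1] against y[t] ~ y[t-1] + x[t-1]."""
    target = y[1:]
    rss_r = rss([[1.0, y[t - 1]] for t in range(1, len(y))], target)
    rss_u = rss([[1.0, y[t - 1], x[t - 1]] for t in range(1, len(y))], target)
    df2 = len(target) - 3                    # n minus unrestricted parameters
    return (rss_r - rss_u) / (rss_u / df2)   # one restriction, so df1 = 1

random.seed(0)
x = [random.random() for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * random.gauss(0, 1) for t in range(1, 200)]
```

Since y is built from lagged x, the F statistic for x → y is large while the reverse direction shows no such effect.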
Combined non-parametric and parametric approach for identification of time-variant systems
NASA Astrophysics Data System (ADS)
Dziedziech, Kajetan; Czop, Piotr; Staszewski, Wieslaw J.; Uhl, Tadeusz
2018-03-01
Identification of systems, structures and machines with variable physical parameters is a challenging task, especially when time-varying vibration modes are involved. The paper proposes a new combined, two-step (non-parametric and parametric) modelling approach to determine time-varying vibration modes based on input-output measurements. In the first step, single-degree-of-freedom (SDOF) vibration modes are extracted from a multi-degree-of-freedom (MDOF) non-parametric system representation with the use of time-frequency wavelet-based filters. The second step involves a time-varying parametric representation of the extracted modes with the use of recursive linear autoregressive-moving-average with exogenous inputs (ARMAX) models. The combined approach is demonstrated using system identification analysis of an experimental mass-varying MDOF frame-like structure subjected to random excitation. The results show that the proposed combined method correctly captures the dynamics of the analysed structure using minimal a priori information on the model.
Dialect Density in Bilingual Puerto Rican Spanish-English Speaking Children
Fabiano-Smith, Leah; Shuriff, Rebecca; Barlow, Jessica A.; Goldstein, Brian A.
2014-01-01
It is still largely unknown how the two phonological systems of bilingual children interact. In this exploratory study, we examine children's use of dialect features to determine how their speech sound systems interact. Six monolingual Puerto Rican Spanish-speaking children and six bilingual Puerto Rican Spanish-English speaking children, ages 5-7 years, were included in the current study. Children's single-word productions were analyzed for (1) dialect density and (2) frequency of occurrence of dialect features (after Oetting & McDonald, 2002). Nonparametric statistical analyses were used to examine differences within and across language groups. Results indicated that monolinguals and bilinguals exhibited similar dialect density, but differed on the types of dialect features used. Findings are discussed within the theoretical framework of the Dual Systems Model (Paradis, 2001) of language acquisition in bilingual children. PMID:25009677
Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs
2011-01-01
This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze their data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers; source reconstruction using dipoles, distributed sources, and beamformers; connectivity analysis; and nonparametric statistical permutation tests at the channel and source levels. The implementation as a toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates reuse in other software packages.
NASA Astrophysics Data System (ADS)
Hastuti, S.; Harijono; Murtini, E. S.; Fibrianto, K.
2018-03-01
This study investigates the use of parametric and non-parametric approaches for the sensory RATA (Rate-All-That-Apply) method. Ledre, a local food product unique to Bojonegoro, was used as the point of interest; 319 panelists were involved in the study. The results showed that ledre is characterized by an easily crushed texture, stickiness in the mouth, a stinging sensation, and ease of swallowing. It also has a strong banana flavour and a brown colour. Compared to eggroll and semprong, ledre shows more variation in taste as well as in roll length. As the RATA questionnaire is designed to collect categorical data, a non-parametric approach is the usual statistical procedure. However, similar results were obtained with the parametric approach, despite the non-normally distributed data. This suggests that the parametric approach can be applicable for consumer studies with large numbers of respondents, even though the data may not satisfy the assumptions of ANOVA (analysis of variance).
Lu, Tao
2016-01-01
Gene regulatory networks (GRNs) capture the interactions between genes, and models of these networks describe gene expression behavior. Such models have many applications; for instance, by characterizing the gene expression mechanisms that cause certain disorders, it would be possible to target those genes to block the progress of the disease. Many biological processes are driven by nonlinear dynamic GRNs. In this article, we propose a nonparametric ordinary differential equation (ODE) model for nonlinear dynamic GRNs. Specifically, we address the following questions simultaneously: (i) extract information from noisy time-course gene expression data; (ii) model the nonlinear ODE through a nonparametric smoothing function; (iii) identify the important regulatory gene(s) through a group smoothly clipped absolute deviation (SCAD) approach; (iv) test the robustness of the model against possible shortening of the experimental duration. We illustrate the usefulness of the model and the associated statistical methods through a simulation example and a real application.
Linkage mapping of beta 2 EEG waves via non-parametric regression.
Ghosh, Saurabh; Begleiter, Henri; Porjesz, Bernice; Chorlian, David B; Edenberg, Howard J; Foroud, Tatiana; Goate, Alison; Reich, Theodore
2003-04-01
Parametric linkage methods for analyzing quantitative trait loci are sensitive to violations in trait distributional assumptions. Non-parametric methods are relatively more robust. In this article, we modify the non-parametric regression procedure proposed by Ghosh and Majumder [2000: Am J Hum Genet 66:1046-1061] to map Beta 2 EEG waves using genome-wide data generated in the COGA project. Significant linkage findings are obtained on chromosomes 1, 4, 5, and 15 with findings at multiple regions on chromosomes 4 and 15. We analyze the data both with and without incorporating alcoholism as a covariate. We also test for epistatic interactions between regions of the genome exhibiting significant linkage with the EEG phenotypes and find evidence of epistatic interactions between a region each on chromosome 1 and chromosome 4 with one region on chromosome 15. While regressing out the effect of alcoholism does not affect the linkage findings, the epistatic interactions become statistically insignificant. Copyright 2003 Wiley-Liss, Inc.
Vexler, Albert; Tanajian, Hovig; Hutson, Alan D
In practice, parametric likelihood-ratio techniques are powerful statistical tools. In this article, we propose and examine novel and simple distribution-free test statistics that efficiently approximate parametric likelihood ratios to analyze and compare distributions of K groups of observations. Using the density-based empirical likelihood methodology, we develop a Stata package that applies to a test for symmetry of data distributions and compares K-sample distributions. Recognizing that recent statistical software packages do not sufficiently address K-sample nonparametric comparisons of data distributions, we propose a new Stata command, vxdbel, to execute exact density-based empirical likelihood-ratio tests using K samples. To calculate p-values of the proposed tests, we use the following methods: 1) a classical technique based on Monte Carlo p-value evaluations; 2) an interpolation technique based on tabulated critical values; and 3) a new hybrid technique that combines methods 1 and 2. The third, cutting-edge method is shown to be very efficient in the context of exact-test p-value computations. This Bayesian-type method considers tabulated critical values as prior information and Monte Carlo generations of test statistic values as data used to depict the likelihood function. In this case, a nonparametric Bayesian method is proposed to compute critical values of exact tests.
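The first of the three p-value methods listed above, straight Monte Carlo evaluation under the null, can be sketched generically in Python. The example statistic is a simple sign-flip test of symmetry about zero; the density-based empirical likelihood statistic itself is not reproduced, and all names and data are illustrative.

```python
import random

def mc_pvalue(stat_obs, stat_fn, null_sampler, n_mc=2000, seed=0):
    """Monte Carlo p-value: share of null-simulated statistics at least
    as extreme as the observed one (add-one correction keeps p > 0)."""
    rng = random.Random(seed)
    hits = sum(stat_fn(null_sampler(rng)) >= stat_obs for _ in range(n_mc))
    return (hits + 1) / (n_mc + 1)

# Example: test symmetry about zero with |sample mean| as the statistic;
# the null is simulated by randomly flipping signs of the observed values.
data = [-2.1, -1.0, -0.4, 0.3, 0.9, 1.8, 2.2]
stat = lambda xs: abs(sum(xs) / len(xs))
flip = lambda rng: [v * rng.choice((-1.0, 1.0)) for v in data]
p = mc_pvalue(stat(data), stat, flip)
```

For this roughly symmetric sample the p-value is unremarkable, so symmetry is not rejected; the tabulated-critical-value and hybrid Bayesian-type methods in the paper exist precisely to avoid paying this simulation cost at every call.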
Theodorsson-Norheim, E
1986-08-01
Multiple t tests at a fixed p level are frequently used to analyse biomedical data where analysis of variance followed by multiple comparisons, or the adjustment of the p values according to Bonferroni, would be more appropriate. The Kruskal-Wallis test is a nonparametric 'analysis of variance' which may be used to compare several independent samples. The present program is written in an elementary subset of BASIC and will perform the Kruskal-Wallis test followed by multiple comparisons between the groups on practically any computer programmable in BASIC.
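The Bonferroni adjustment recommended above over repeated fixed-level t tests is a one-liner; a sketch in Python (the p-values are illustrative):

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply each by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

# Three pairwise comparisons: only the first survives adjustment at the 0.05 level.
adjusted = bonferroni([0.01, 0.04, 0.30])
```

A raw p of 0.04 looks significant in isolation but, adjusted for three comparisons, becomes 0.12, which is the abstract's point about multiple t tests at a fixed level.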
Nonparametric statistical modeling of binary star separations
NASA Technical Reports Server (NTRS)
Heacox, William D.; Gathright, John
1994-01-01
We develop a comprehensive statistical model for the distribution of observed separations in binary star systems, in terms of distributions of orbital elements, projection effects, and distances to systems. We use this model to derive several diagnostics for estimating the completeness of imaging searches for stellar companions, and the underlying stellar multiplicities. In application to recent imaging searches for low-luminosity companions to nearby M dwarf stars, and for companions to young stars in nearby star-forming regions, our analyses reveal substantial uncertainty in estimates of stellar multiplicity. For binary stars with late-type dwarf companions, semimajor axes appear to be distributed approximately as a^(-1) for values ranging from about one to several thousand astronomical units. About one-quarter of the companions to field F and G dwarf stars have semimajor axes less than 1 AU, and about 15% lie beyond 1000 AU. The geometric efficiency (fraction of companions imaged onto the detector) of imaging searches is nearly independent of distances to program stars and orbital eccentricities, and varies only slowly with detector spatial limitations.
A New Index for the MMPI-2 Test for Detecting Dissimulation in Forensic Evaluations: A Pilot Study.
Martino, Vito; Grattagliano, Ignazio; Bosco, Andrea; Massaro, Ylenia; Lisi, Andrea; Campobasso, Filippo; Marchitelli, Maria Alessia; Catanesi, Roberto
2016-01-01
This pilot study is the starting point of a potentially broad research project aimed at identifying new strategies for assessing malingering during forensic evaluations. The forensic group comprised 67 males who were seeking some sort of certification (e.g., adoption, child custody, driver's license, issuance of gun permits, etc.); the nonforensic group comprised 62 healthy male volunteers. Each participant was administered the MMPI-2. Statistical analyses were conducted on the obtained scores of 48 MMPI-2 scales. In the first step, parametric statistics were adopted to identify the best combination of MMPI-2 scales that differentiated the two groups of participants. In the second step, frequency-based, nonparametric methods were used for diagnostic purposes. A model that utilized the best three predictors ("7-Pt", "L," and "1-Hs") was developed and used to calculate the Forensic Evaluation Dissimulation Index (FEDI), which features satisfactory diagnostic accuracy (0.9), sensitivity (0.82), specificity (0.81), and likelihood ratio indices (LR+ = 4.32; LR- = 0.22). © 2015 American Academy of Forensic Sciences.
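The accuracy, sensitivity, specificity, and likelihood-ratio indices reported for the FEDI all follow from a 2x2 classification table; a Python sketch with illustrative counts (not the study's raw data, which the abstract does not give):

```python
def diagnostic_indices(tp, fn, fp, tn):
    """Standard diagnostic indices from a 2x2 table of test result vs. truth."""
    sens = tp / (tp + fn)              # true positive rate
    spec = tn / (tn + fp)              # true negative rate
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "sensitivity": sens,
        "specificity": spec,
        "LR+": sens / (1 - spec),      # positive likelihood ratio
        "LR-": (1 - sens) / spec,      # negative likelihood ratio
    }

# Illustrative counts for 67 forensic and 62 comparison participants.
idx = diagnostic_indices(tp=55, fn=12, fp=12, tn=50)
```

With these hypothetical counts the indices land near the paper's reported values (sensitivity 0.82, specificity 0.81, LR+ around 4.2, LR- around 0.22), showing how the reported figures relate to one another.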
Genome-wide regression and prediction with the BGLR statistical package.
Pérez, Paulino; de los Campos, Gustavo
2014-10-01
Many modern genomic data analyses require implementing regressions where the number of parameters (p, e.g., the number of marker effects) exceeds sample size (n). Implementing these large-p-with-small-n regressions poses several statistical and computational challenges, some of which can be confronted using Bayesian methods. This approach allows integrating various parametric and nonparametric shrinkage and variable selection procedures in a unified and consistent manner. The BGLR R-package implements a large collection of Bayesian regression models, including parametric variable selection and shrinkage methods and semiparametric procedures (Bayesian reproducing kernel Hilbert spaces regressions, RKHS). The software was originally developed for genomic applications; however, the methods implemented are useful for many nongenomic applications as well. The response can be continuous (censored or not) or categorical (either binary or ordinal). The algorithm is based on a Gibbs sampler with scalar updates and the implementation takes advantage of efficient compiled C and Fortran routines. In this article we describe the methods implemented in BGLR, present examples of the use of the package, and discuss practical issues emerging in real-data analysis. Copyright © 2014 by the Genetics Society of America.
Nonparametric analysis of bivariate gap time with competing risks.
Huang, Chiung-Yu; Wang, Chenguang; Wang, Mei-Cheng
2016-09-01
This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrent disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall's tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article. © 2016, The International Biometric Society.
Updating estimates of low streamflow statistics to account for possible trends
NASA Astrophysics Data System (ADS)
Blum, A. G.; Archfield, S. A.; Hirsch, R. M.; Vogel, R. M.; Kiang, J. E.; Dudley, R. W.
2017-12-01
Given evidence of both increasing and decreasing trends in low flows in many streams, methods are needed to update estimators of low-flow statistics used in water resources management. One such metric is the 10-year annual low-flow statistic (7Q10), calculated as the annual minimum seven-day streamflow which is exceeded in nine out of ten years on average. Historical streamflow records may not be representative of current conditions at a site if environmental conditions are changing. We present a new approach to frequency estimation under nonstationary conditions that applies a stationary nonparametric quantile estimator to a subset of the annual minimum flow record. Monte Carlo simulation experiments were used to evaluate this approach across a range of trend and no-trend scenarios. Relative to the standard practice of using the entire available streamflow record, use of a nonparametric quantile estimator combined with selection of the most recent 30 or 50 years for 7Q10 estimation was found to improve accuracy and reduce bias. Benefits of data subset selection approaches were greater for higher-magnitude trends and for annual minimum flow records with lower coefficients of variation. A nonparametric trend test approach for subset selection did not significantly improve upon always selecting the last 30 years of record. At 174 stream gages in the Chesapeake Bay region, 7Q10 estimators based on the most recent 30 years of flow record were compared to estimators based on the entire period of record. Given the availability of long records of low streamflow, a subset of the flow record (about 30 years) can be used to update 7Q10 estimators to better reflect current streamflow conditions.
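The estimator described above, a stationary nonparametric quantile applied to the most recent subset of annual 7-day minima, can be sketched in Python. This version uses Weibull plotting positions with linear interpolation; the function names and data are illustrative, not the paper's exact estimator.

```python
def seven_day_min(daily_flows):
    """Annual minimum 7-day mean flow from one year of daily flows."""
    return min(sum(daily_flows[i:i + 7]) / 7 for i in range(len(daily_flows) - 6))

def weibull_quantile(xs, q):
    """Nonparametric quantile using Weibull plotting positions
    with linear interpolation between order statistics."""
    s = sorted(xs)
    n = len(s)
    h = q * (n + 1) - 1                      # 0-based fractional index
    if h <= 0:
        return s[0]
    if h >= n - 1:
        return s[-1]
    lo = int(h)
    return s[lo] + (h - lo) * (s[lo + 1] - s[lo])

def q7_10(annual_minima, window=30):
    """7Q10 estimate: the 0.10 quantile of the most recent `window`
    years of annual 7-day minimum flows."""
    return weibull_quantile(annual_minima[-window:], 0.10)
```

Because only the trailing `window` years enter the quantile, an older, wetter portion of the record (the paper's nonstationarity concern) has no influence on the updated estimate.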
Tangen, C M; Koch, G G
1999-03-01
In the randomized clinical trial setting, controlling for covariates is expected to produce variance reduction for the treatment parameter estimate and to adjust for random imbalances of covariates between the treatment groups. However, for the logistic regression model, variance reduction is not obviously obtained. This can lead to concerns about the assumptions of the logistic model. We introduce a complementary nonparametric method for covariate adjustment. It provides results that are usually compatible with expectations for analysis of covariance. The only assumptions required are based on randomization and sampling arguments. The resulting treatment parameter is an (unconditional) population average log-odds ratio that has been adjusted for random imbalance of covariates. Data from a randomized clinical trial are used to compare results from the traditional maximum likelihood logistic method with those from the nonparametric logistic method. We examine treatment parameter estimates, corresponding standard errors, and significance levels in models with and without covariate adjustment. In addition, we discuss differences between unconditional population average treatment parameters and conditional subpopulation average treatment parameters. Additional features of the nonparametric method, including stratified (multicenter) and multivariate (multivisit) analyses, are illustrated. Extensions of this methodology to the proportional odds model are also made.
ERIC Educational Resources Information Center
Yorke, Mantz
2017-01-01
When analysing course-level data by subgroups based upon some demographic characteristics, the numbers in analytical cells are often too small to allow inferences to be drawn that might help in the enhancement of practices. However, relatively simple analyses can provide useful pointers. This article draws upon a study involving a partnership with…
Guidelines for the design and statistical analysis of experiments in papers submitted to ATLA.
Festing, M F
2001-01-01
In vitro experiments need to be well designed and correctly analysed if they are to achieve their full potential to replace the use of animals in research. An "experiment" is a procedure for collecting scientific data in order to answer a hypothesis, or to provide material for generating new hypotheses, and differs from a survey because the scientist has control over the treatments that can be applied. Most experiments can be classified into one of a few formal designs, the most common being completely randomised, and randomised block designs. These are quite common with in vitro experiments, which are often replicated in time. Some experiments involve a single independent (treatment) variable, while other "factorial" designs simultaneously vary two or more independent variables, such as drug treatment and cell line. Factorial designs often provide additional information at little extra cost. Experiments need to be carefully planned to avoid bias, be powerful yet simple, provide for a valid statistical analysis and, in some cases, have a wide range of applicability. Virtually all experiments need some sort of statistical analysis in order to take account of biological variation among the experimental subjects. Parametric methods using the t test or analysis of variance are usually more powerful than non-parametric methods, provided the underlying assumptions of normality of the residuals and equal variances are approximately valid. The statistical analyses of data from a completely randomised design, and from a randomised-block design are demonstrated in Appendices 1 and 2, and methods of determining sample size are discussed in Appendix 3. Appendix 4 gives a checklist for authors submitting papers to ATLA.
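One common closed-form route to the sample-size question covered in the guidelines' Appendix 3 is the normal-approximation formula for comparing two means; a hedged sketch (this is the textbook approximation, not necessarily the specific method the ATLA guidelines prescribe):

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided two-sample t test:
    n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / delta)^2, rounded up."""
    z = NormalDist().inv_cdf
    return math.ceil(2 * ((z(1 - alpha / 2) + z(power)) * sd / delta) ** 2)
```

For a difference of one standard deviation the formula gives 16 per group, and for half a standard deviation it gives 63, illustrating how quickly the required n grows as the effect of interest shrinks.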
In a previously published study, quantitative relationships were developed between landscape metrics and sediment contamination for 25 small estuarine systems within Chesapeake Bay. Nonparametric statistical analysis (rank transformation) was used to develop an empirical relation...
Order-restricted inference for means with missing values.
Wang, Heng; Zhong, Ping-Shou
2017-09-01
Missing values appear very often in many applications, but the problem of missing values has not received much attention in testing order-restricted alternatives. Under the missing at random (MAR) assumption, we impute the missing values nonparametrically using kernel regression. For data with imputation, the classical likelihood ratio test designed for testing the order-restricted means is no longer applicable since the likelihood does not exist. This article proposes a novel method for constructing test statistics for assessing means with an increasing order or a decreasing order based on a jackknife empirical likelihood (JEL) ratio. It is shown that the JEL ratio statistic evaluated under the null hypothesis converges to a chi-bar-square distribution, whose weights depend on missing probabilities and nonparametric imputation. A simulation study shows that the proposed test performs well under various missing scenarios and is robust for normally and nonnormally distributed data. The proposed method is applied to an Alzheimer's disease neuroimaging initiative data set for finding a biomarker for the diagnosis of Alzheimer's disease. © 2017, The International Biometric Society.
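The kernel-regression imputation step described above can be sketched as a Nadaraya-Watson estimator with a Gaussian kernel and fixed bandwidth (all names are illustrative; the JEL test machinery is not reproduced here):

```python
import math

def nw_impute(x_obs, y_obs, x_new, bandwidth=1.0):
    """Nadaraya-Watson kernel regression (Gaussian kernel): estimate the
    missing responses at x_new as locally weighted means of observed y."""
    def k(u):
        return math.exp(-0.5 * u * u)
    est = []
    for x0 in x_new:
        w = [k((x - x0) / bandwidth) for x in x_obs]
        est.append(sum(wi * yi for wi, yi in zip(w, y_obs)) / sum(w))
    return est
```

At an interior point with symmetric neighbours and a linear trend the estimate is exact; near the boundary the local average is pulled inward, which is the usual kernel-regression boundary bias that bandwidth choice must trade off against.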
NASA Astrophysics Data System (ADS)
Romero, C.; McWilliam, M.; Macías-Pérez, J.-F.; Adam, R.; Ade, P.; André, P.; Aussel, H.; Beelen, A.; Benoît, A.; Bideaud, A.; Billot, N.; Bourrion, O.; Calvo, M.; Catalano, A.; Coiffard, G.; Comis, B.; de Petris, M.; Désert, F.-X.; Doyle, S.; Goupy, J.; Kramer, C.; Lagache, G.; Leclercq, S.; Lestrade, J.-F.; Mauskopf, P.; Mayet, F.; Monfardini, A.; Pascale, E.; Perotto, L.; Pisano, G.; Ponthieu, N.; Revéret, V.; Ritacco, A.; Roussel, H.; Ruppin, F.; Schuster, K.; Sievers, A.; Triqueneaux, S.; Tucker, C.; Zylka, R.
2018-04-01
Context: In the past decade, sensitive, resolved Sunyaev-Zel'dovich (SZ) studies of galaxy clusters have become common. Whereas many previous SZ studies have parameterized the pressure profiles of galaxy clusters, non-parametric reconstructions will provide insights into the thermodynamic state of the intracluster medium. Aims: We seek to recover the non-parametric pressure profiles of the high-redshift (z = 0.89) galaxy cluster CLJ 1226.9+3332 as inferred from SZ data from the MUSTANG, NIKA, Bolocam, and Planck instruments, which all probe different angular scales. Methods: Our non-parametric algorithm makes use of logarithmic interpolation, which under the assumption of ellipsoidal symmetry is analytically integrable. For MUSTANG, NIKA, and Bolocam we derive a non-parametric pressure profile independently and find good agreement among the instruments. In particular, we find that the non-parametric profiles are consistent with a fitted generalized Navarro-Frenk-White (gNFW) profile. Given the ability of Planck to constrain the total signal, we include a prior on the integrated Compton Y parameter as determined by Planck. Results: For a given instrument, constraints on the pressure profile diminish rapidly beyond the field of view. The overlap in spatial scales probed by these four datasets is therefore critical in checking for consistency between instruments. By using multiple instruments, our analysis of CLJ 1226.9+3332 covers a large radial range, from the central regions to the cluster outskirts: 0.05 R500 < r < 1.1 R500. This is a wider range of spatial scales than is typically recovered by SZ instruments. Similar analyses will be possible with the new generation of SZ instruments such as NIKA2 and MUSTANG2.
Statistical inference for tumor growth inhibition T/C ratio.
Wu, Jianrong
2010-09-01
The tumor growth inhibition T/C ratio is commonly used to quantify treatment effects in drug screening tumor xenograft experiments. The T/C ratio is converted to an antitumor activity rating using an arbitrary cutoff point and often without any formal statistical inference. Here, we applied a nonparametric bootstrap method and a small sample likelihood ratio statistic to make a statistical inference of the T/C ratio, including both hypothesis testing and a confidence interval estimate. Furthermore, sample size and power are also discussed for statistical design of tumor xenograft experiments. Tumor xenograft data from an actual experiment were analyzed to illustrate the application.
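A percentile-bootstrap version of the interval estimate for the T/C ratio described above can be sketched in Python (the tumour volumes are illustrative, and the paper's small-sample likelihood-ratio statistic is not reproduced):

```python
import random

def bootstrap_tc_ci(treated, control, n_boot=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the T/C ratio of
    mean tumour volumes (resampling each group with replacement)."""
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    ratios = sorted(
        mean([rng.choice(treated) for _ in treated])
        / mean([rng.choice(control) for _ in control])
        for _ in range(n_boot)
    )
    lo = ratios[int(alpha / 2 * n_boot)]
    hi = ratios[int((1 - alpha / 2) * n_boot) - 1]
    return mean(treated) / mean(control), (lo, hi)

treated = [2, 3, 4, 3, 2, 4, 3, 3]           # illustrative tumour volumes
control = [8, 9, 10, 9, 8, 10, 9, 9]
tc, (lo, hi) = bootstrap_tc_ci(treated, control)
```

Reporting the whole interval, rather than comparing the point estimate to an arbitrary cutoff, is exactly the formal-inference step the abstract argues is usually skipped in activity ratings.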
Proceedings of the Conference on the Design of Experiments (23rd)
1978-07-01
[Only fragments of the scanned proceedings survive. Recoverable details: the host of the twenty-third Design of Experiments Conference was the U.S. Army Combat Development Experimentation Command, Fort Ord, California; Prof. G. E. P. Box (University of Wisconsin) spoke on time series modelling; Dr. Churchill Eisenhart was that year's recipient of the Samuel S. Wilks Memorial Award; a cited reference is Duran, B. S. (1976), "A survey of nonparametric tests for scale," Communications in Statistics, A5, 1287-...]
Juenger, Hendrik; Kuhnke, Nicola; Braun, Christoph; Ummenhofer, Frank; Wilke, Marko; Walther, Michael; Koerte, Inga; Delvendahl, Igor; Jung, Nikolai H; Berweck, Steffen; Staudt, Martin; Mall, Volker
2013-10-01
Early unilateral brain lesions can lead to a persistence of ipsilateral corticospinal projections from the contralesional hemisphere, which can enable the contralesional hemisphere to exert motor control over the paretic hand. In contrast to the primary motor representation (M1), the primary somatosensory representation (S1) of the paretic hand always remains in the lesioned hemisphere. Here, we report on differences in exercise-induced neuroplasticity between individuals with such ipsilateral motor projections (ipsi) and individuals with early unilateral lesions but 'healthy' contralateral motor projections (contra). Sixteen children and young adults with congenital hemiparesis participated in the study (contralateral [Contra] group: n=7, four females, three males; age range 10-30y, median age 16y; ipsilateral [Ipsi] group: n=9, four females, five males; age range 11-31y, median age 12y; Manual Ability Classification System levels I to II in all individuals in both groups). The participants underwent a 12-day intervention of constraint-induced movement therapy (CIMT), consisting of individual training (2h/d) and group training (8h/d). Before and after CIMT, hand function was tested using the Wolf Motor Function Test (WMFT) and diverging neuroplastic effects were observed by transcranial magnetic stimulation (TMS), functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). Statistical analysis of TMS data was performed using the non-parametric Wilcoxon signed-rank test for pair-wise comparison; for fMRI standard statistical parametric and non-parametric mapping (SPM5, SnPM3) procedures (first level/second level) were carried out. Statistical analyses of MEG data involved analyses of variance (ANOVA) and t-tests. While MEG demonstrated a significant increase in S1 activation in both groups (p=0.012), TMS showed a decrease in M1 excitability in the Ipsi group (p=0.036), but an increase in M1 excitability in the Contra group (p=0.043). 
Similarly, fMRI showed a decrease in M1 activation in the Ipsi group, but an increase in activation in the M1-S1 region in the Contra group (for both groups p<0.001 [SnPM3] within the search volume). Different patterns of sensorimotor (re)organization in individuals with early unilateral lesions show, on a cortical level, different patterns of exercise-induced neuroplasticity. The findings help to improve the understanding of the general principles of sensorimotor learning and will help to develop more specific therapies for different pathologies in congenital hemiparesis. © 2013 Mac Keith Press.
Treatment of Selective Mutism: A Best-Evidence Synthesis.
ERIC Educational Resources Information Center
Stone, Beth Pionek; Kratochwill, Thomas R.; Sladezcek, Ingrid; Serlin, Ronald C.
2002-01-01
Presents systematic analysis of the major treatment approaches used for selective mutism. Based on nonparametric statistical tests of effect sizes, major findings include the following: treatment of selective mutism is more effective than no treatment; behaviorally oriented treatment approaches are more effective than no treatment; and no…
2008-08-01
[Only fragments of this demonstration report survive, including the demonstrator's field personnel (geophysicists Craig Hyslop, John Jacobsen, and Rob Mehl) and a recoverable citation: Practical Nonparametric Statistics, W. J. Conover, John Wiley & Sons, 1980, pages 144-151.]
Cox, Tony; Popken, Douglas; Ricci, Paolo F
2013-01-01
Exposures to fine particulate matter (PM2.5) in air (C) have been suspected of contributing causally to increased acute (e.g., same-day or next-day) human mortality rates (R). We tested this causal hypothesis in 100 United States cities using the publicly available NMMAPS database. Although a significant, approximately linear, statistical C-R association exists in simple statistical models, closer analysis suggests that it is not causal. Surprisingly, conditioning on other variables that have been extensively considered in previous analyses (usually using splines or other smoothers to approximate their effects), such as month of the year and mean daily temperature, suggests that they create strong, nonlinear confounding that explains the statistical association between PM2.5 and mortality rates in this data set. As this finding disagrees with conventional wisdom, we apply several different techniques to examine it. Conditional independence tests for potential causation, non-parametric classification tree analysis, Bayesian Model Averaging (BMA), and Granger-Sims causality testing, show no evidence that PM2.5 concentrations have any causal impact on increasing mortality rates. This apparent absence of a causal C-R relation, despite their statistical association, has potentially important implications for managing and communicating the uncertain health risks associated with, but not necessarily caused by, PM2.5 exposures. PMID:23983662
Bardhan, Karna D; Cullis, James; Williams, Nigel R; Arasaradnam, Ramesh P; Wilson, Adrian J
2016-01-01
The visibility of the colon in positron emission tomography (PET) scans of patients without gastrointestinal disease indicating the presence of 18F Fluorodeoxyglucose (18FDG) is well recognised, but unquantified and unexplained. In this paper a qualitative scoring system was applied to PET scans from 30 randomly selected patients without gastrointestinal disease to detect the presence of 18FDG in 4 different sections of the colon and then both the total pixel value and the pixel value per unit length of each section of the colon were determined to quantify the amount of 18FDG from a randomly selected subset of 10 of these patients. Analysis of the qualitative scores using a non-parametric ANOVA showed that all sections of the colon contained 18FDG but there were differences in the amount of 18FDG present between sections (p<0.05). Wilcoxon matched-pair signed-rank tests between pairs of segments showed statistically significant differences between all pairs (p<0.05) with the exception of the caecum and ascending colon and the descending colon. The same non-parametric statistical analysis of the quantitative measures showed no difference in the total amount of 18FDG between sections (p>0.05), but a difference in the amount/unit length between sections (p<0.01) with only the caecum and ascending colon and the descending colon having a statistically significant difference (p<0.05). These results are consistent since the eye is drawn to focal localisation of the 18FDG when qualitatively scoring the scans. The presence of 18FDG in the colon is counterintuitive since it must be passing from the blood to the lumen through the colonic wall. There is no active mechanism to achieve this and therefore we hypothesise that the transport is a passive process driven by the concentration gradient of 18FDG across the colonic wall. This hypothesis is consistent with the results obtained from the qualitative and quantitative measures analysed.
Kang, Le; Chen, Weijie; Petrick, Nicholas A; Gallas, Brandon D
2015-02-20
The area under the receiver operating characteristic curve is often used as a summary index of the diagnostic ability in evaluating biomarkers when the clinical outcome (truth) is binary. When the clinical outcome is right-censored survival time, the C index, motivated as an extension of area under the receiver operating characteristic curve, has been proposed by Harrell as a measure of concordance between a predictive biomarker and the right-censored survival outcome. In this work, we investigate methods for statistical comparison of two diagnostic or predictive systems, which could be either two biomarkers or two fixed algorithms, in terms of their C indices. We adopt a U-statistics-based C estimator that is asymptotically normal and develop a nonparametric analytical approach to estimate the variance of the C estimator and the covariance of two C estimators. A z-score test is then constructed to compare the two C indices. We validate our one-shot nonparametric method via simulation studies in terms of the type I error rate and power. We also compare our one-shot method with resampling methods including the jackknife and the bootstrap. Simulation results show that the proposed one-shot method provides almost unbiased variance estimations and has satisfactory type I error control and power. Finally, we illustrate the use of the proposed method with an example from the Framingham Heart Study. Copyright © 2014 John Wiley & Sons, Ltd.
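The concordance fraction at the heart of the C index can be sketched in a few lines of Python. This is a toy illustration with hypothetical data, not the paper's estimator: it computes Harrell's C by brute force over usable pairs and omits the U-statistic variance estimation and z-score test that the paper develops.

```python
def harrell_c(times, events, marker):
    """Harrell's C: fraction of usable pairs in which the subject with the
    higher marker value also has the shorter observed survival time.
    A pair (i, j) is usable only if subject i's event time is observed
    (events[i] == 1) and precedes subject j's time."""
    num = den = 0.0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                den += 1
                if marker[i] > marker[j]:
                    num += 1          # concordant pair
                elif marker[i] == marker[j]:
                    num += 0.5        # tied marker counts half
    return num / den

# Hypothetical data: marker1 tracks risk perfectly, marker2 is reversed.
times = [2, 4, 6, 8, 10]
events = [1, 1, 1, 1, 0]            # last subject censored
marker1 = [9, 7, 5, 3, 1]           # high marker, early death
marker2 = [1, 3, 5, 7, 9]

c1 = harrell_c(times, events, marker1)   # 1.0
c2 = harrell_c(times, events, marker2)   # 0.0
```

With marker1 perfectly tracking risk the C index is 1.0, the reversed marker gives 0.0, and an uninformative marker would sit near 0.5.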
Crema, Enrico R; Habu, Junko; Kobayashi, Kenichi; Madella, Marco
2016-01-01
Recent advances in the use of summed probability distribution (SPD) of calibrated 14C dates have opened new possibilities for studying prehistoric demography. The degree of correlation between climate change and population dynamics can now be accurately quantified, and divergences in the demographic history of distinct geographic areas can be statistically assessed. Here we contribute to this research agenda by reconstructing the prehistoric population change of Jomon hunter-gatherers between 7,000 and 3,000 cal BP. We collected 1,433 14C dates from three different regions in Eastern Japan (Kanto, Aomori and Hokkaido) and established that the observed fluctuations in the SPDs were statistically significant. We also introduced a new non-parametric permutation test for comparing multiple sets of SPDs that highlights points of divergence in the population history of different geographic regions. Our analyses indicate a general rise-and-fall pattern shared by the three regions but also some key regional differences during the 6th millennium cal BP. The results confirm some of the patterns suggested by previous archaeological studies based on house and site counts but offer statistical significance and an absolute chronological framework that will enable future studies aiming to establish potential correlation with climatic changes.
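The permutation logic behind such a comparison can be sketched with standard-library Python. This is a drastic simplification with made-up dates: a real SPD comparison permutes dates between regions and rebuilds whole calibrated summed curves, whereas this sketch only shuffles region labels and compares mean cal BP between two hypothetical samples.

```python
import random

def perm_test(dates_a, dates_b, n_perm=2000, seed=1):
    """Permutation test on the difference in mean date between two
    regional sets of dates: shuffle the pooled dates, resplit, and count
    how often the shuffled difference reaches the observed one."""
    observed = abs(sum(dates_a) / len(dates_a) - sum(dates_b) / len(dates_b))
    pooled = dates_a + dates_b
    n_a = len(dates_a)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:n_a], pooled[n_a:]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)    # permutation p-value, never zero

# Hypothetical cal BP dates for two regions with clearly different histories.
dates_region1 = [5200, 5100, 5050, 4900, 4800, 4700]
dates_region2 = [6200, 6100, 6000, 5900, 5800, 5700]
p = perm_test(dates_region1, dates_region2)
```

Because the two made-up samples do not overlap at all, the shuffled differences almost never reach the observed one and the p-value comes out very small.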
Effect of crowd size on patient volume at a large, multipurpose, indoor stadium.
De Lorenzo, R A; Gray, B C; Bennett, P C; Lamparella, V J
1989-01-01
A prediction of patient volume expected at "mass gatherings" is desirable in order to provide optimal on-site emergency medical care. While several methods of predicting patient loads have been suggested, a reliable technique has not been established. This study examines the frequency of medical emergencies at the Syracuse University Carrier Dome, a 50,500-seat indoor stadium. Patient volume and level of care at collegiate basketball and football games, as well as rock concerts, over a 7-year period were examined and tabulated. This information was analyzed using simple regression and nonparametric statistical methods to determine the level of correlation between crowd size and patient volume. These analyses demonstrated no statistically significant increase in patient volume for increasing crowd size for basketball and football events. There was a small but statistically significant increase in patient volume for increasing crowd size for concerts. A comparison of similar crowd sizes for each of the three events showed that patient frequency is greatest for concerts and smallest for basketball. The study suggests that crowd size alone has only a minor influence on patient volume at any given event. Structuring medical services based solely on expected crowd size and not considering other influences such as event type and duration may give poor results.
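A rank-based correlation is one way to relate crowd size to patient volume without distributional assumptions. The sketch below computes Spearman's rho from scratch on hypothetical event data; it is one plausible nonparametric choice, not necessarily the statistic the authors used.

```python
def ranks(xs):
    """1-based ranks with ties replaced by their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1                       # extend over a run of ties
        avg = (i + j) / 2 + 1            # average rank of the tie run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

crowd = [18000, 25000, 31000, 40000, 48000]   # hypothetical attendance
patients = [3, 4, 3, 6, 5]                    # hypothetical patient counts
rho = spearman(crowd, patients)
```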
Assessment of Communications-related Admissions Criteria in a Three-year Pharmacy Program
Tejada, Frederick R.; Lang, Lynn A.; Purnell, Miriam; Acedera, Lisa; Ngonga, Ferdinand
2015-01-01
Objective. To determine if there is a correlation between TOEFL and other admissions criteria that assess communications skills (ie, PCAT variables: verbal, reading, essay, and composite), interview, and observational scores and to evaluate TOEFL and these admissions criteria as predictors of academic performance. Methods. Statistical analyses included two sample t tests, multiple regression and Pearson’s correlations for parametric variables, and Mann-Whitney U for nonparametric variables, which were conducted on the retrospective data of 162 students, 57 of whom were foreign-born. Results. The multiple regression model of the other admissions criteria on TOEFL was significant. There was no significant correlation between TOEFL scores and academic performance. However, significant correlations were found between the other admissions criteria and academic performance. Conclusion. Since TOEFL is not a significant predictor of either communication skills or academic success of foreign-born PharmD students in the program, it may be eliminated as an admissions criterion. PMID:26430273
Oostenveld, Robert; Fries, Pascal; Maris, Eric; Schoffelen, Jan-Mathijs
2011-01-01
This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates the reuse in other software packages. PMID:21253357
Tsai, Jack; Rosenheck, Robert A
2015-06-01
There has long been concern that public support payments are used to support addictive behaviors. This study examined the amount of money homeless veterans spend on alcohol and drugs and the association between public support income, including VA disability compensation, and expenditures on alcohol and drugs. Data were from 1,160 veterans from 19 sites on entry into the Housing and Urban Development-Veterans Affairs Supportive Housing program. Descriptive statistics and nonparametric analyses were conducted. About 33% of veterans reported spending money on alcohol and 22% reported spending money on drugs in the past month. No significant association was found between public support income, VA disability compensation, and money spent on alcohol and drugs. A substantial proportion of homeless veterans spend some income on alcohol and drugs, but disability income, including VA compensation, does not seem to be related to substance use or money spent on addictive substances.
Mothers' physical interventions in toddler play in a low-income, African American sample.
Ispa, Jean M; Claire Cook, J; Harmeyer, Erin; Rudy, Duane
2015-11-01
This mixed method study examined 28 low-income African American mothers' physical interventions in their 14-month-old toddlers' play. Inductive methods were used to identify six physical intervention behaviors, the affect accompanying physical interventions, and apparent reasons for intervening. Nonparametric statistical analyses determined that toddlers experienced physical intervention largely in the context of positive maternal affect. Mothers of boys expressed highly positive affect while physically intervening more than mothers of girls. Most physically intervening acts seemed to be motivated by maternal intent to show or tell children how to play or to correct play deemed incorrect. Neutral affect was the most common toddler affect type following physical intervention, but boys were more likely than girls to be upset immediately after physical interventions. Physical interventions intended to protect health and safety seemed the least likely to elicit toddler upset. Copyright © 2015 Elsevier Inc. All rights reserved.
It's time to move on from the bell curve.
Robinson, Lawrence R
2017-11-01
The bell curve was first described in the 18th century by de Moivre and Gauss to depict the distribution of binomial events, such as coin tossing, or repeated measures of physical objects. In the 19th and 20th centuries, the bell curve was appropriated, or perhaps misappropriated, to apply to biologic and social measures across people. For many years we used it to derive reference values for our electrophysiologic studies. There is, however, no reason to believe that electrophysiologic measures should approximate a bell-curve distribution, and empiric evidence suggests they do not. The concept of using mean ± 2 standard deviations should be abandoned. Reference values are best derived by using non-parametric analyses, such as percentile values. This proposal aligns with the recommendation of the recent normative data task force of the American Association of Neuromuscular & Electrodiagnostic Medicine and follows sound statistical principles. Muscle Nerve 56: 859-860, 2017. © 2017 Wiley Periodicals, Inc.
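The percentile approach the author recommends is easy to demonstrate. The sketch below draws skewed, strictly positive values (standing in for electrophysiologic measures) and contrasts mean ± 2 SD limits with 2.5th/97.5th percentile limits; for skewed data the parametric lower limit typically lands at an impossible negative value.

```python
import random
import statistics

def percentile(sorted_xs, p):
    """Linear-interpolation percentile of a pre-sorted list, p in [0, 100]."""
    k = (len(sorted_xs) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(sorted_xs) - 1)
    return sorted_xs[lo] + (sorted_xs[hi] - sorted_xs[lo]) * (k - lo)

rng = random.Random(0)
# Skewed, strictly positive sample (lognormal), as many biologic measures are.
values = sorted(rng.lognormvariate(1.0, 0.5) for _ in range(1000))

mean = statistics.fmean(values)
sd = statistics.stdev(values)
parametric = (mean - 2 * sd, mean + 2 * sd)               # mean +/- 2 SD
nonparametric = (percentile(values, 2.5), percentile(values, 97.5))
```

The nonparametric limits always lie inside the observed data range, while the mean minus 2 SD limit can fall below zero even though no value ever does.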
The Evolution of Your Success Lies at the Centre of Your Co-Authorship Network
Servia-Rodríguez, Sandra; Noulas, Anastasios; Mascolo, Cecilia; Fernández-Vilas, Ana; Díaz-Redondo, Rebeca P.
2015-01-01
Collaboration among scholars and institutions is progressively becoming essential to the success of research grant procurement and to allow the emergence and evolution of scientific disciplines. Our work focuses on analysing whether the volume of collaborations of one author, together with the relevance of his collaborators, is related to his research performance over time. In order to prove this relation we collected the temporal distributions of scholars’ publications and citations from the Google Scholar platform and the co-authorship network (of Computer Scientists) underlying the well-known DBLP bibliographic database. By the application of time series clustering, social network analysis and non-parametric statistics, we observe that scholars with similar publication (citation) patterns also tend to have a similar centrality in the co-authorship network. To our knowledge, this is the first work that considers success evolution with respect to co-authorship. PMID:25760732
Describing spatial pattern in stream networks: A practical approach
Ganio, L.M.; Torgersen, C.E.; Gresswell, R.E.
2005-01-01
The shape and configuration of branched networks influence ecological patterns and processes. Recent investigations of network influences in riverine ecology stress the need to quantify spatial structure not only in a two-dimensional plane, but also in networks. An initial step in understanding data from stream networks is discerning non-random patterns along the network. On the other hand, data collected in the network may be spatially autocorrelated and thus not suitable for traditional statistical analyses. Here we provide a method that uses commercially available software to construct an empirical variogram to describe spatial pattern in the relative abundance of coastal cutthroat trout in headwater stream networks. We describe the mathematical and practical considerations involved in calculating a variogram using a non-Euclidean distance metric to incorporate the network pathway structure in the analysis of spatial variability, and use a non-parametric technique to ascertain if the pattern in the empirical variogram is non-random.
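The empirical variogram itself requires nothing exotic once a pairwise distance matrix is available. This sketch uses hypothetical abundance values and network distances; the key point of the method is that the same estimator works whether `dist` holds straight-line distances or distances measured along the stream network.

```python
def empirical_variogram(values, dist, lags):
    """Empirical semivariogram: for each lag bin (lo, hi], average
    0.5 * (z_i - z_j)^2 over all site pairs whose pairwise distance
    falls in the bin. dist[i][j] may be any symmetric distance metric,
    e.g. pathway distance along a stream network."""
    gamma = []
    n = len(values)
    for lo, hi in lags:
        sq = [0.5 * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(i + 1, n)
              if lo < dist[i][j] <= hi]
        gamma.append(sum(sq) / len(sq) if sq else None)
    return gamma

# Hypothetical trout relative abundance at 4 sites, with along-network
# distances in metres between them.
z = [0.9, 0.8, 0.4, 0.3]
d = [[0, 100, 300, 400],
     [100, 0, 200, 300],
     [300, 200, 0, 100],
     [400, 300, 100, 0]]
gamma = empirical_variogram(z, d, [(0, 150), (150, 350), (350, 500)])
```

In this toy case semivariance rises with lag distance, the classic signature of spatial autocorrelation: nearby sites are more alike than distant ones.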
Sarkar, Rajarshi
2013-08-23
Although TSH measurement by electrochemiluminescence immunoassay has become commonplace in India, significant discrepancy has been observed on interpretation of the test results when the manufacturer supplied biological reference interval (BRI) criteria were applied. This report determined whether the manufacturer's BRI (Roche Cobas) is transferable to the Indian population. Three hundred seventy-eight age- and sex-matched healthy subjects were selected from an urban Indian population. TSH reference measurements were acquired, and the reference data were statistically analysed. BRI of the Indian urban reference population was determined by non-parametric means. BRI was found to be 1.134 to 7.280μIU/ml. BRI thus calculated was found to be significantly different from that mentioned by the manufacturer (0.27 to 4.20μIU/ml), which, needless to mention, has profound clinical implications in this part of the globe. Copyright © 2013 Elsevier B.V. All rights reserved.
Maity, Arnab; Carroll, Raymond J; Mammen, Enno; Chatterjee, Nilanjan
2009-01-01
Motivated by the problem of testing for genetic effects on complex traits in the presence of gene-environment interaction, we develop score tests in general semiparametric regression problems that involve a Tukey-style 1-degree-of-freedom form of interaction between parametrically and non-parametrically modelled covariates. We find that the score test in this type of model, as recently developed by Chatterjee and co-workers in the fully parametric setting, is biased and requires undersmoothing to be valid in the presence of non-parametric components. Moreover, in the presence of repeated outcomes, the asymptotic distribution of the score test depends on the estimation of functions which are defined as solutions of integral equations, making implementation difficult and computationally taxing. We develop profiled score statistics which are unbiased and asymptotically efficient and can be performed by using standard bandwidth selection methods. In addition, to overcome the difficulty of solving functional equations, we give easy interpretations of the target functions, which in turn allow us to develop estimation procedures that can be easily implemented by using standard computational methods. We present simulation studies to evaluate type I error and power of the method proposed compared with a naive test that does not consider interaction. Finally, we illustrate our methodology by analysing data from a case-control study of colorectal adenoma that was designed to investigate the association between colorectal adenoma and the candidate gene NAT2 in relation to smoking history.
A Deterministic Annealing Approach to Clustering AIRS Data
NASA Technical Reports Server (NTRS)
Guillaume, Alexandre; Braverman, Amy; Ruzmaikin, Alexander
2012-01-01
We will examine the validity of means and standard deviations as a basis for climate data products. We will explore the conditions under which these two simple statistics are inadequate summaries of the underlying empirical probability distributions by contrasting them with a nonparametric clustering method, the Deterministic Annealing technique.
Computer Games: Increase Learning in an Interactive Multidisciplinary Environment.
ERIC Educational Resources Information Center
Betz, Joseph A.
1996-01-01
Discusses the educational uses of computer games and simulations and describes a study conducted at the State University of New York College at Farmingdale that used the computer game "Sim City 2000." Highlights include whole systems learning, problem solving, student performance, nonparametric statistics, and treatment of experimental…
2008-09-01
DEMONSTRATOR’S FIELD PERSONNEL Geophysicist: Craig Hyslop Geophysicist: John Jacobsen Geophysicist: Rob Mehl 3.7 DEMONSTRATOR’S FIELD SURVEYING...Yuma Proving Ground Soil Survey Report, May 2003. 5. Practical Nonparametric Statistics, W.J. Conover, John Wiley & Sons, 1980 , pages 144 through
Exploring Rating Quality in Rater-Mediated Assessments Using Mokken Scale Analysis
ERIC Educational Resources Information Center
Wind, Stefanie A.; Engelhard, George, Jr.
2016-01-01
Mokken scale analysis is a probabilistic nonparametric approach that offers statistical and graphical tools for evaluating the quality of social science measurement without placing potentially inappropriate restrictions on the structure of a data set. In particular, Mokken scaling provides a useful method for evaluating important measurement…
Quality Improvement: Does the Air Force Systems Command Practice What It Preaches
1990-03-01
without his assistance in getting supplies, computers, and plotters. Another special thanks goes to my committee chairman, Dr Stephen Blank, who provided...N.J.: Prentice-Hall, 1986), 166. 5. Ibid., 181. 6. Sidney Siegel, Nonparametric Statistics for the Behavioral Sciences (New York: McGraw-Hill, 1956
Macmillan, N A; Creelman, C D
1996-06-01
Can accuracy and response bias in two-stimulus, two-response recognition or detection experiments be measured nonparametrically? Pollack and Norman (1964) answered this question affirmatively for sensitivity, Hodos (1970) for bias: Both proposed measures based on triangular areas in receiver-operating characteristic space. Their papers, and especially a paper by Grier (1971) that provided computing formulas for the measures, continue to be heavily cited in a wide range of content areas. In our sample of articles, most authors described triangle-based measures as making fewer assumptions than measures associated with detection theory. However, we show that statistics based on products or ratios of right triangle areas, including a recently proposed bias index and a not-yet-proposed but apparently plausible sensitivity index, are consistent with a decision process based on logistic distributions. Even the Pollack and Norman measure, which is based on non-right triangles, is approximately logistic for low values of sensitivity. Simple geometric models for sensitivity and bias are not nonparametric, even if their implications are not acknowledged in the defining publications.
Nonparametric evaluation of birth cohort trends in disease rates.
Tarone, R E; Chu, K C
2000-01-01
Although interpretation of age-period-cohort analyses is complicated by the non-identifiability of maximum likelihood estimates, changes in the slope of the birth-cohort effect curve are identifiable and have potential aetiologic significance. A nonparametric test for a change in the slope of the birth-cohort trend has been developed. The test is a generalisation of the sign test and is based on permutational distributions. A method for identifying interactions between age and calendar-period effects is also presented. The nonparametric method is shown to be powerful in detecting changes in the slope of the birth-cohort trend, although its power can be reduced considerably by calendar-period patterns of risk. The method identifies a previously unidentified decrease in the birth-cohort risk of lung-cancer mortality from 1912 to 1919, which appears to reflect a reduction in the initiation of smoking by young men at the beginning of the Great Depression (1930s). The method also detects an interaction between age and calendar period in leukemia mortality rates, reflecting the better response of children to chemotherapy. The proposed nonparametric method provides a data analytic approach, which is a useful adjunct to log-linear Poisson analysis of age-period-cohort models, either in the initial model building stage, or in the final interpretation stage.
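A much-simplified stand-in for such a test can be built from first differences and permutation: compare the mean slope before and after a candidate change point, then shuffle the differences to form a null distribution. This toy version on made-up cohort rates is not the paper's generalised sign test, which works with signs of the differences and accounts for calendar-period effects.

```python
import random

def slope_change_p(rates, k, n_perm=2000, seed=2):
    """Permutation test for a slope change after cohort k: compares the
    mean first difference before vs after k, then shuffles the
    differences to see how often chance produces as large a contrast."""
    diffs = [b - a for a, b in zip(rates, rates[1:])]
    obs = abs(sum(diffs[k:]) / len(diffs[k:]) - sum(diffs[:k]) / k)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(diffs)
        stat = abs(sum(diffs[k:]) / len(diffs[k:]) - sum(diffs[:k]) / k)
        if stat >= obs:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical birth-cohort rates: a rising trend that flattens after cohort 5.
rates = [10, 12, 14, 16, 18, 20, 20.5, 21, 21.5, 22, 22.5]
p = slope_change_p(rates, k=5)
```

The abrupt flattening of the made-up trend yields a small p-value, flagging cohort 5 as a credible change point.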
[Do we always correctly interpret the results of statistical nonparametric tests].
Moczko, Jerzy A
2014-01-01
Mann-Whitney, Wilcoxon, Kruskal-Wallis and Friedman tests form a group of tests commonly used to analyze the results of clinical and laboratory data. These tests are considered to be extremely flexible and their asymptotic relative efficiency exceeds 95 percent. Compared with the corresponding parametric tests, they do not require checking the fulfillment of conditions such as normality of the data distribution, homogeneity of variance, the lack of correlation between means and standard deviations, etc. They can be used with both interval and ordinal scales. Using the Mann-Whitney test as an example, the article shows that treating these four nonparametric tests as a kind of gold standard does not in every case lead to correct inference.
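For concreteness, the Mann-Whitney U statistic itself is just a count over pairs, which a few lines of Python make explicit (hypothetical samples; a full test would then convert U to a p-value via its permutation or normal-approximation distribution):

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U for sample x against sample y: the number of
    (xi, yj) pairs with xi > yj, counting exact ties as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1
            elif xi == yj:
                u += 0.5
    return u

x = [1.2, 2.3, 3.1, 4.8]    # hypothetical measurements, group 1
y = [0.9, 1.5, 2.0, 2.2]    # hypothetical measurements, group 2
u = mann_whitney_u(x, y)    # 13.0 of a possible len(x) * len(y) = 16
```

Note the identity mann_whitney_u(x, y) + mann_whitney_u(y, x) == len(x) * len(y), a quick sanity check on any implementation.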
Nonparametric Methods in Astronomy: Think, Regress, Observe—Pick Any Three
NASA Astrophysics Data System (ADS)
Steinhardt, Charles L.; Jermyn, Adam S.
2018-02-01
Telescopes are much more expensive than astronomers, so it is essential to minimize required sample sizes by using the most data-efficient statistical methods possible. However, the most commonly used model-independent techniques for finding the relationship between two variables in astronomy are flawed. In the worst case they can lead without warning to subtly yet catastrophically wrong results, and even in the best case they require more data than necessary. Unfortunately, there is no single best technique for nonparametric regression. Instead, we provide a guide for how astronomers can choose the best method for their specific problem and provide a python library with both wrappers for the most useful existing algorithms and implementations of two new algorithms developed here.
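As a minimal example of what "nonparametric regression" means in practice, here is a Nadaraya-Watson kernel smoother, one classic model-independent estimator. It is for illustration only and is not one of the algorithms implemented in the paper's library.

```python
import math

def nadaraya_watson(x_train, y_train, x, bandwidth):
    """Nadaraya-Watson estimate at x: a locally weighted mean of y_train,
    with Gaussian weights that decay with distance from x."""
    w = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in x_train]
    return sum(wi * yi for wi, yi in zip(w, y_train)) / sum(w)

# Noiseless quadratic toy data on a regular grid.
xs = [i / 10 for i in range(21)]        # 0.0 .. 2.0
ys = [x * x for x in xs]
est = nadaraya_watson(xs, ys, 1.0, bandwidth=0.1)
```

With no assumed functional form, the estimate at x = 1.0 lands close to the true value 1.0; the small upward bias is the usual kernel-smoothing bias in regions of curvature, which shrinks with the bandwidth.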
Statistical Techniques to Analyze Pesticide Data Program Food Residue Observations.
Szarka, Arpad Z; Hayworth, Carol G; Ramanarayanan, Tharacad S; Joseph, Robert S I
2018-06-26
The U.S. EPA conducts dietary-risk assessments to ensure that levels of pesticides on food in the U.S. food supply are safe. Often these assessments utilize conservative residue estimates, maximum residue levels (MRLs), and a high-end estimate derived from registrant-generated field-trial data sets. A more realistic estimate of consumers' pesticide exposure from food may be obtained by utilizing residues from food-monitoring programs, such as the Pesticide Data Program (PDP) of the U.S. Department of Agriculture. A substantial portion of food-residue concentrations in PDP monitoring programs are below the limits of detection (left-censored), which makes the comparison of regulatory-field-trial and PDP residue levels difficult. In this paper, we present a novel adaptation of established statistical techniques, the Kaplan-Meier estimator (K-M), robust regression on order statistics (ROS), and the maximum-likelihood estimator (MLE), to quantify the pesticide-residue concentrations in the presence of heavily censored data sets. The examined statistical approaches include the most commonly used parametric and nonparametric methods for handling left-censored data that have been used in the fields of medical and environmental sciences. This work presents a case study in which data of thiamethoxam residue on bell pepper generated from registrant field trials were compared with PDP-monitoring residue values. The results from the statistical techniques were evaluated and compared with commonly used simple substitution methods for the determination of summary statistics. It was found that the maximum-likelihood estimator (MLE) is the most appropriate statistical method to analyze this residue data set. Using the MLE technique, the data analyses showed that the median and mean PDP bell pepper residue levels were approximately 19 and 7 times lower, respectively, than the corresponding statistics of the field-trial residues.
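The MLE idea for left-censored residues can be sketched compactly: detected values contribute a lognormal density term, and nondetects contribute the probability of falling below the limit of detection. The sketch below uses hypothetical residue values and a crude grid search in place of a proper optimiser; it is not the authors' implementation.

```python
import math

SQRT2 = math.sqrt(2.0)
LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)

def norm_logcdf(z):
    """log of the standard normal CDF, via the complementary error function."""
    return math.log(0.5 * math.erfc(-z / SQRT2))

def loglik(mu, sigma, detects, n_nondetect, lod):
    """Lognormal log-likelihood with left-censoring: each detect adds a
    density term, each nondetect adds log P(X < lod)."""
    ll = 0.0
    for x in detects:
        z = (math.log(x) - mu) / sigma
        ll += -math.log(x * sigma) - LOG_SQRT_2PI - 0.5 * z * z
    ll += n_nondetect * norm_logcdf((math.log(lod) - mu) / sigma)
    return ll

def fit_censored_lognormal(detects, n_nondetect, lod):
    """Crude grid-search MLE over (mu, sigma), standing in for a real
    optimiser; fine enough to illustrate the estimator's behaviour."""
    best = (-math.inf, 0.0, 0.0)
    for i in range(-100, 101):              # mu in [-2, 2], step 0.02
        for j in range(5, 101):             # sigma in [0.1, 2], step 0.02
            mu, sigma = i / 50, j / 50
            ll = loglik(mu, sigma, detects, n_nondetect, lod)
            if ll > best[0]:
                best = (ll, mu, sigma)
    return best[1], best[2]

# Hypothetical residue values (same units as the LOD); 3 samples were
# below the limit of detection of 0.5 and are only known to be < 0.5.
detects = [0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9,
           0.95, 1.0, 1.1, 1.2, 1.4, 1.6, 2.0]
mu_hat, sigma_hat = fit_censored_lognormal(detects, 3, lod=0.5)
```

Unlike simple substitution (e.g. replacing nondetects with LOD/2), the likelihood uses exactly what is known about each nondetect, which is why the paper finds MLE the most appropriate choice for heavily censored data.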
A support vector machine based test for incongruence between sets of trees in tree space
2012-01-01
Background The increased use of multi-locus data sets for phylogenetic reconstruction has increased the need to determine whether a set of gene trees deviates significantly from the phylogenetic patterns of other genes. Such unusual gene trees may have been influenced by other evolutionary processes such as selection, gene duplication, or horizontal gene transfer. Results Motivated by this problem, we propose a nonparametric goodness-of-fit test for two empirical distributions of gene trees, and we developed the software GeneOut to estimate a p-value for the test. Our approach maps trees into a multi-dimensional vector space and then applies support vector machines (SVMs) to measure the separation between two sets of pre-defined trees. We use a permutation test to assess the significance of the SVM separation. To demonstrate the performance of GeneOut, we applied it to the comparison of gene trees simulated within different species trees across a range of species tree depths. Applied directly to sets of simulated gene trees with large sample sizes, GeneOut was able to detect very small differences between two sets of gene trees generated under different species trees. Our statistical test can also incorporate tree reconstruction into its framework through a variety of phylogenetic optimality criteria. When applied to DNA sequence data simulated from different sets of gene trees, results in the form of receiver operating characteristic (ROC) curves indicated that GeneOut performed well in the detection of differences between sets of trees with different distributions in a multi-dimensional space. Furthermore, it controlled false positive and false negative rates very well, indicating a high degree of accuracy. Conclusions The non-parametric nature of our statistical test provides fast and efficient analyses, and makes it an applicable test for any scenario where evolutionary or other factors can lead to trees with different multi-dimensional distributions.
The software GeneOut is freely available under the GNU public license. PMID:22909268
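The permutation logic of the test above can be illustrated independently of the SVM machinery. In the sketch below, tree vectors are plain tuples and the squared distance between group centroids stands in for the SVM separation score; labels are repeatedly permuted over the pooled sample to build the null distribution. This is an illustrative simplification, not the GeneOut code.

```python
import random

def centroid(points):
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def separation(a, b):
    """Separation statistic: squared distance between the centroids of
    two point clouds (a simple stand-in for an SVM margin)."""
    ca, cb = centroid(a), centroid(b)
    return sum((x - y) ** 2 for x, y in zip(ca, cb))

def permutation_pvalue(a, b, n_perm=999, seed=0):
    """P-value for H0: both sets of tree vectors come from the same
    distribution, obtained by permuting group labels of the pooled sample."""
    rng = random.Random(seed)
    observed = separation(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if separation(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0
```

Two well-separated clouds yield a p-value near the minimum attainable value 1/(n_perm + 1), while two samples from the same distribution give a roughly uniform p-value.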
Domingues, Eduardo Pinheiro; Ribeiro, Rafael Fernandes; Horta, Martinho Campolina Rebello; Manzi, Flávio Ricardo; Côsso, Maurício Greco; Zenóbio, Elton Gonçalves
2017-10-01
To compare, using cone-beam computed tomography, vertical and volumetric bone augmentation after interposition grafting with bovine bone mineral matrix (GEISTLICH BIO-OSS®) or hydroxyapatite/tricalcium phosphate (STRAUMANN® BONECERAMIC) for atrophic posterior mandible reconstruction through segmental osteotomy. Seven patients received interposition grafts in the posterior mandible for implant rehabilitation. The cone-beam computed tomography images were analysed with OsiriX Imaging Software 6.5 (Pixmeo, Geneva, Switzerland) in the pre-surgical period (T0), at 15 days post-surgery (T1) and at 180 days post-surgery (T2). The tomographic analysis was performed by a single trained and calibrated radiologist. Descriptive statistics and nonparametric methods were used to analyse the data. There was a significant difference in vertical and volume augmentation with both biomaterials using the technique (P < 0.05). There were no significant differences (P > 0.05) in volume change of the graft, bone volume augmentation, or augmentation of the maximum linear vertical distance between the two analysed biomaterials. The GEISTLICH BIO-OSS® and STRAUMANN® BONECERAMIC interposition grafts exhibited similar and sufficient dimensional stability and volume gain for short implants in the atrophic posterior mandible. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Schulz, Marcus; Clemens, Thomas; Förster, Harald; Harder, Thorsten; Fleet, David; Gaus, Silvia; Grave, Christel; Flegel, Imme; Schrey, Eckart; Hartwig, Eike
2015-08-01
In the North Sea, the amount of litter present in the marine environment represents a severe environmental problem. In order to assess the magnitude of the problem and measure changes in abundance, the results of two beach litter monitoring programmes were compared and analysed for long-term trends applying multivariate techniques. Total beach litter pollution was persistently high. Spatial differences in litter abundance made it difficult to identify long-term trends: at times more than 8000 litter items per year were recorded on a 100 m long survey site on the island of Scharhörn, while the survey site on the beach of the island of Amrum revealed abundances lower by two orders of magnitude. Beach litter was dominated by plastic, with mean proportions of 52%-91% of total beach litter. Non-parametric time series analyses detected many significant trends, which, however, did not show any systematic spatial patterns. Cluster analyses partly grouped beaches according to their exposure to sources of litter, wind and currents. Surveys at short intervals of one to two weeks were found to give higher annual sums of beach litter than the quarterly surveys of the OSPAR method. Surveys at regular intervals of four weeks to five months would make monitoring results more reliable. Copyright © 2015 Elsevier Ltd. All rights reserved.
Twenty-five years of maximum-entropy principle
NASA Astrophysics Data System (ADS)
Kapur, J. N.
1983-04-01
The strengths and weaknesses of the maximum entropy principle (MEP) are examined and some challenging problems that remain outstanding at the end of the first quarter century of the principle are discussed. The original formalism of the MEP is presented and its relationship to statistical mechanics is set forth. The use of MEP for characterizing statistical distributions, in statistical inference, nonlinear spectral analysis, transportation models, population density models, models for brand-switching in marketing and vote-switching in elections is discussed. Its application to finance, insurance, image reconstruction, pattern recognition, operations research and engineering, biology and medicine, and nonparametric density estimation is considered.
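As a reminder of how the MEP characterizes statistical distributions (the first application listed above), the standard textbook derivation for a positive variable with known mean $m$ runs as follows; this is a generic illustration, not taken from Kapur's paper:

```latex
\max_{p}\; H[p] = -\int_0^\infty p(x)\ln p(x)\,dx
\quad \text{subject to} \quad
\int_0^\infty p(x)\,dx = 1, \qquad \int_0^\infty x\,p(x)\,dx = m.
```

Stationarity of the Lagrangian gives $-\ln p(x) - 1 - \lambda_0 - \lambda_1 x = 0$, so $p(x) \propto e^{-\lambda_1 x}$; imposing the two constraints fixes the multipliers and yields

```latex
p(x) = \frac{1}{m}\,e^{-x/m}, \qquad H_{\max} = 1 + \ln m,
```

i.e. the exponential distribution is the maximum-entropy law for a positive quantity with fixed mean, which is the template for the MEP characterizations of distributions discussed in the paper.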
A Model Fit Statistic for Generalized Partial Credit Model
ERIC Educational Resources Information Center
Liang, Tie; Wells, Craig S.
2009-01-01
Investigating the fit of a parametric model is an important part of the measurement process when implementing item response theory (IRT), but research examining it is limited. A general nonparametric approach for detecting model misfit, introduced by J. Douglas and A. S. Cohen (2001), has exhibited promising results for the two-parameter logistic…
Learning Patterns as Criterion for Forming Work Groups in 3D Simulation Learning Environments
ERIC Educational Resources Information Center
Maria Cela-Ranilla, Jose; Molías, Luis Marqués; Cervera, Mercè Gisbert
2016-01-01
This study analyzes the use of learning patterns as a grouping criterion for developing learning activities in a 3D simulation environment at university. Participants included 72 Spanish students from the Education and Marketing disciplines. Descriptive statistics and non-parametric tests were conducted. The process was…
Using R to Simulate Permutation Distributions for Some Elementary Experimental Designs
ERIC Educational Resources Information Center
Eudey, T. Lynn; Kerr, Joshua D.; Trumbo, Bruce E.
2010-01-01
Null distributions of permutation tests for two-sample, paired, and block designs are simulated using the R statistical programming language. For each design and type of data, permutation tests are compared with standard normal-theory and nonparametric tests. These examples (often using real data) provide for classroom discussion of the use of metrics…
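The article's examples are in R; the same two-sample permutation logic can be sketched in Python (an illustrative translation, not the article's code): pool the two samples, repeatedly shuffle the group labels, and count how often the permuted difference in means is at least as extreme as the observed one.

```python
import random
from statistics import mean

def perm_test_mean_diff(x, y, n_perm=9999, seed=0):
    """Two-sided permutation test for a difference in means between
    two independent samples (the two-sample design)."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Because the null distribution is built from the data themselves, no normality assumption is needed, which is the classroom point the article makes.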
Temporal changes and variability in temperature series over Peninsular Malaysia
NASA Astrophysics Data System (ADS)
Suhaila, Jamaludin
2015-02-01
With the current concern over climate change, descriptions of how temperature series have changed over time are very useful. Annual mean temperature has been analyzed for several stations over Peninsular Malaysia. Non-parametric statistical techniques such as the Mann-Kendall test and Theil-Sen slope estimation are used primarily for assessing the significance and detection of trends, while the nonparametric Pettitt's test and the sequential Mann-Kendall test are adopted to detect any abrupt climate change. Statistically significant increasing trends in annual mean temperature are detected for almost all studied stations, with the magnitudes of significant trends varying from 0.02°C to 0.05°C per year. The results show that the climate over Peninsular Malaysia is getting warmer. In addition, the results of the abrupt changes in temperature using Pettitt's and the sequential Mann-Kendall test reveal onsets of trends which can be related to El Niño episodes that occurred in Malaysia. In general, the analysis results can help local stakeholders and water managers to understand the risks and vulnerabilities related to climate change in terms of mean events in the region.
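The two trend tools named above are simple enough to state compactly. In the generic sketch below (not the study's code), the Mann-Kendall S statistic counts concordant minus discordant pairs, its no-ties normal approximation gives a significance test, and the Theil-Sen slope is the median of all pairwise slopes.

```python
import itertools
import math

def mann_kendall_S(series):
    """S = (# increasing pairs) - (# decreasing pairs); S > 0 suggests
    an upward trend."""
    return sum((xj > xi) - (xj < xi)
               for (i, xi), (j, xj) in itertools.combinations(enumerate(series), 2))

def mann_kendall_z(series):
    """Normal approximation without ties; compare |z| with 1.96 for a
    two-sided test at the 5% level."""
    n, s = len(series), mann_kendall_S(series)
    if s == 0:
        return 0.0
    var = n * (n - 1) * (2 * n + 5) / 18.0
    return (s - 1 if s > 0 else s + 1) / math.sqrt(var)

def theil_sen_slope(series):
    """Median of all pairwise slopes (x_j - x_i) / (j - i): a robust,
    nonparametric estimate of the trend magnitude."""
    slopes = sorted((xj - xi) / (j - i)
                    for (i, xi), (j, xj) in itertools.combinations(enumerate(series), 2))
    mid = len(slopes) // 2
    return slopes[mid] if len(slopes) % 2 else 0.5 * (slopes[mid - 1] + slopes[mid])
```

For a monotonically warming 20-year series, S attains its maximum of 190 and z is well above 1.96, matching the kind of significant warming trends the study reports.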
Non-parametric early seizure detection in an animal model of temporal lobe epilepsy
NASA Astrophysics Data System (ADS)
Talathi, Sachin S.; Hwang, Dong-Uk; Spano, Mark L.; Simonotto, Jennifer; Furman, Michael D.; Myers, Stephen M.; Winters, Jason T.; Ditto, William L.; Carney, Paul R.
2008-03-01
The performance of five non-parametric, univariate seizure detection schemes (embedding delay, Hurst scale, wavelet scale, nonlinear autocorrelation and variance energy) was evaluated as a function of the sampling rate of EEG recordings, the electrode types used for EEG acquisition, and the spatial location of the EEG electrodes, in order to determine the applicability of the measures in real-time closed-loop seizure intervention. The criteria chosen for evaluating the performance were high statistical robustness (as determined through the sensitivity and the specificity of a given measure in detecting a seizure) and the lag in seizure detection with respect to the seizure onset time (as determined by visual inspection of the EEG signal by a trained epileptologist). An optimality index was designed to evaluate the overall performance of each measure. For the EEG data recorded with a microwire electrode array at a sampling rate of 12 kHz, the wavelet scale measure exhibited the best overall performance, detecting seizures with a high optimality index value and high sensitivity and specificity.
The Dundee Ready Education Environment Measure (DREEM): a review of its adoption and use.
Miles, Susan; Swift, Louise; Leinster, Sam J
2012-01-01
The Dundee Ready Education Environment Measure (DREEM) was published in 1997 as a tool to evaluate educational environments of medical schools and other health training settings and a recent review concluded that it was the most suitable such instrument. This study aimed to review the settings and purposes to which the DREEM has been applied and the approaches used to analyse and report it, with a view to guiding future users towards appropriate methodology. A systematic literature review was conducted using the Web of Knowledge databases of all articles reporting DREEM data between 1997 and 4 January 2011. The review found 40 publications, using data from 20 countries. DREEM is used in evaluation for diagnostic purposes, comparison between different groups and comparison with ideal/expected scores. A variety of non-parametric and parametric statistical methods have been applied, but their use is inconsistent. DREEM has been used internationally for different purposes and is regarded as a useful tool by users. However, reporting and analysis differs between publications. This lack of uniformity makes comparison between institutions difficult. Most users of DREEM are not statisticians and there is a need for informed guidelines on its reporting and statistical analysis.
Malaria resurgence in the East African highlands: Temperature trends revisited
Pascual, M.; Ahumada, J. A.; Chaves, L. F.; Rodó, X.; Bouma, M.
2006-01-01
The incidence of malaria in the East African highlands has increased since the end of the 1970s. The role of climate change in the exacerbation of the disease has been controversial, and the specific influence of rising temperature (warming) has been highly debated following a previous study reporting no evidence to support a trend in temperature. We revisit this result using the same temperature data, now updated to the present from 1950 to 2002 for four high-altitude sites in East Africa where malaria has become a serious public health problem. With both nonparametric and parametric statistical analyses, we find evidence for a significant warming trend at all sites. To assess the biological significance of this trend, we drive a dynamical model for the population dynamics of the mosquito vector with the temperature time series and the corresponding detrended versions. This approach suggests that the observed temperature changes would be significantly amplified by the mosquito population dynamics with a difference in the biological response at least 1 order of magnitude larger than that in the environmental variable. Our results emphasize the importance of considering not just the statistical significance of climate trends but also their biological implications with dynamical models. PMID:16571662
NASA Astrophysics Data System (ADS)
Rougier, Jonty; Cashman, Kathy; Sparks, Stephen
2016-04-01
We have analysed the Large Magnitude Explosive Volcanic Eruptions database (LaMEVE) for volcanoes that classify as stratovolcanoes. A non-parametric statistical approach is used to assess the global recording rate for large (M4+) eruptions. The approach imposes minimal structure on the shape of the recording rate through time. We find that the recording rates decline rapidly going backwards in time: prior to 1600 they are below 50%, and prior to 1100 they are below 20%. Even in the recent past, e.g. the 1800s, they are likely to be appreciably less than 100%. The assessment for very large (M5+) eruptions is more uncertain, due to the scarcity of events. Having taken under-recording into account, the large-eruption rates of stratovolcanoes are modelled exchangeably, in order to derive an informative prior distribution as an input into a subsequent volcano-by-volcano hazard assessment. The statistical model implies that volcano-by-volcano predictions can be grouped by the number of recorded large eruptions. Further, it is possible to combine all volcanoes together into a global large-eruption prediction, with an M4+ rate computed from the LaMEVE database of 0.57/yr.
Effect of the maternity ward system on the lactation success of low-income urban Mexican women.
Perez-Escamilla, R; Segura-Millán, S; Pollitt, E; Dewey, K G
1992-11-01
We compared the lactation performance of 165 healthy mothers who planned to breastfeed and gave birth by vaginal delivery, without complications, to a healthy infant, in either a nursery (NUR) hospital (n = 58) or a rooming-in hospital where formula supplementation was not allowed. In the rooming-in hospital, women were randomly assigned to a group that received breastfeeding guidance during the hospital stay (RIBFG) (n = 53) or to a control group (RI) (n = 54). Women were interviewed in the hospital and at 8, 70 and 135 days post-partum (pp). The groups were similar in socio-economic, demographic, anthropometric, previous breastfeeding experience and prenatal care variables. Non-parametric survival analyses adjusting for potential confounding factors show that breastfeeding guidance had a positive impact (P < or = 0.05) on breastfeeding duration among primiparous women who delivered in the rooming-in hospital. Among primiparae, the RI and RIBFG groups had higher (P < or = 0.05) full breastfeeding rates than the NUR group in the short term. In the longer term, only the difference between the RIBFG and the NUR group remained statistically significant. The maternity ward system did not have a statistically significant effect on the lactation performance of multiparae.
Assessment of disinfection of hospital surfaces using different monitoring methods
Ferreira, Adriano Menis; de Andrade, Denise; Rigotti, Marcelo Alessandro; de Almeida, Margarete Teresa Gottardo; Guerra, Odanir Garcia; dos Santos, Aires Garcia
2015-01-01
OBJECTIVE: to assess the efficiency of cleaning/disinfection of surfaces of an Intensive Care Unit. METHOD: descriptive-exploratory study with quantitative approach conducted over the course of four weeks. Visual inspection, bioluminescence adenosine triphosphate and microbiological indicators were used to indicate cleanliness/disinfection. Five surfaces (bed rails, bedside tables, infusion pumps, nurses' counter, and medical prescription table) were assessed before and after the use of rubbing alcohol at 70% (w/v), totaling 160 samples for each method. Non-parametric tests were used considering statistically significant differences at p<0.05. RESULTS: after the cleaning/disinfection process, 87.5, 79.4 and 87.5% of the surfaces were considered clean using the visual inspection, bioluminescence adenosine triphosphate and microbiological analyses, respectively. A statistically significant decrease was observed in the disapproval rates after the cleaning process considering the three assessment methods; the visual inspection was the least reliable. CONCLUSION: the cleaning/disinfection method was efficient in reducing microbial load and organic matter of surfaces, however, these findings require further study to clarify aspects related to the efficiency of friction, its frequency, and whether or not there is association with other inputs to achieve improved results of the cleaning/disinfection process. PMID:26312634
Trends and associated uncertainty in the global mean temperature record
NASA Astrophysics Data System (ADS)
Poppick, A. N.; Moyer, E. J.; Stein, M.
2016-12-01
Physical models suggest that the Earth's mean temperature warms in response to changing CO2 concentrations (and hence increased radiative forcing); given physical uncertainties in this relationship, the historical temperature record is a source of empirical information about global warming. A persistent thread in many analyses of the historical temperature record, however, is the reliance on methods that appear to deemphasize both physical and statistical assumptions. Examples include regression models that treat time rather than radiative forcing as the relevant covariate, and time series methods that account for natural variability in nonparametric rather than parametric ways. We show here that methods that deemphasize assumptions can limit the scope of analysis and can lead to misleading inferences, particularly in the setting considered where the data record is relatively short and the scale of temporal correlation is relatively long. A proposed model that is simple but physically informed provides a more reliable estimate of trends and allows a broader array of questions to be addressed. In accounting for uncertainty, we also illustrate how parametric statistical models that are attuned to the important characteristics of natural variability can be more reliable than ostensibly more flexible approaches.
NASA Astrophysics Data System (ADS)
Nikolopoulos, E. I.; Destro, E.; Bhuiyan, M. A. E.; Borga, M., Sr.; Anagnostou, E. N.
2017-12-01
Fire disasters affect modern societies at global scale, inducing significant economic losses and human casualties. In addition to their direct impacts, they have various adverse effects on the hydrologic and geomorphologic processes of a region due to the tremendous alteration of landscape characteristics (vegetation, soil properties, etc.). As a consequence, wildfires often initiate a cascade of hazards such as flash floods and debris flows that usually follow the occurrence of a wildfire, thus magnifying the overall impact in a region. Post-fire debris flows (PFDF) are one such hazard, frequently occurring in the Western United States, where wildfires are a common natural disaster. Prediction of PFDF is therefore of high importance in this region, and over the last years a number of efforts by the United States Geological Survey (USGS) and the National Weather Service (NWS) have focused on the development of early warning systems to help mitigate PFDF risk. This work proposes a prediction framework based on a nonparametric statistical technique (random forests) that allows predicting the occurrence of PFDF at regional scale with a higher degree of accuracy than the commonly used approaches based on power-law thresholds and logistic regression procedures. The work presented is based on a recently released database from USGS that reports a total of 1500 storms that did or did not trigger PFDF in a number of fire-affected catchments in the Western United States. The database includes information on storm characteristics (duration, accumulation, max intensity, etc.) and other auxiliary information on land surface properties (soil erodibility index, local slope, etc.). Results show that the proposed model achieves a satisfactory prediction accuracy (threat score > 0.6), superior to previously published prediction frameworks, highlighting the potential of nonparametric statistical techniques for the development of PFDF prediction systems.
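The skill metric quoted above, the threat score (also called the critical success index), is easy to compute from a set of binary predictions. The sketch below is generic, not the study's code; in practice the predictions themselves would come from a trained random-forest classifier (e.g. scikit-learn's RandomForestClassifier, assumed rather than shown here).

```python
def threat_score(predicted, observed):
    """Critical success index = hits / (hits + false alarms + misses),
    for parallel binary sequences (1 = debris flow predicted / occurred).
    Correct negatives do not enter the score, so it rewards skill on the
    rare positive events."""
    hits   = sum(1 for p, o in zip(predicted, observed) if p and o)
    alarms = sum(1 for p, o in zip(predicted, observed) if p and not o)
    misses = sum(1 for p, o in zip(predicted, observed) if o and not p)
    total = hits + alarms + misses
    return hits / total if total else 0.0
```

For example, 2 hits with 1 false alarm and 1 miss give a score of 2/4 = 0.5; clearing the paper's 0.6 threshold therefore requires hits to clearly dominate both error types.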
NASA Astrophysics Data System (ADS)
Jhajharia, Deepak; Yadav, Brijesh K.; Maske, Sunil; Chattopadhyay, Surajit; Kar, Anil K.
2012-01-01
Trends in rainfall, rainy days and 24 h maximum rainfall are investigated using the Mann-Kendall non-parametric test at twenty-four sites of subtropical Assam, located in the northeastern region of India. The trends are statistically confirmed by both parametric and non-parametric methods, and the magnitudes of significant trends are obtained through the linear regression test. In Assam, the average monsoon rainfall (rainy days) during the monsoon months of June to September is about 1606 mm (70), which accounts for about 70% (64%) of the annual rainfall (rainy days). On monthly time scales, sixteen and seventeen sites (twenty-one sites each) witnessed decreasing trends in total rainfall (rainy days), out of which one and three trends (seven trends each) were found to be statistically significant in June and July, respectively. On the other hand, seventeen sites witnessed increasing trends in rainfall in the month of September, but none were statistically significant. In December (February), eighteen (twenty-two) sites witnessed decreasing (increasing) trends in total rainfall, out of which five (three) trends were statistically significant. For the rainy days during the months of November to January, twenty-two or more sites witnessed decreasing trends in Assam, and for nine (November), twelve (January) and eighteen (December) sites these trends were statistically significant. These observed changes in rainfall, although most of the time series show no statistical significance, along with the well-reported climatic warming in the monsoon and post-monsoon seasons, may have implications for human health and water resources management over biodiversity-rich Northeast India.
Common Scientific and Statistical Errors in Obesity Research
George, Brandon J.; Beasley, T. Mark; Brown, Andrew W.; Dawson, John; Dimova, Rositsa; Divers, Jasmin; Goldsby, TaShauna U.; Heo, Moonseong; Kaiser, Kathryn A.; Keith, Scott; Kim, Mimi Y.; Li, Peng; Mehta, Tapan; Oakes, J. Michael; Skinner, Asheley; Stuart, Elizabeth; Allison, David B.
2015-01-01
We identify 10 common errors and problems in the statistical analysis, design, interpretation, and reporting of obesity research and discuss how they can be avoided. The 10 topics are: 1) misinterpretation of statistical significance, 2) inappropriate testing against baseline values, 3) excessive and undisclosed multiple testing and “p-value hacking,” 4) mishandling of clustering in cluster randomized trials, 5) misconceptions about nonparametric tests, 6) mishandling of missing data, 7) miscalculation of effect sizes, 8) ignoring regression to the mean, 9) ignoring confirmation bias, and 10) insufficient statistical reporting. We hope that discussion of these errors can improve the quality of obesity research by helping researchers to implement proper statistical practice and to know when to seek the help of a statistician. PMID:27028280
Analysis of Anatomic and Functional Measures in X-Linked Retinoschisis
Cukras, Catherine A.; Huryn, Laryssa A.; Jeffrey, Brett P.; Turriff, Amy; Sieving, Paul A.
2018-01-01
Purpose To examine the symmetry of structural and functional parameters between eyes in patients with X-linked retinoschisis (XLRS), as well as changes in visual acuity and electrophysiology over time. Methods This is a single-center observational study of 120 males with XLRS who were evaluated at the National Eye Institute. Examinations included best-corrected visual acuity for all participants, as well as ERG recording and optical coherence tomography (OCT) on a subset of participants. Statistical analyses were performed using nonparametric Spearman correlations and linear regression. Results Our analyses demonstrated a statistically significant correlation of structural and functional measures between the two eyes of XLRS patients for all parameters. OCT central macular thickness (n = 78; Spearman r = 0.83, P < 0.0001) and ERG b/a ratio (n = 78; Spearman r = 0.82, P < 0.0001) were the most strongly correlated between a participant's eyes, whereas visual acuity was less strongly correlated (n = 120; Spearman r = 0.47, P < 0.0001). Stability of visual acuity was observed with an average change of less than one letter (n = 74; OD −0.66 and OS −0.70 letters) in a mean follow-up time of 6.8 years. There was no statistically significant change in the ERG b/a ratio within eyes over time. Conclusions Although a broad spectrum of clinical phenotypes is observed across individuals with XLRS, our study demonstrates a significant correlation of structural and functional findings between the two eyes and stability of measures of acuity and ERG parameters over time. These results highlight the utility of the fellow eye as a useful reference for monocular interventional trials.
Hydrological influences on the water quality trends in Tamiraparani Basin, South India.
Ravichandran, S
2003-09-01
Water quality variables (turbidity, pH, electrical conductivity (EC), chlorides and total hardness (TH)) were monitored at a downstream location in the Tamiraparani River during 1978-1992. The observations were made at weekly intervals in a water treatment and supply plant using standard methods. Graphical and statistical analyses were used for data exploration, trend detection and assessment. Box-whisker plots of annual and seasonal changes in the variables indicated apparent trends in the data and their response to the seasonal influence of the monsoon rainfall. Further, examination of the median values of the variables indicated that changes in the direction of trend occurred during 1985-1986, especially in pH, EC and TH. The statistical analyses were done using non-parametric methods: ANCOVA on rank-transformed data and the Seasonal Mann-Kendall test. The presence of a monotonic trend in all the water quality variables was confirmed, although the direction of change differed among variables. The trend line was fitted by the method of least squares. The estimated values indicated significant increases in EC (28 µS cm⁻¹) while significant decreases were observed in turbidity (90 NTU), pH (0.78), and total hardness (23 ppm) in a span of 15 years. The changes induced in river flow by the addition of a stabilizing reservoir, the influence of the seasonal and spatial pattern of monsoon rainfall across the river basin and increased agriculture appear to be causative factors for the water quality trends seen in the Tamiraparani River system.
Identification and estimation of survivor average causal effects.
Tchetgen Tchetgen, Eric J
2014-09-20
In longitudinal studies, outcomes ascertained at follow-up are typically undefined for individuals who die prior to the follow-up visit. In such settings, outcomes are said to be truncated by death and inference about the effects of a point treatment or exposure, restricted to individuals alive at the follow-up visit, could be biased even if as in experimental studies, treatment assignment were randomized. To account for truncation by death, the survivor average causal effect (SACE) defines the effect of treatment on the outcome for the subset of individuals who would have survived regardless of exposure status. In this paper, the author nonparametrically identifies SACE by leveraging post-exposure longitudinal correlates of survival and outcome that may also mediate the exposure effects on survival and outcome. Nonparametric identification is achieved by supposing that the longitudinal data arise from a certain nonparametric structural equations model and by making the monotonicity assumption that the effect of exposure on survival agrees in its direction across individuals. A novel weighted analysis involving a consistent estimate of the survival process is shown to produce consistent estimates of SACE. A data illustration is given, and the methods are extended to the context of time-varying exposures. We discuss a sensitivity analysis framework that relaxes assumptions about independent errors in the nonparametric structural equations model and may be used to assess the extent to which inference may be altered by a violation of key identifying assumptions. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
Ferrarini, Luca; Veer, Ilya M; van Lew, Baldur; Oei, Nicole Y L; van Buchem, Mark A; Reiber, Johan H C; Rombouts, Serge A R B; Milles, J
2011-06-01
In recent years, graph theory has been successfully applied to study functional and anatomical connectivity networks in the human brain. Most of these networks have shown small-world topological characteristics: high efficiency in long-distance communication between nodes, combined with highly interconnected local clusters of nodes. Moreover, functional studies performed at high resolutions have presented convincing evidence that resting-state functional connectivity networks exhibit (exponentially truncated) scale-free behavior. Such evidence, however, was mostly presented qualitatively, in terms of linear regressions of the degree distributions on log-log plots. Even when quantitative measures were given, these were usually limited to the r² correlation coefficient. However, the r² statistic is not an optimal estimator of explained variance when dealing with (truncated) power-law models. Recent developments in statistics have introduced new non-parametric approaches, based on the Kolmogorov-Smirnov test, to the problem of model selection. In this work, we have built on this idea to statistically tackle the issue of model selection for the degree distribution of functional connectivity at rest. The analysis, performed at voxel level and in a subject-specific fashion, confirmed the superiority of a truncated power-law model, showing high consistency across subjects. Moreover, the most highly connected voxels were found to be consistently part of the default mode network. Our results provide statistically sound support for the evidence previously presented in the literature for a truncated power-law model of resting-state functional connectivity. Copyright © 2010 Elsevier Inc. All rights reserved.
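The Kolmogorov-Smirnov approach to model selection mentioned above can be sketched in a few lines: the KS statistic is the largest gap between the empirical distribution function of the data and the candidate model's CDF. The degree sample and the fitted exponential model below are toy stand-ins, not the paper's voxel-level data or its truncated power-law fit.

```python
import math

def ks_statistic(sample, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the maximum absolute gap
    between the empirical CDF and a candidate model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # compare the model CDF against the empirical CDF just before
        # and just after the jump at x
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

# Toy "degree" sample and a fitted exponential model (rate = 1/mean)
degrees = [1, 1, 2, 2, 3, 4, 5, 8, 13, 30]
lam = 1.0 / (sum(degrees) / len(degrees))
print(f"{ks_statistic(degrees, lambda x: 1 - math.exp(-lam * x)):.3f}")
```

A smaller KS distance for one candidate model than another (with suitable significance testing) is the basis of the model-selection procedure the abstract builds on.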
Zeng, Li-ping; Hu, Zheng-mao; Mu, Li-li; Mei, Gui-sen; Lu, Xiu-ling; Zheng, Yong-jun; Li, Pei-jian; Zhang, Ying-xue; Pan, Qian; Long, Zhi-gao; Dai, He-ping; Zhang, Zhuo-hua; Xia, Jia-hui; Zhao, Jing-ping; Xia, Kun
2011-06-01
To investigate the relationship between susceptibility loci in chromosomes 1q21-25 and 6p21-25 and schizophrenia subtypes in a Chinese population, a genomic scan with parametric and non-parametric analyses was performed on 242 individuals from 36 schizophrenia pedigrees, including 19 paranoid schizophrenia and 17 undifferentiated schizophrenia pedigrees, from Henan province of China, using 5 microsatellite markers in the chromosome region 1q21-25 and 8 microsatellite markers in the chromosome region 6p21-25, which were candidates from previous studies. All affected subjects were diagnosed and typed according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR; American Psychiatric Association, 2000). All subjects signed informed consent. In chromosome 1, parametric analysis under the dominant inheritance mode of all 36 pedigrees showed that the maximum multi-point heterogeneity log of odds (HLOD) score was 1.33 (α = 0.38). The non-parametric analysis and the single-point and multi-point nonparametric linkage (NPL) scores suggested linkage at D1S484, D1S2878, and D1S196. In the 19 paranoid schizophrenia pedigrees, linkage was not observed for any of the 5 markers. In the 17 undifferentiated schizophrenia pedigrees, the multi-point NPL score was 1.60 (P = 0.0367) at D1S484. The single-point NPL score was 1.95 (P = 0.0145) and the multi-point NPL score was 2.39 (P = 0.0041) at D1S2878. Additionally, the multi-point NPL score was 1.74 (P = 0.0255) at D1S196. These same three loci showed suggestive linkage in the integrative analysis of all 36 pedigrees. In chromosome 6, in parametric linkage analysis under dominant and recessive inheritance and in non-parametric linkage analysis of all 36 pedigrees and of the 17 undifferentiated schizophrenia pedigrees, linkage was not observed for any of the 8 markers.
In the 19 paranoid schizophrenia pedigrees, parametric analysis showed that under the recessive inheritance mode the maximum single-point HLOD score was 1.26 (α = 0.40) and the multi-point HLOD score was 1.12 (α = 0.38) at D6S289 in the chromosome region 6p23. In nonparametric analysis, the single-point NPL score was 1.52 (P = 0.0402) and the multi-point NPL score was 1.92 (P = 0.0206) at D6S289. Susceptibility genes for undifferentiated schizophrenia, near the D1S484, D1S2878 and D1S196 loci, and for paranoid schizophrenia, near the D6S289 locus, are thus likely present in chromosome regions 1q23.3 and 1q24.2 and in chromosome region 6p23, respectively.
Using GIS to analyze animal movements in the marine environment
Hooge, Philip N.; Eichenlaub, William M.; Solomon, Elizabeth K.; Kruse, Gordon H.; Bez, Nicolas; Booth, Anthony; Dorn, Martin W.; Hills, Susan; Lipcius, Romuald N.; Pelletier, Dominique; Roy, Claude; Smith, Stephen J.; Witherell, David B.
2001-01-01
Advanced methods for analyzing animal movements have been little used in the aquatic research environment compared to the terrestrial. In addition, despite obvious advantages of integrating geographic information systems (GIS) with spatial studies of animal movement behavior, movement analysis tools have not been integrated into GIS for either aquatic or terrestrial environments. We therefore developed software that integrates one of the most commonly used GIS programs (ArcView®) with a large collection of animal movement analysis tools. This application, the Animal Movement Analyst Extension (AMAE), can be loaded as an extension to ArcView® under multiple operating system platforms (PC, Unix, and Mac OS). It contains more than 50 functions, including parametric and nonparametric home range analyses, random walk models, habitat analyses, point and circular statistics, tests of complete spatial randomness, tests for autocorrelation and sample size, point and line manipulation tools, and animation tools. This paper describes the use of these functions in analyzing animal location data; some limited examples are drawn from a sonic-tracking study of Pacific halibut (Hippoglossus stenolepis) in Glacier Bay, Alaska. The extension is available on the Internet at www.absc.usgs.gov/glba/gistools/index.htm.
Theory and Application of DNA Histogram Analysis.
ERIC Educational Resources Information Center
Bagwell, Charles Bruce
The underlying principles and assumptions associated with DNA histograms are discussed along with the characteristics of fluorescent probes. Information theory is described and used to calculate the information content of a DNA histogram. Two major types of DNA histogram analyses are proposed: parametric and nonparametric analysis. Three levels…
Radioactivity Registered With a Small Number of Events
NASA Astrophysics Data System (ADS)
Zlokazov, Victor; Utyonkov, Vladimir
2018-02-01
The synthesis of superheavy elements requires the analysis of low-statistics experimental data, presumably obeying an unknown exponential distribution, and a decision on whether they originate from one source or contain admixtures. Here we analyze predictions following from non-parametric methods, employing only such fundamental sample properties as the sample mean, the median and the mode.
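For an exponential distribution the mean is 1/λ, the median is ln(2)/λ and the mode is 0, so the median/mean ratio of a pure exponential sample should sit near ln 2 ≈ 0.693. The snippet below sketches such a sample-property check; the decay times are invented and this is an illustration of the idea, not the authors' exact procedure.

```python
import math
import statistics

def exp_consistency(sample):
    """Compare median/mean of a small sample with ln(2), the value
    expected for a single exponential source; a large deviation may
    signal an admixture. Illustrative sketch only."""
    m = statistics.fmean(sample)
    med = statistics.median(sample)
    return m, med, med / m

# Toy decay times (arbitrary units) from a single exponential-like source
times = [0.8, 1.9, 0.3, 2.6, 1.1, 0.5, 3.4, 0.9]
mean, med, ratio = exp_consistency(times)
print(f"mean={mean:.2f} median={med:.2f} ratio={ratio:.3f} ln2={math.log(2):.3f}")
```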
A Simple Effect Size Estimator for Single Case Designs Using WinBUGS
ERIC Educational Resources Information Center
Rindskopf, David; Shadish, William; Hedges, Larry
2012-01-01
Data from single case designs (SCDs) have traditionally been analyzed by visual inspection rather than statistical models. As a consequence, effect sizes have been of little interest. Lately, some effect-size estimators have been proposed, but most are either (i) nonparametric, and/or (ii) based on an analogy incompatible with effect sizes from…
Constraining geostatistical models with hydrological data to improve prediction realism
NASA Astrophysics Data System (ADS)
Demyanov, V.; Rojas, T.; Christie, M.; Arnold, D.
2012-04-01
Geostatistical models reproduce spatial correlation based on the available on-site data and on more general concepts about the modelled patterns, e.g. training images. One of the problems of modelling natural systems with geostatistics is maintaining realistic spatial features so that they agree with the physical processes in nature. Tuning the model parameters to the data may lead to geostatistical realisations with unrealistic spatial patterns that nonetheless honour the data. Such a model would produce poor predictions, even though it fits the available data well. Conditioning the model to a wider range of relevant data provides a remedy that avoids producing unrealistic features in spatial models. For instance, there are vast amounts of information about the geometries of river channels that can be used to describe a fluvial environment. Relations between the geometrical channel characteristics (width, depth, wave length, amplitude, etc.) are complex and non-parametric and exhibit a great deal of uncertainty, which it is important to propagate rigorously into the predictive model. These relations can be described within a Bayesian approach as multi-dimensional prior probability distributions. We propose a way to constrain multi-point statistics models with intelligent priors obtained by analysing a vast collection of contemporary river patterns based on previously published works. We applied machine learning techniques, namely neural networks and support vector machines, to extract multivariate non-parametric relations between geometrical characteristics of fluvial channels from the available data. An example demonstrates how ensuring geological realism helps to deliver more reliable predictions for a subsurface oil reservoir in a fluvial depositional environment.
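As a hypothetical sketch of extracting a nonparametric relation between channel geometry variables, the snippet below uses a Gaussian-kernel Nadaraya-Watson smoother in place of the neural networks and support vector machines used in the study; the width-depth data and the bandwidth are invented for illustration.

```python
import math

def nadaraya_watson(x_train, y_train, x_query, bandwidth=1.0):
    """Gaussian-kernel Nadaraya-Watson estimate of E[y | x]: a generic
    nonparametric regression, standing in for the ML fits described in
    the abstract."""
    preds = []
    for xq in x_query:
        # kernel weight of each training point relative to the query
        w = [math.exp(-0.5 * ((xq - xt) / bandwidth) ** 2) for xt in x_train]
        preds.append(sum(wi * yi for wi, yi in zip(w, y_train)) / sum(w))
    return preds

# Hypothetical channel widths (m) and depths (m)
width = [5, 10, 20, 40, 80]
depth = [0.6, 1.0, 1.7, 2.6, 4.1]
print(nadaraya_watson(width, depth, [15, 30], bandwidth=10.0))
```

Unlike a fitted parametric curve, the smoother makes no assumption about the functional form of the width-depth relation, which is the sense in which the abstract's relations are "non-parametric".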
Apoptosis in subicular neurons: A comparison between suicide and Addison's disease
Printha, K.; Hulathduwa, S. R.; Samarasinghe, K.; Suh, Y. H.; De Silva, K. R. D.
2009-01-01
Background: Stress and depression show possible links to neuronal death in the hippocampus. The subiculum plays a prominent role in limbic stress integration, and the direct effect of corticosteroids on subicular neurons needs to be defined to assess its subsequent impact on hippocampal plasticity. Aim: This study was intended to assess apoptosis in the subicular neurons of a young depressed suicide victim, where stress presumably induced an excess of corticosteroids, and in a case of young-onset Addison's disease with a low level of corticosteroids. Materials and Methods: The bilateral adrenal glands (Addison's case) and the subiculum (both cases) were initially stained with hematoxylin and eosin; subicular neurons of both cases were examined for the degree of apoptosis using the 'ApopTag Kit'. Apoptotic cell counts were expressed as the average number of labeled cells/mm2 and the results were analysed statistically using the non-parametric Mann-Whitney U test. Results: Apoptotic neurons were detected in the subicular region of both the suicide and the Addison's victim, and the difference between the cases was statistically significant on both the right and the left (P < 0.05). In the suicide victim, neuronal apoptosis differed significantly between the two hemispheres (P < 0.05), in contrast to the Addison's disease case, where the difference in neuronal cell death between right and left was statistically insignificant (P > 0.05). Conclusion: The present study confirms the vulnerability of subicular neurons to apoptosis, possibly due to corticosteroids at both ends of the spectrum. PMID:20048453
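The Mann-Whitney U statistic used in the study can be computed directly from pairwise comparisons between the two samples. A minimal sketch with invented labeled-cell counts (not the study's data):

```python
def mann_whitney_u(x, y):
    """U statistic for sample x versus y: one point per pair with
    x above y, half a point per tie. Minimal illustration of the test
    named in the abstract."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical labeled-cell counts per mm^2, left vs right subiculum
left = [12, 15, 9, 14, 11]
right = [20, 18, 25, 22, 19]
print(mann_whitney_u(left, right))  # 0.0: complete separation of groups
```

An extreme U (near 0 or near len(x) * len(y)) is then referred to the exact or normal-approximation null distribution to obtain the P value.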
Evaluating Cellular Polyfunctionality with a Novel Polyfunctionality Index
Larsen, Martin; Sauce, Delphine; Arnaud, Laurent; Fastenackels, Solène; Appay, Victor; Gorochov, Guy
2012-01-01
Functional evaluation of naturally occurring or vaccination-induced T cell responses in mice, men and monkeys has in recent years advanced from single-parameter (e.g. IFN-γ secretion) to much more complex multidimensional measurements. Co-secretion of multiple functional molecules (such as cytokines and chemokines) at the single-cell level is now measurable, due primarily to major advances in multiparametric flow cytometry. The very extensive and complex datasets generated by this technology raise the demand for proper analytical tools that enable the analysis of combinatorial functional properties of T cells, hence polyfunctionality. Presently, multidimensional functional measures are analysed either by evaluating all combinations of parameters individually or by summing frequencies of combinations that include the same number of simultaneous functions. Often these evaluations are visualized as pie charts. Whereas pie charts effectively represent and compare average polyfunctionality profiles of particular T cell subsets or patient groups, they do not document the degree or variation of polyfunctionality within a group, nor do they allow more sophisticated statistical analysis. Here we propose a novel polyfunctionality index that numerically evaluates the degree and variation of polyfunctionality and enables comparative and correlative parametric and non-parametric statistical tests. Moreover, it allows the usage of more advanced statistical approaches, such as cluster analysis. We believe that the polyfunctionality index will render polyfunctionality an appropriate end-point measure in future studies of T cell responsiveness. PMID:22860124
[Clinical profile of cytomegalovirus (CMV) enterocolitis in acquired immunodeficiency syndrome].
De Lima, D B; Fernandes, O; Gomes, V R; Da Silva, E J; De Pinho, P R; De Paiva, D D
2000-01-01
To determine the clinical profile of CMV colitis in AIDS patients, clinical and endoscopic parameters and survival time were compared between 2 groups of AIDS patients with chronic diarrhea: group A with CMV colitis and group B without CMV colitis. 48 patients with diarrhea lasting more than 30 days were studied, 27 in group A and 21 in group B. Age, risk factors, the interval between the diagnosis of HIV infection and the onset of diarrhea, hematochezia, the endoscopic findings and the life table in both groups were analysed. All were diagnosed by stool culture and stool examination for ova and parasites, along with colonoscopy with biopsies. The unpaired t test was used to assess the statistical significance of differences in the means of continuous variables, and the chi-square test with Yates correction for non-parametric variables. The survival curves were assessed by the Kaplan-Meier and Mantel-Haenszel tests. A P value of less than 0.05 was considered to indicate statistical significance. The mucosal lesions associated with CMV infection were typically ulcerative on a background of hemorrhagic erythema, 14 (51.8%), p < 0.01. The life table analysis disclosed a shorter survival time in the CMV colitis group (0.005 > P > 0.001). The other studied data did not achieve statistical significance. AIDS patients with CMV colitis have a poorer long-term survival. Among the colonoscopic findings, ulcerations with a hemorrhagic background were the most common lesions.
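The chi-square test with Yates continuity correction used above has a closed form for a 2x2 table. A sketch with hypothetical counts (the table below is invented for illustration, not reconstructed from the study):

```python
def yates_chi2(a, b, c, d):
    """Chi-square statistic with Yates continuity correction for the
    2x2 table [[a, b], [c, d]]; 1 degree of freedom. Counts below are
    hypothetical."""
    n = a + b + c + d
    num = n * (abs(a * d - b * c) - n / 2) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den

# e.g. ulcerative lesions present/absent in group A vs group B
print(round(yates_chi2(14, 13, 3, 18), 2))
```

A statistic above 3.84 (the 5% critical value of chi-square with 1 df) would be declared significant at P < 0.05.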
Friston, Karl J.; Bastos, André M.; Oswal, Ashwini; van Wijk, Bernadette; Richter, Craig; Litvak, Vladimir
2014-01-01
This technical paper offers a critical re-evaluation of (spectral) Granger causality measures in the analysis of biological timeseries. Using realistic (neural mass) models of coupled neuronal dynamics, we evaluate the robustness of parametric and nonparametric Granger causality. Starting from a broad class of generative (state-space) models of neuronal dynamics, we show how their Volterra kernels prescribe the second-order statistics of their response to random fluctuations; characterised in terms of cross-spectral density, cross-covariance, autoregressive coefficients and directed transfer functions. These quantities in turn specify Granger causality — providing a direct (analytic) link between the parameters of a generative model and the expected Granger causality. We use this link to show that Granger causality measures based upon autoregressive models can become unreliable when the underlying dynamics is dominated by slow (unstable) modes — as quantified by the principal Lyapunov exponent. However, nonparametric measures based on causal spectral factors are robust to dynamical instability. We then demonstrate how both parametric and nonparametric spectral causality measures can become unreliable in the presence of measurement noise. Finally, we show that this problem can be finessed by deriving spectral causality measures from Volterra kernels, estimated using dynamic causal modelling. PMID:25003817
2013-01-01
Background The theoretical basis of genome-wide association studies (GWAS) is statistical inference of linkage disequilibrium (LD) between any polymorphic marker and a putative disease locus. Most methods widely implemented for such analyses are vulnerable to several key demographic factors and deliver poor statistical power for detecting genuine associations as well as a high false positive rate. Here, we present a likelihood-based statistical approach that accounts properly for the non-random nature of case–control samples with regard to the genotypic distribution at the loci in the populations under study and confers flexibility to test for genetic association in the presence of different confounding factors such as population structure and non-randomness of samples. Results We implemented this novel method, together with several popular methods from the GWAS literature, to re-analyze recently published Parkinson's disease (PD) case–control samples. The real data analysis and computer simulation show that the new method confers not only significantly improved statistical power for detecting associations but also robustness to the difficulties stemming from non-random sampling and genetic structure when compared to its rivals. In particular, the new method detected 44 significant SNPs within 25 chromosomal regions of size < 1 Mb, but only 6 SNPs in two of these regions were previously detected by trend-test-based methods. It discovered two SNPs located 1.18 Mb and 0.18 Mb from the PD candidate genes FGF20 and PARK8, without incurring false positive risk. Conclusions We developed a novel likelihood-based method which provides adequate estimation of LD and other population model parameters from case and control samples, eases the integration of such samples from multiple genetically divergent populations, and thus confers statistically robust and powerful GWAS analyses.
On the basis of simulation studies and analysis of real datasets, we demonstrated significant improvement of the new method over the non-parametric trend test, which is the most widely used in the GWAS literature. PMID:23394771
Yang, Hyeri; Na, Jihye; Jang, Won-Hee; Jung, Mi-Sook; Jeon, Jun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Lim, Kyung-Min; Bae, SeungJin
2015-05-05
The mouse local lymph node assay (LLNA, OECD TG429) is an alternative test replacing the conventional guinea pig tests (OECD TG406) for skin sensitization, but its use of a radioisotopic agent, (3)H-thymidine, deters its active dissemination. A new non-radioisotopic LLNA, LLNA:BrdU-FCM, employs a non-radioisotopic analog, 5-bromo-2'-deoxyuridine (BrdU), and flow cytometry. For an analogous method, the OECD TG429 performance standard (PS) advises that two reference compounds be tested repeatedly and that the ECt (threshold) values obtained must fall within acceptable ranges to prove within- and between-laboratory reproducibility. However, these criteria are somewhat arbitrary, and the sample size for ECt is less than 5, raising concerns about insufficient reliability. Here, we explored various statistical methods to evaluate the reproducibility of LLNA:BrdU-FCM using the stimulation index (SI), the raw data for ECt calculation, produced by 3 laboratories. Descriptive statistics along with graphical representation of SI were presented. For inferential statistics, parametric and non-parametric methods were applied to test the reproducibility of the SI of a concurrent positive control, and the robustness of the results was investigated. Descriptive statistics and graphical representation of SI alone could illustrate the within- and between-laboratory reproducibility. Inferential statistics employing parametric and nonparametric methods drew similar conclusions. While all labs passed the within- and between-laboratory reproducibility criteria given by the OECD TG429 PS based on ECt values, statistical evaluation based on SI values showed that only two labs succeeded in achieving within-laboratory reproducibility. For the two labs that satisfied within-lab reproducibility, between-laboratory reproducibility could also be attained based on inferential as well as descriptive statistics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Analysis of spatial and temporal rainfall trends in Sicily during the 1921-2012 period
NASA Astrophysics Data System (ADS)
Liuzzo, Lorena; Bono, Enrico; Sammartano, Vincenzo; Freni, Gabriele
2016-10-01
Precipitation patterns worldwide are changing under the effects of global warming. The impacts of these changes could dramatically affect the hydrological cycle and, consequently, the availability of water resources. In order to improve the quality and reliability of forecasting models, it is important to analyse historical precipitation data to account for possible future changes. For these reasons, a large number of studies have recently been carried out with the aim of investigating the existence of statistically significant trends in precipitation at different spatial and temporal scales. In this paper, the existence of statistically significant trends in rainfall from observational datasets, measured by 245 rain gauges over Sicily (Italy) during the 1921-2012 period, was investigated. Annual, seasonal and monthly time series were examined using the Mann-Kendall non-parametric statistical test to detect statistically significant trends at local and regional scales, and their significance levels were assessed. Prior to the application of the Mann-Kendall test, the historical dataset was completed using a geostatistical spatial interpolation technique, residual ordinary kriging, and then processed to remove the influence of serial correlation on the test results by applying a trend-free pre-whitening procedure. Once the trends at each site were identified, the spatial patterns of the detected trends were examined using spatial interpolation techniques. Furthermore, focusing on the period from 1981 to 2012, the trend analysis was repeated with the aim of detecting short-term trends or possible changes in the direction of the trends. Finally, the effect of climate change on the seasonal distribution of rainfall during the year was investigated by analysing the trend in the precipitation concentration index.
The application of the Mann-Kendall test to the rainfall data provided evidence of a general decrease in precipitation in Sicily during the 1921-2012 period. Downward trends frequently occurred during the autumn and winter months. However, an increase in total annual precipitation was detected during the period from 1981 to 2012.
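The Mann-Kendall test applied above is easy to sketch: the S statistic counts concordant minus discordant pairs in the time series, and a normal approximation converts it to a Z score. The rainfall series below is invented, and the tie correction and trend-free pre-whitening used in the study are omitted for brevity.

```python
import math

def mann_kendall(series):
    """Mann-Kendall trend test: S statistic plus normal-approximation
    Z score (no tie correction). A negative Z indicates a downward
    trend; |Z| > 1.96 is significant at the 5% level."""
    n = len(series)
    s = sum((series[j] > series[i]) - (series[j] < series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# Hypothetical annual rainfall totals (mm), mostly decreasing
rain = [820, 790, 805, 760, 740, 755, 710, 695, 700, 670]
s, z = mann_kendall(rain)
print(s, round(z, 2))
```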
[The research protocol VI: How to choose the appropriate statistical test. Inferential statistics].
Flores-Ruiz, Eric; Miranda-Novales, María Guadalupe; Villasís-Keever, Miguel Ángel
2017-01-01
The statistical analysis can be divided into two main components: descriptive analysis and inferential analysis. Inference means drawing conclusions from tests performed on data obtained from a sample of a population. Statistical tests are used to establish the probability that a conclusion obtained from a sample is applicable to the population from which it was drawn. However, choosing the appropriate statistical test generally poses a challenge for novice researchers. To choose the statistical test it is necessary to take into account three aspects: the research design, the number of measurements and the scale of measurement of the variables. Statistical tests are divided into two sets, parametric and nonparametric. Parametric tests can only be used if the data show a normal distribution. Choosing the right statistical test will make it easier for readers to understand and apply the results.
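The three criteria above (design, number of measurements, measurement scale) are often summarized as a decision table. The toy helper below encodes one common textbook simplification of that mapping; it is an illustration, not the article's own table.

```python
def choose_test(scale, groups, paired, normal=True):
    """Return a conventional test name given the measurement scale
    ('nominal', 'ordinal', 'interval'), number of groups, whether the
    design is paired, and whether the data look normally distributed.
    A simplified textbook mapping, for illustration only."""
    if scale == "nominal":
        return "McNemar" if paired else "chi-square"
    if scale == "ordinal" or not normal:
        # nonparametric branch
        if groups == 2:
            return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
        return "Friedman" if paired else "Kruskal-Wallis"
    # parametric branch (normal interval/ratio data)
    if groups == 2:
        return "paired t-test" if paired else "unpaired t-test"
    return "repeated-measures ANOVA" if paired else "one-way ANOVA"

print(choose_test("interval", 2, paired=False, normal=False))
```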
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas, including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while modelling the dependence structure with a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling the dependence of random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, the maximum likelihood estimator (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
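The empirical AUC itself has a simple nonparametric form: the probability that a randomly chosen case score exceeds a randomly chosen control score (the Mann-Whitney statistic divided by the number of pairs). The sketch below shows this standard estimate with invented scores; it is not the NPI lower/upper-probability machinery of the paper.

```python
def empirical_auc(cases, controls):
    """Empirical AUC: fraction of (case, control) pairs where the case
    score is higher, with ties counted as 1/2. Scores below are
    invented for illustration."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))

cases = [0.9, 0.8, 0.7, 0.6]      # test scores of diseased subjects
controls = [0.5, 0.4, 0.7, 0.2]   # test scores of healthy subjects
print(empirical_auc(cases, controls))
```

An AUC of 0.5 corresponds to a useless test and 1.0 to perfect separation, which is why AUC is a natural target when combining two test results.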
NASA Astrophysics Data System (ADS)
Bugała, Artur; Bednarek, Karol; Kasprzyk, Leszek; Tomczewski, Andrzej
2017-10-01
The paper presents the most representative characteristics, from a three-year measurement period, of daily and monthly electricity production from photovoltaic conversion using modules installed in a fixed and a 2-axis tracking construction. Results are presented for selected summer, autumn, spring and winter days. The analyzed measuring stand is located on the roof of the Faculty of Electrical Engineering building at Poznan University of Technology. Basic parameters of the statistical analysis, such as the mean value, standard deviation, skewness, kurtosis, median, range, and coefficient of variation, were used. It was found that the asymmetry factor can be useful in the analysis of daily electricity production from photovoltaic conversion. In order to determine the repeatability of monthly electricity production between summer months, and between summer and winter months, the non-parametric Mann-Whitney U test was used as a statistical solution. In order to analyze the repeatability of daily peak hours, describing the largest value of the hourly electricity production, the non-parametric Kruskal-Wallis test was applied as an extension of the Mann-Whitney U test. Based on the analysis of the electricity production recorded by the monitoring system, it was found that traditional methods of forecasting electricity production from photovoltaic conversion, such as multiple regression models, should not be the preferred methods of analysis.
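The Kruskal-Wallis test mentioned above ranks the pooled samples and compares mean ranks across the groups. A minimal sketch (midranks for ties, no tie correction in the statistic; the hourly-production samples below are invented):

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic: rank the pooled data, then
    H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), where R_i is the rank
    sum of group i. No tie correction; illustrative sketch only."""
    pooled = sorted(v for g in groups for v in g)
    # midrank for each distinct value (handles ties)
    rank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2.0  # average of ranks i+1..j
        i = j
    n = len(pooled)
    total = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12.0 * total / (n * (n + 1)) - 3.0 * (n + 1)

# Hypothetical peak-hour production samples (kWh) for three months
june = [4.1, 4.5, 3.9, 4.8]
july = [4.0, 4.2, 4.4, 4.6]
december = [1.1, 1.4, 0.9, 1.3]
print(round(kruskal_wallis_h(june, july, december), 2))
```

H is referred to a chi-square distribution with (number of groups - 1) degrees of freedom; a large H here reflects the obvious summer/winter gap in the toy data.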
Nonparametric identification of nonlinear dynamic systems using a synchronisation-based method
NASA Astrophysics Data System (ADS)
Kenderi, Gábor; Fidlin, Alexander
2014-12-01
The present study proposes an identification method for highly nonlinear mechanical systems that does not require a priori knowledge of the underlying nonlinearities to reconstruct arbitrary restoring force surfaces between degrees of freedom. This approach is based on the master-slave synchronisation between a dynamic model of the system as the slave and the real system as the master using measurements of the latter. As the model synchronises to the measurements, it becomes an observer of the real system. The optimal observer algorithm in a least-squares sense is given by the Kalman filter. Using the well-known state augmentation technique, the Kalman filter can be turned into a dual state and parameter estimator to identify parameters of a priori characterised nonlinearities. The paper proposes an extension of this technique towards nonparametric identification. A general system model is introduced by describing the restoring forces as bilateral spring-dampers with time-variant coefficients, which are estimated as augmented states. The estimation procedure is followed by an a posteriori statistical analysis to reconstruct noise-free restoring force characteristics using the estimated states and their estimated variances. Observability is provided using only one measured mechanical quantity per degree of freedom, which makes this approach less demanding in the number of necessary measurement signals compared with truly nonparametric solutions, which typically require displacement, velocity and acceleration signals. Additionally, due to the statistical rigour of the procedure, it successfully addresses signals corrupted by significant measurement noise. In the present paper, the method is described in detail, which is followed by numerical examples of one degree of freedom (1DoF) and 2DoF mechanical systems with strong nonlinearities of vibro-impact type to demonstrate the effectiveness of the proposed technique.
Exponential series approaches for nonparametric graphical models
NASA Astrophysics Data System (ADS)
Janofsky, Eric
Markov Random Fields (MRFs) or undirected graphical models are parsimonious representations of joint probability distributions. This thesis studies high-dimensional, continuous-valued pairwise Markov Random Fields. We are particularly interested in approximating pairwise densities whose logarithm belongs to a Sobolev space. For this problem we propose the method of exponential series which approximates the log density by a finite-dimensional exponential family with the number of sufficient statistics increasing with the sample size. We consider two approaches to estimating these models. The first is regularized maximum likelihood. This involves optimizing the sum of the log-likelihood of the data and a sparsity-inducing regularizer. We then propose a variational approximation to the likelihood based on tree-reweighted, nonparametric message passing. This approximation allows for upper bounds on risk estimates, leverages parallelization and is scalable to densities on hundreds of nodes. We show how the regularized variational MLE may be estimated using a proximal gradient algorithm. We then consider estimation using regularized score matching. This approach uses an alternative scoring rule to the log-likelihood, which obviates the need to compute the normalizing constant of the distribution. For general continuous-valued exponential families, we provide parameter and edge consistency results. As a special case we detail a new approach to sparse precision matrix estimation which has statistical performance competitive with the graphical lasso and computational performance competitive with the state-of-the-art glasso algorithm. We then describe results for model selection in the nonparametric pairwise model using exponential series. The regularized score matching problem is shown to be a convex program; we provide scalable algorithms based on consensus alternating direction method of multipliers (ADMM) and coordinate-wise descent. 
We use simulations to compare our method to others in the literature as well as the aforementioned TRW estimator.
An appraisal of statistical procedures used in derivation of reference intervals.
Ichihara, Kiyoshi; Boyd, James C
2010-11-01
When conducting studies to derive reference intervals (RIs), various statistical procedures are commonly applied at each step, from the planning stages to final computation of RIs. Determination of the necessary sample size is an important consideration, and evaluation of at least 400 individuals in each subgroup has been recommended to establish reliable common RIs in multicenter studies. Multiple regression analysis allows identification of the most important factors contributing to variation in test results, while accounting for possible confounding relationships among these factors. Of the various approaches proposed for judging the necessity of partitioning reference values, nested analysis of variance (ANOVA) is the likely method of choice owing to its ability to handle multiple groups and to adjust for multiple factors. The Box-Cox power transformation has often been used to transform data to a Gaussian distribution for parametric computation of RIs. However, this transformation occasionally fails. Therefore, the non-parametric method, based on determination of the 2.5th and 97.5th percentiles after sorting the data, has been recommended for general use. The performance of the Box-Cox transformation can be improved by introducing an additional parameter representing the origin of transformation. In simulations, the confidence intervals (CIs) of reference limits (RLs) calculated by the parametric method were narrower than those calculated by the non-parametric approach. However, the margin of difference was rather small owing to additional variability in parametrically-determined RLs introduced by estimation of parameters for the Box-Cox transformation. The parametric calculation method may have an advantage over the non-parametric method in allowing identification and exclusion of extreme values during RI computation.
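The two RI computations contrasted above can be sketched side by side. The skewed sample and the 1.96 z-multiplier for the parametric limits are illustrative assumptions for demonstration, not the paper's data or its origin-shifted transformation:

```python
# Sketch of the two reference-interval computations the abstract contrasts.
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(1)
x = rng.lognormal(mean=1.0, sigma=0.4, size=400)   # skewed "analyte" values

# Non-parametric RI: 2.5th and 97.5th percentiles of the sorted data
lo_np, hi_np = np.percentile(x, [2.5, 97.5])

# Parametric RI: Box-Cox transform toward Gaussian, mean +/- 1.96 SD,
# then back-transform. (A shift/origin parameter could be applied to x first.)
y, lam = stats.boxcox(x)
m, s = y.mean(), y.std(ddof=1)
lo_p = inv_boxcox(m - 1.96 * s, lam)
hi_p = inv_boxcox(m + 1.96 * s, lam)
```

With well-behaved data the two intervals should roughly agree; the abstract's point is that the parametric route additionally propagates uncertainty from estimating the Box-Cox parameter.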
Korany, Mohamed A; Maher, Hadir M; Galal, Shereen M; Ragab, Marwa A A
2013-05-01
This manuscript discusses and compares three statistical regression methods for handling data: parametric, nonparametric, and weighted regression (WR). These data were obtained from different chemometric methods applied to high-performance liquid chromatography response data using the internal standard method. This was performed on the model drug Acyclovir, which was analyzed in human plasma with ganciclovir as internal standard. An in vivo study was also performed. Derivative treatment of chromatographic response ratio data was followed by convolution of the resulting derivative curves using 8-point sin xi polynomials (discrete Fourier functions). This work studies and compares the WR method and Theil's method, a nonparametric regression (NPR) method, with the least-squares parametric regression (LSPR) method, which is considered the de facto standard for regression. When the assumption of homoscedasticity is not met for analytical data, a simple and effective way to counteract the great influence of the high concentrations on the fitted regression line is to use the WR method. WR was found to be superior to LSPR as the former assumes that the y-direction error in the calibration curve will increase as x increases. Theil's NPR method was also found to be superior to LSPR as the former assumes that errors could occur in both x- and y-directions and that they might not be normally distributed. Most of the results showed a significant improvement in precision and accuracy when applying the WR and NPR methods relative to LSPR.
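The three regression approaches compared above can be sketched on simulated heteroscedastic calibration-style data. This is not the paper's HPLC data, and the 1/x² weighting is one common WR choice rather than necessarily the authors':

```python
# Sketch contrasting LSPR, Theil's NPR, and WR on data whose error grows with x.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x = np.linspace(1, 10, 30)
y = 2.0 * x + 1.0 + rng.normal(0, 0.2 * x)       # heteroscedastic noise

# LSPR: ordinary least squares
ols = stats.linregress(x, y)

# NPR: Theil's method -- median of pairwise slopes, with a slope CI
slope, intercept, lo, hi = stats.theilslopes(y, x)

# WR: weighted least squares, down-weighting the noisy high concentrations
w = 1.0 / x**2
wls = np.polyfit(x, y, deg=1, w=np.sqrt(w))      # wls[0] is the slope
```

All three slopes should land near the true value of 2; the differences between the methods show up most clearly when outliers or strong heteroscedasticity are present.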
Davis, J.C.
2000-01-01
Geologists may feel that geological data are not amenable to statistical analysis, or at best require specialized approaches such as nonparametric statistics and geostatistics. However, there are many circumstances, particularly in systematic studies conducted for environmental or regulatory purposes, where traditional parametric statistical procedures can be beneficial. An example is the application of analysis of variance to data collected in an annual program of measuring groundwater levels in Kansas. Influences such as well conditions, operator effects, and use of the water can be assessed and wells that yield less reliable measurements can be identified. Such statistical studies have resulted in yearly improvements in the quality and reliability of the collected hydrologic data. Similar benefits may be achieved in other geological studies by the appropriate use of classical statistical tools.
Analyzing Single-Molecule Time Series via Nonparametric Bayesian Inference
Hines, Keegan E.; Bankston, John R.; Aldrich, Richard W.
2015-01-01
The ability to measure the properties of proteins at the single-molecule level offers an unparalleled glimpse into biological systems at the molecular scale. The interpretation of single-molecule time series has often been rooted in statistical mechanics and the theory of Markov processes. While existing analysis methods have been useful, they are not without significant limitations including problems of model selection and parameter nonidentifiability. To address these challenges, we introduce the use of nonparametric Bayesian inference for the analysis of single-molecule time series. These methods provide a flexible way to extract structure from data instead of assuming models beforehand. We demonstrate these methods with applications to several diverse settings in single-molecule biophysics. This approach provides a well-constrained and rigorously grounded method for determining the number of biophysical states underlying single-molecule data. PMID:25650922
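The abstract does not name a specific construction, but Dirichlet-process mixtures are one standard nonparametric Bayesian tool for letting the number of hidden states be inferred from data rather than fixed in advance. A minimal sketch of the stick-breaking construction behind such models, with an illustrative concentration parameter:

```python
# Sketch: stick-breaking weights of a Dirichlet process. With a finite
# truncation, only a handful of weights are non-negligible, which is how
# such models effectively select the number of states.
import numpy as np

rng = np.random.default_rng(6)

def stick_breaking(alpha, n_atoms):
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas)[:-1]))
    return betas * remaining           # mixture weights, summing toward 1

w = stick_breaking(alpha=2.0, n_atoms=500)
```

Larger `alpha` spreads mass over more components; smaller `alpha` concentrates it on a few, mirroring how the posterior concentrates on a small number of biophysical states.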
Nonparametric estimation of benchmark doses in environmental risk assessment
Piegorsch, Walter W.; Xiong, Hui; Bhattacharya, Rabi N.; Lin, Lizhen
2013-01-01
An important statistical objective in environmental risk analysis is estimation of minimum exposure levels, called benchmark doses (BMDs), that induce a pre-specified benchmark response in a dose-response experiment. In such settings, representations of the risk are traditionally based on a parametric dose-response model. It is a well-known concern, however, that if the chosen parametric form is misspecified, inaccurate and possibly unsafe low-dose inferences can result. We apply a nonparametric approach for calculating benchmark doses, based on an isotonic regression method for dose-response estimation with quantal-response data (Bhattacharya and Kong, 2007). We determine the large-sample properties of the estimator, develop bootstrap-based confidence limits on the BMDs, and explore the confidence limits’ small-sample properties via a short simulation study. An example from cancer risk assessment illustrates the calculations. PMID:23914133
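The isotonic-regression route to a benchmark dose can be sketched as follows. The quantal-response counts are invented, the 0.10 benchmark response on the extra-risk scale is a common convention rather than necessarily the paper's choice, scikit-learn's isotonic fit stands in for the authors' estimator, and the bootstrap confidence limits are omitted:

```python
# Sketch: monotone (isotonic) dose-response fit and a benchmark-dose read-off.
import numpy as np
from sklearn.isotonic import IsotonicRegression

dose   = np.array([0.0, 0.5, 1.0, 2.0, 4.0, 8.0])
n      = np.array([50, 50, 50, 50, 50, 50])       # animals per dose group
events = np.array([2, 3, 6, 10, 21, 34])          # adverse outcomes observed

# Nonparametric, shape-constrained risk estimate: nondecreasing in dose
risk = IsotonicRegression(increasing=True).fit_transform(dose, events / n)

bmr = 0.10                                  # benchmark response (extra risk)
target = risk[0] + bmr * (1 - risk[0])      # extra-risk scale above background
bmd = np.interp(target, risk, dose)         # smallest dose reaching the target
```

Because the fit is only constrained to be monotone, no parametric dose-response shape is imposed, which is exactly the misspecification concern the abstract raises.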
Nonparametric spirometry reference values for Hispanic Americans.
Glenn, Nancy L; Brown, Vanessa M
2011-02-01
Recent literature cites ethnic origin as a major factor in developing pulmonary function reference values. Extensive studies established reference values for European and African Americans, but not for Hispanic Americans. The Third National Health and Nutrition Examination Survey defines Hispanic as individuals of Spanish-speaking cultures. While no group was excluded from the target population, sample size requirements only allowed inclusion of individuals who identified themselves as Mexican Americans. This research constructs nonparametric reference value confidence intervals for Hispanic American pulmonary function. The method is applicable to all ethnicities. We use empirical likelihood confidence intervals to establish normal ranges for reference values. Its major advantage: it is model-free, but shares asymptotic properties of model-based methods. Statistical comparisons indicate that empirical likelihood interval lengths are comparable to normal theory intervals. Power and efficiency studies agree with previously published theoretical results.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cavanaugh, J.E.; McQuarrie, A.D.; Shumway, R.H.
Conventional methods for discriminating between earthquakes and explosions at regional distances have concentrated on extracting specific features such as amplitude and spectral ratios from the waveforms of the P and S phases. We consider here an optimum nonparametric classification procedure derived from the classical approach to discriminating between two Gaussian processes with unequal spectra. Two robust variations based on the minimum discrimination information statistic and Renyi's entropy are also considered. We compare the optimum classification procedure with various amplitude and spectral ratio discriminants and show that its performance is superior when applied to a small population of 8 land-based earthquakes and 8 mining explosions recorded in Scandinavia. Several parametric characterizations of the notion of complexity based on modeling earthquakes and explosions as autoregressive or modulated autoregressive processes are also proposed and their performance compared with the nonparametric and feature extraction approaches.
Variability in clubhead presentation characteristics and ball impact location for golfers' drives.
Betzler, Nils F; Monk, Stuart A; Wallace, Eric S; Otto, Steve R
2012-01-01
The purpose of the present study was to analyse the variability in clubhead presentation to the ball and the resulting ball impact location on the club face for a range of golfers of different ability. A total of 285 male and female participants hit multiple shots using one of four proprietary drivers. Self-reported handicap was used to quantify a participant's golfing ability. A bespoke motion capture system and user-written algorithms were used to track the clubhead just before and at impact, measuring clubhead speed, clubhead orientation, and impact location. A Doppler radar was used to measure golf ball speed. Generally, golfers of higher skill (lower handicap) generated increased clubhead speed and increased efficiency (ratio of ball speed to clubhead speed). Non-parametric statistical tests showed that low-handicap golfers exhibit significantly lower variability from shot to shot in clubhead speed, efficiency, impact location, attack angle, club path, and face angle compared with high-handicap golfers.
Nursing advocacy in procedural pain care.
Vaartio, Heli; Leino-Kilpi, Helena; Suominen, Tarja; Puukka, Pauli
2009-05-01
In nursing, the concept of advocacy is often understood in terms of reactive or proactive action aimed at protecting patients' legal or moral rights. However, advocacy activities have not often been researched in the context of everyday clinical nursing practice, at least from patients' point of view. This study investigated the implementation of nursing advocacy in the context of procedural pain care from the perspectives of both patients and nurses. The cross-sectional study was conducted on a cluster sample of surgical otolaryngology patients (n = 405) and nurses (n = 118) from 12 hospital units in Finland. The data were obtained using an instrument specially designed for this purpose, and analysed statistically by descriptive and non-parametric methods. According to the results, patients and nurses have slightly different views about which dimensions of advocacy are implemented in procedural pain care. It seems that advocacy acts are chosen and implemented rather haphazardly, depending partly on how active patients are in expressing their wishes and interests and partly on nurses' empowerment.
BLIND EXTRACTION OF AN EXOPLANETARY SPECTRUM THROUGH INDEPENDENT COMPONENT ANALYSIS
Waldmann, I. P.; Tinetti, G.; Hollis, M. D. J.
2013-03-20
Blind-source separation techniques are used to extract the transmission spectrum of the hot-Jupiter HD189733b recorded by the Hubble/NICMOS instrument. Such a 'blind' analysis of the data is based on the concept of independent component analysis. The detrending of Hubble/NICMOS data using the sole assumption that non-Gaussian systematic noise is statistically independent from the desired light-curve signals is presented. By not assuming any prior or auxiliary information but the data themselves, it is shown that spectroscopic errors only about 10%-30% larger than parametric methods can be obtained for 11 spectral bins with bin sizes of ∼0.09 μm. This represents a reasonable trade-off between a higher degree of objectivity for the non-parametric methods and smaller standard errors for the parametric detrending. Results are discussed in light of previous analyses published in the literature. The fact that three very different analysis techniques yield comparable spectra is a strong indication of the stability of these results.
Multiple Hypothesis Testing for Experimental Gingivitis Based on Wilcoxon Signed Rank Statistics
Preisser, John S.; Sen, Pranab K.; Offenbacher, Steven
2011-01-01
Dental research often involves repeated multivariate outcomes on a small number of subjects for which there is interest in identifying outcomes that exhibit change in their levels over time as well as to characterize the nature of that change. In particular, periodontal research often involves the analysis of molecular mediators of inflammation for which multivariate parametric methods are highly sensitive to outliers and deviations from Gaussian assumptions. In such settings, nonparametric methods may be favored over parametric ones. Additionally, there is a need for statistical methods that control an overall error rate for multiple hypothesis testing. We review univariate and multivariate nonparametric hypothesis tests and apply them to longitudinal data to assess changes over time in 31 biomarkers measured from the gingival crevicular fluid in 22 subjects whereby gingivitis was induced by temporarily withholding tooth brushing. To identify biomarkers that can be induced to change, multivariate Wilcoxon signed rank tests for a set of four summary measures based upon area under the curve are applied for each biomarker and compared to their univariate counterparts. Multiple hypothesis testing methods with choice of control of the false discovery rate or strong control of the family-wise error rate are examined. PMID:21984957
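The per-biomarker signed rank tests with false-discovery-rate control described above can be sketched on simulated paired data matching the abstract's dimensions (22 subjects, 31 biomarkers). The area-under-the-curve summary measures and the multivariate tests are not reproduced; the effect sizes and which biomarkers truly change are assumptions:

```python
# Sketch: Wilcoxon signed rank test per biomarker, then Benjamini-Hochberg
# step-up control of the false discovery rate across the 31 tests.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_subj, n_markers = 22, 31
baseline = rng.normal(0, 1, (n_subj, n_markers))
followup = baseline + rng.normal(0, 1, (n_subj, n_markers))
followup[:, :5] += 1.5                    # assume the first 5 markers change

pvals = np.array([stats.wilcoxon(followup[:, j], baseline[:, j]).pvalue
                  for j in range(n_markers)])

# Benjamini-Hochberg step-up at FDR level q
q = 0.05
order = np.argsort(pvals)
ranked = pvals[order] * n_markers / np.arange(1, n_markers + 1)
passed = ranked <= q
k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
rejected = np.zeros(n_markers, bool)
rejected[order[:k]] = True                # markers flagged as changed
```

Swapping the Benjamini-Hochberg step for a Holm or Bonferroni correction would give the strong family-wise error control the abstract also examines.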
Ruiz-Sanchez, Eduardo
2015-12-01
The Neotropical woody bamboo genus Otatea is one of five genera in the subtribe Guaduinae. Of the eight described Otatea species, seven are endemic to Mexico and one is also distributed in Central and South America. Otatea acuminata has the widest geographical distribution of the eight species, and two of its recently collected populations do not match the known species morphologically. Parametric and non-parametric methods were used to delimit the species in Otatea using five chloroplast markers, one nuclear marker, and morphological characters. The parametric coalescent method and the non-parametric analysis supported the recognition of two distinct evolutionary lineages. Molecular clock analyses placed the origin of the speciation events in Otatea between the Late Miocene and the Late Pleistocene. The species delimitation analyses (parametric and non-parametric) identified that the two populations of O. acuminata from Chiapas and Hidalgo are from two separate evolutionary lineages, and these new species have morphological characters that separate them from O. acuminata s.s. The geological activity of the Trans-Mexican Volcanic Belt and the Isthmus of Tehuantepec may have isolated populations and limited the gene flow between Otatea species, driving speciation. Based on the results found here, I describe Otatea rzedowskiorum and Otatea victoriae as two new species, morphologically different from O. acuminata. Copyright © 2015 Elsevier Inc. All rights reserved.
Mura, Maria Chiara; De Felice, Marco; Morlino, Roberta; Fuselli, Sergio
2010-01-01
In step with the need to develop statistical procedures to manage small-size environmental samples, in this work we have used concentration values of benzene (C6H6), concurrently detected by seven outdoor and indoor monitoring stations over 12 000 minutes, in order to assess the representativeness of collected data and the impact of the pollutant on indoor environment. Clearly, the former issue is strictly connected to sampling-site geometry, which proves critical to correctly retrieving information from analysis of pollutants of sanitary interest. Therefore, according to current criteria for network-planning, single stations have been interpreted as nodes of a set of adjoining triangles; then, a) node pairs have been taken into account in order to estimate pollutant stationarity on triangle sides, as well as b) node triplets, to statistically associate data from air-monitoring with the corresponding territory area, and c) node sextuplets, to assess the impact probability of the outdoor pollutant on indoor environment for each area. Distributions from the various node combinations are all non-Gaussian; consequently, the Kruskal-Wallis (KW) non-parametric statistic has been exploited to test variability on the continuous density function from each pair, triplet and sextuplet. Results from the above-mentioned statistical analysis have shown randomness of site selection, which has not allowed a reliable generalization of monitoring data to the entire selected territory, except for a single "forced" case (70%); most importantly, they suggest a possible procedure to optimize network design.
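The Kruskal-Wallis comparison used above can be sketched on synthetic benzene-like concentrations; the station groupings, sample sizes, and lognormal distributions are illustrative assumptions, not the study's monitoring data:

```python
# Sketch: Kruskal-Wallis test for a distributional difference across groups,
# appropriate here because the concentration data are non-Gaussian.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
station_a = rng.lognormal(0.0, 0.5, 200)
station_b = rng.lognormal(0.0, 0.5, 200)
station_c = rng.lognormal(0.6, 0.5, 200)   # assumed shifted distribution

h, p = stats.kruskal(station_a, station_b, station_c)
# a small p-value indicates at least one station's distribution differs
```

Because the test ranks the pooled observations, it needs no normality assumption, which is the reason the abstract gives for choosing it.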
Gender Wage Disparities among the Highly Educated
ERIC Educational Resources Information Center
Black, Dan A.; Haviland, Amelia M.; Sanders, Seth G.; Taylor, Lowell J.
2008-01-01
We examine gender wage disparities for four groups of college-educated women--black, Hispanic, Asian, and non-Hispanic white--using the National Survey of College Graduates. Raw log wage gaps, relative to non-Hispanic white male counterparts, generally exceed -0.30. Estimated gaps decline to between -0.08 and -0.19 in nonparametric analyses that…
Granato, Gregory E.
2006-01-01
The Kendall-Theil Robust Line software (KTRLine-version 1.0) is a Visual Basic program that may be used with the Microsoft Windows operating system to calculate parameters for robust, nonparametric estimates of linear-regression coefficients between two continuous variables. The KTRLine software was developed by the U.S. Geological Survey, in cooperation with the Federal Highway Administration, for use in stochastic data modeling with local, regional, and national hydrologic data sets to develop planning-level estimates of potential effects of highway runoff on the quality of receiving waters. The Kendall-Theil robust line was selected because this robust nonparametric method is resistant to the effects of outliers and nonnormality in residuals that commonly characterize hydrologic data sets. The slope of the line is calculated as the median of all possible pairwise slopes between points. The intercept is calculated so that the line will run through the median of input data. A single-line model or a multisegment model may be specified. The program was developed to provide regression equations with an error component for stochastic data generation because nonparametric multisegment regression tools are not available with the software that is commonly used to develop regression models. The Kendall-Theil robust line is a median line and, therefore, may underestimate total mass, volume, or loads unless the error component or a bias correction factor is incorporated into the estimate. Regression statistics such as the median error, the median absolute deviation, the prediction error sum of squares, the root mean square error, the confidence interval for the slope, and the bias correction factor for median estimates are calculated by use of nonparametric methods. These statistics, however, may be used to formulate estimates of mass, volume, or total loads. 
The program is used to read a two- or three-column tab-delimited input file with variable names in the first row and data in subsequent rows. The user may choose the columns that contain the independent (X) and dependent (Y) variable. A third column, if present, may contain metadata such as the sample-collection location and date. The program screens the input files and plots the data. The KTRLine software is a graphical tool that facilitates development of regression models by use of graphs of the regression line with data, the regression residuals (with X or Y), and percentile plots of the cumulative frequency of the X variable, Y variable, and the regression residuals. The user may individually transform the independent and dependent variables to reduce heteroscedasticity and to linearize data. The program plots the data and the regression line. The program also prints model specifications and regression statistics to the screen. The user may save and print the regression results. The program can accept data sets that contain up to about 15,000 XY data points, but because the program must sort the array of all pairwise slopes, the program may be perceptibly slow with data sets that contain more than about 1,000 points.
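The Kendall-Theil line as the text defines it (slope as the median of all pairwise slopes, intercept forcing the line through the medians of the data) can be sketched in a few lines of NumPy. This is an illustrative reimplementation, not the KTRLine program itself, and it covers only the single-line model without the error component or bias correction factor:

```python
# Sketch of the Kendall-Theil robust line: median pairwise slope, line
# through the medians of X and Y. Resistant to the outlier in the last point.
import numpy as np

def kendall_theil_line(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    i, j = np.triu_indices(len(x), k=1)     # all pairs of points
    dx = x[j] - x[i]
    keep = dx != 0                          # skip vertical pairs
    slopes = (y[j] - y[i])[keep] / dx[keep]
    slope = np.median(slopes)
    intercept = np.median(y) - slope * np.median(x)
    return slope, intercept

x = np.array([1., 2., 3., 4., 5., 6.])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1, 30.0])   # one gross outlier
m, b = kendall_theil_line(x, y)                   # slope stays near 2
```

The O(n²) pairwise-slope array is why, as the text notes, the program slows perceptibly beyond about 1,000 points.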
Genetic Algorithm Based Framework for Automation of Stochastic Modeling of Multi-Season Streamflows
NASA Astrophysics Data System (ADS)
Srivastav, R. K.; Srinivasan, K.; Sudheer, K.
2009-05-01
Synthetic streamflow data generation involves the synthesis of likely streamflow patterns that are statistically indistinguishable from the observed streamflow data. The kinds of stochastic models adopted for multi-season streamflow generation in hydrology are: i) parametric models, which hypothesize the form of the periodic dependence structure and the distributional form a priori (examples are PAR and PARMA), together with disaggregation models that aim to preserve the correlation structure at the periodic level and the aggregated annual level; ii) nonparametric models (examples are bootstrap/kernel-based methods such as k-nearest neighbor (k-NN) resampling, the matched block bootstrap (MABB), and non-parametric disaggregation models), which characterize the laws of chance describing the streamflow process without recourse to prior assumptions as to the form or structure of these laws; and iii) hybrid models, which blend parametric and non-parametric models advantageously to model streamflows effectively. Despite the many developments that have taken place in the field of stochastic modeling of streamflows over the last four decades, accurate prediction of the storage and the critical drought characteristics has posed a persistent challenge to the stochastic modeler. This is partly because the stochastic streamflow model parameters are usually estimated by minimizing a statistically based objective function (such as maximum likelihood (MLE) or least squares (LS) estimation), and the efficacy of the models is subsequently validated based on the accuracy of prediction of the estimates of the water-use characteristics, which requires a large number of trial simulations and inspection of many plots and tables. Even then, accurate prediction of the storage and the critical drought characteristics may not be ensured.
In this study a multi-objective optimization framework is proposed to find the optimal hybrid model (blend of a simple parametric model, PAR(1) model and matched block bootstrap (MABB) ) based on the explicit objective functions of minimizing the relative bias and relative root mean square error in estimating the storage capacity of the reservoir. The optimal parameter set of the hybrid model is obtained based on the search over a multi- dimensional parameter space (involving simultaneous exploration of the parametric (PAR(1)) as well as the non-parametric (MABB) components). This is achieved using the efficient evolutionary search based optimization tool namely, non-dominated sorting genetic algorithm - II (NSGA-II). This approach helps in reducing the drudgery involved in the process of manual selection of the hybrid model, in addition to predicting the basic summary statistics dependence structure, marginal distribution and water-use characteristics accurately. The proposed optimization framework is used to model the multi-season streamflows of River Beaver and River Weber of USA. In case of both the rivers, the proposed GA-based hybrid model yields a much better prediction of the storage capacity (where simultaneous exploration of both parametric and non-parametric components is done) when compared with the MLE-based hybrid models (where the hybrid model selection is done in two stages, thus probably resulting in a sub-optimal model). This framework can be further extended to include different linear/non-linear hybrid stochastic models at other temporal and spatial scales as well.
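The nonparametric ingredient of the hybrid model, block-bootstrap resampling of the flow series, can be sketched as follows. This shows only the generic moving-block idea; the matching step of MABB, the PAR(1) blend, and the NSGA-II search are not reproduced, and the series and block length are illustrative:

```python
# Sketch: block-bootstrap resampling of a seasonal flow-like series.
# Resampling contiguous blocks preserves short-range dependence that
# independent resampling of single values would destroy.
import numpy as np

rng = np.random.default_rng(5)
t = np.linspace(0, 8 * np.pi, 480)
flows = 100 + 30 * np.sin(t) + rng.normal(0, 5, 480)   # synthetic flows

def block_bootstrap(series, block_len, n_out):
    """Concatenate randomly chosen contiguous blocks into a new series."""
    starts = rng.integers(0, len(series) - block_len + 1,
                          size=int(np.ceil(n_out / block_len)))
    blocks = [series[s:s + block_len] for s in starts]
    return np.concatenate(blocks)[:n_out]

synthetic = block_bootstrap(flows, block_len=12, n_out=480)
```

The block length trades off dependence preservation against variety in the generated sequences; in the paper's framework such choices are what the genetic algorithm searches over.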
Crainiceanu, Ciprian M.; Caffo, Brian S.; Di, Chong-Zhi; Punjabi, Naresh M.
2009-01-01
We introduce methods for signal and associated variability estimation based on hierarchical nonparametric smoothing with application to the Sleep Heart Health Study (SHHS). SHHS is the largest electroencephalographic (EEG) collection of sleep-related data, which contains, at each visit, two quasi-continuous EEG signals for each subject. The signal features extracted from EEG data are then used in second level analyses to investigate the relation between health, behavioral, or biometric outcomes and sleep. Using subject specific signals estimated with known variability in a second level regression becomes a nonstandard measurement error problem. We propose and implement methods that take into account cross-sectional and longitudinal measurement error. The research presented here forms the basis for EEG signal processing for the SHHS. PMID:20057925
Estimating and comparing microbial diversity in the presence of sequencing errors
Chiu, Chun-Huo
2016-01-01
Estimating and comparing microbial diversity are statistically challenging due to limited sampling and possible sequencing errors for low-frequency counts, producing spurious singletons. The inflated singleton count seriously affects statistical analysis and inferences about microbial diversity. Previous statistical approaches to tackle the sequencing errors generally require different parametric assumptions about the sampling model or about the functional form of frequency counts. Different parametric assumptions may lead to drastically different diversity estimates. We focus on nonparametric methods which are universally valid for all parametric assumptions and can be used to compare diversity across communities. We develop here a nonparametric estimator of the true singleton count to replace the spurious singleton count in all methods/approaches. Our estimator of the true singleton count is in terms of the frequency counts of doubletons, tripletons and quadrupletons, provided these three frequency counts are reliable. To quantify microbial alpha diversity for an individual community, we adopt the measure of Hill numbers (effective number of taxa) under a nonparametric framework. Hill numbers, parameterized by an order q that determines the measures’ emphasis on rare or common species, include taxa richness (q = 0), Shannon diversity (q = 1, the exponential of Shannon entropy), and Simpson diversity (q = 2, the inverse of Simpson index). A diversity profile which depicts the Hill number as a function of order q conveys all information contained in a taxa abundance distribution. Based on the estimated singleton count and the original non-singleton frequency counts, two statistical approaches (non-asymptotic and asymptotic) are developed to compare microbial diversity for multiple communities. (1) A non-asymptotic approach refers to the comparison of estimated diversities of standardized samples with a common finite sample size or sample completeness. 
This approach aims to compare diversity estimates for equally-large or equally-complete samples; it is based on the seamless rarefaction and extrapolation sampling curves of Hill numbers, specifically for q = 0, 1 and 2. (2) An asymptotic approach refers to the comparison of the estimated asymptotic diversity profiles. That is, this approach compares the estimated profiles for complete samples or samples whose size tends to be sufficiently large. It is based on statistical estimation of the true Hill number of any order q ≥ 0. In the two approaches, replacing the spurious singleton count by our estimated count, we can greatly remove the positive biases associated with diversity estimates due to spurious singletons and also make fair comparisons across microbial communities, as illustrated in our simulation results and in applying our method to analyze sequencing data from viral metagenomes. PMID:26855872
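The Hill numbers underlying both approaches above can be computed directly from abundance counts. This sketch uses the plug-in (empirical) estimator on toy counts; it does not include the singleton correction or the rarefaction/extrapolation machinery the abstract develops:

```python
# Sketch: Hill numbers (effective number of taxa) of order q from counts.
import numpy as np

def hill_number(counts, q):
    p = np.asarray(counts, float)
    p = p[p > 0] / p.sum()
    if q == 0:
        return float(len(p))                          # taxa richness
    if q == 1:
        return float(np.exp(-(p * np.log(p)).sum()))  # exp(Shannon entropy)
    return float((p ** q).sum() ** (1.0 / (1.0 - q))) # e.g. q=2: inverse Simpson

counts = [50, 30, 10, 5, 3, 1, 1]                     # toy taxa abundances
d0, d1, d2 = (hill_number(counts, q) for q in (0, 1, 2))
# d0 >= d1 >= d2: higher q down-weights rare taxa
```

Plotting `hill_number` against a grid of q values gives the diversity profile the abstract describes; spurious singletons inflate the low-q end of that profile, which is why the authors replace the singleton count with an estimate.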
Barish, Syndi; Ochs, Michael F.; Sontag, Eduardo D.; Gevertz, Jana L.
2017-01-01
Cancer is a highly heterogeneous disease, exhibiting spatial and temporal variations that pose challenges for designing robust therapies. Here, we propose the VEPART (Virtual Expansion of Populations for Analyzing Robustness of Therapies) technique as a platform that integrates experimental data, mathematical modeling, and statistical analyses for identifying robust optimal treatment protocols. VEPART begins with time course experimental data for a sample population, and a mathematical model fit to aggregate data from that sample population. Using nonparametric statistics, the sample population is amplified and used to create a large number of virtual populations. At the final step of VEPART, robustness is assessed by identifying and analyzing the optimal therapy (perhaps restricted to a set of clinically realizable protocols) across each virtual population. As proof of concept, we have applied the VEPART method to study the robustness of treatment response in a mouse model of melanoma subject to treatment with immunostimulatory oncolytic viruses and dendritic cell vaccines. Our analysis (i) showed that every scheduling variant of the experimentally used treatment protocol is fragile (nonrobust) and (ii) discovered an alternative region of dosing space (lower oncolytic virus dose, higher dendritic cell dose) for which a robust optimal protocol exists. PMID:28716945
Malliou, P; Rokka, S; Beneka, A; Gioftsidou, A; Mavromoustakos, S; Godolias, G
2014-01-01
There is limited information on injury patterns in Step Aerobic Instructors (SAI) who exclusively teach "step" aerobic classes. To record the type and the anatomical position, in relation to diagnosis, of musculoskeletal injuries in step aerobic instructors; also, to analyse the days of absence due to chronic injury in relation to weekly working hours, height of the step platform, working experience, and working surface and footwear during the step class. The Step Aerobic Instructors Injuries Questionnaire was developed, and then validity and reliability indices were calculated. 63 SAI completed the questionnaire. For the statistical analysis of the data, the method used was the analysis of frequencies, the non-parametric test χ
Cervical shaping in curved root canals: comparison of the efficiency of two endodontic instruments.
Busquim, Sandra Soares Kühne; dos Santos, Marcelo
2002-01-01
The aim of this study was to determine the removal of dentin produced by number 25 (0.08) Flare files (Quantec Flare Series, Analytic Endodontics, Glendora, California, USA) and numbers 1 and 2 Gates-Glidden burs (Dentsply - Maillefer, Ballaigues, Switzerland), in the mesio-buccal and mesio-lingual root canals, respectively, of extracted human permanent inferior molars, by means of measuring the width of dentinal walls prior to and after instrumentation. The obtained values were compared. Due to the multiple analyses of data, a nonparametric test was used, and the Kruskal-Wallis test was chosen. There was no significant difference between the instruments as to the removal of dentin in the 1st and 2nd millimeters. However, when comparing the performances of the instruments in the 3rd millimeter, Flare files promoted a greater removal than Gates-Glidden drills (p < 0.05). The analysis revealed no significant differences as to mesial wear, which demonstrates the similar behavior of both instruments. Gates-Glidden drills produced an expressive mesial detour in the 2nd and 3rd millimeters, which was detected through a statistically significant difference in the wear of this region (p < 0.05). There was no statistically significant difference between mesial and lateral wear when Flare instruments were employed.
Daylight exposure and the other predictors of burnout among nurses in a University Hospital.
Alimoglu, Mustafa Kemal; Donmez, Levent
2005-07-01
The purpose of the study was to investigate whether daylight exposure in the work setting could be placed among the predictors of job burnout. The sample was composed of 141 nurses working in Akdeniz University Hospital in Antalya, Turkey. All participants were asked to complete a personal data collection form, the Maslach Burnout Inventory, the Work Related Strain Inventory and the Work Satisfaction Questionnaire to collect data about their burnout, work-related stress (WRS) and job satisfaction (JS) levels, in addition to personal characteristics. Descriptive statistics, parametric and non-parametric tests and correlation analysis were used in the statistical analyses. Daylight exposure showed no direct effect on burnout, but it was indirectly effective via WRS and JS. Exposure to daylight for at least 3 h a day was associated with less stress and higher satisfaction at work. Suffering from sleep disorders, younger age, job-related health problems and educational level were found to have total or partial direct effects on burnout. Night shifts may lead to burnout via work-related strain, and working in inpatient services and dissatisfaction with annual income may be effective via job dissatisfaction. This study confirmed some established predictors of burnout and provided data on an unexplored area. Daylight exposure may influence job burnout.
Erol, Ozgul; Can, Gulbeyaz; Aydıner, Adnan
2012-10-01
The aim of this study was to determine the effects of chemotherapy-related alopecia on the body image and quality of life of Turkish women with cancer, with or without headscarves, and the factors affecting them. This descriptive study was conducted with 204 women who received chemotherapy at the Istanbul University Institute of Oncology, Turkey. The Patient Description Form, Body Image Scale and Nightingale Symptom Assessment Scale were used in data collection. Statistical analyses were performed using descriptive statistics and non-parametric tests. Logistic regression analysis was done to identify the factors affecting body image and quality of life of the patients. No difference was found between women wearing headscarves and those who did not with respect to body image. However, women wearing headscarves who had no alopecia felt less dissatisfied with their scars, and women not wearing headscarves who had no alopecia felt less self-conscious and less dissatisfied with their appearance. There was a difference in quality of life: women wearing headscarves had worse physical, psychological and general well-being than the others. Although there were many important factors, multivariate analysis showed that having alopecia and wearing headscarves had considerable effects on body image, and having alopecia had a considerable effect on quality of life.
Impact of tamsulosin and nifedipine on contractility of pregnant rat ureters in vitro.
Haddad, Lisette; Corriveau, Stéphanie; Rousseau, Eric; Blouin, Simon; Pasquier, Jean-Charles; Ponsot, Yves; Roy-Lacroix, Marie-Ève
2018-01-01
To evaluate the in vitro effect of tamsulosin and nifedipine on the contractility of pregnant rat ureters and to perform quantitative analysis of the pharmacological effects. Medical expulsive therapy (MET) is commonly used to treat urolithiasis. However, this treatment is seldom used in pregnant women since no studies support this practice. This was an in vitro study on animal tissue derived from pregnant Sprague-Dawley rats. A total of 124 ureteral segments were mounted in an organ bath system and the contractile response to methacholine (MCh) was assessed. Tamsulosin or nifedipine was added at cumulative concentrations (0.001-1 μM). The area under the curve (AUC) from isometric tension measurements was calculated. The effects of the pharmacological agents and their respective controls were assessed by calculating the AUC for each 5-min interval. Statistical analyses were performed using the Mann-Whitney-Wilcoxon nonparametric test. Both drugs displayed statistically significant inhibitory activity, at concentrations of 0.1 and 1 μM for tamsulosin and 1 μM for nifedipine, when calculated as the AUC and compared to DMSO controls. Tamsulosin and nifedipine directly inhibit MCh-induced contractility of pregnant rat ureters. Further work is needed to determine the clinical efficacy of these medications for MET in pregnancy.
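A Mann-Whitney-Wilcoxon comparison of AUC values such as the one used here takes only a few lines with SciPy. The tension-AUC numbers below are made up for illustration, not study data:

```python
from scipy import stats

# Illustrative AUC values (arbitrary units) from isometric tension traces;
# numbers are invented for demonstration only.
tamsulosin_1uM = [12.1, 10.8, 11.5, 9.9, 10.2, 11.0]
dmso_control = [15.3, 14.8, 16.1, 15.0, 14.2, 15.7]

# Two-sided rank-sum comparison; exact p-value for small samples without ties.
u_stat, p_value = stats.mannwhitneyu(tamsulosin_1uM, dmso_control,
                                     alternative="two-sided")
```

With the drug group entirely below the control group, the U statistic for the first sample is zero and the exact two-sided p-value is well under 0.01.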
Building integral projection models: a user's guide
Rees, Mark; Childs, Dylan Z; Ellner, Stephen P; Coulson, Tim
2014-01-01
In order to understand how changes in individual performance (growth, survival or reproduction) influence population dynamics and evolution, ecologists are increasingly using parameterized mathematical models. For continuously structured populations, where some continuous measure of individual state influences growth, survival or reproduction, integral projection models (IPMs) are commonly used. We provide a detailed description of the steps involved in constructing an IPM, explaining how to: (i) translate your study system into an IPM; (ii) implement your IPM; and (iii) diagnose potential problems with your IPM. We emphasize how the study organism's life cycle, and the timing of censuses, together determine the structure of the IPM kernel and important aspects of the statistical analysis used to parameterize an IPM using data on marked individuals. An IPM based on population studies of Soay sheep is used to illustrate the complete process of constructing, implementing and evaluating an IPM fitted to sample data. We then look at very general approaches to parameterizing an IPM, using a wide range of statistical techniques (e.g. maximum likelihood methods, generalized additive models, nonparametric kernel density estimators). Methods for selecting models for parameterizing IPMs are briefly discussed. We conclude with key recommendations and a brief overview of applications that extend the basic model. The online Supporting Information provides commented R code for all our analyses. PMID:24219157
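The implementation step the authors describe (discretizing the kernel so the model becomes a matrix iteration) is typically done with the midpoint rule. Below is a minimal Python sketch of that step for a survival-growth-only kernel; the vital-rate functions, mesh settings, and parameter values are invented for illustration (the paper's own worked examples are provided as R code in its Supporting Information):

```python
import numpy as np

def survival(z):
    """Hypothetical size-dependent survival probability (logistic in size z)."""
    return 1.0 / (1.0 + np.exp(-(z - 2.0)))

def growth_pdf(z_new, z):
    """Hypothetical growth kernel: next size is normal around 0.9*z + 0.3."""
    mu = 0.9 * z + 0.3
    return np.exp(-0.5 * ((z_new - mu) / 0.4) ** 2) / (0.4 * np.sqrt(2 * np.pi))

# Midpoint-rule discretization of K(z', z) = s(z) * g(z' | z) on [0, 6].
n, lo, hi = 100, 0.0, 6.0
h = (hi - lo) / n
z = lo + h * (np.arange(n) + 0.5)                      # mesh midpoints
K = h * growth_pdf(z[:, None], z[None, :]) * survival(z)[None, :]
lam = np.max(np.abs(np.linalg.eigvals(K)))             # asymptotic growth rate
```

The dominant eigenvalue of the discretized kernel approximates the asymptotic population growth rate; with no reproduction term in this sketch it is necessarily below one.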
da Silva, Luiz Bueno; Coutinho, Antonio Souto; da Costa Eulálio, Eliza Juliana; Soares, Elaine Victor Gonçalves
2012-01-01
The main objective of this study was to evaluate the impact of school furniture and work-surface lighting on the body posture of public middle school students in Paraíba (Brazil). The survey was carried out in two public schools, and the target population included 8th grade groups totalling 31 students. Brazilian standards for lighting levels, the CEBRACE standards for furniture measurements and the Postural Assessment Software (SAPO) for the postural misalignment assay were adopted for the measurement comparisons. The statistical analysis included parametric and non-parametric correlation analyses. The results show that the students' most affected body parts were the spine, the knees, and the head and neck, with 90% of the students presenting postural misalignment. The lighting levels were usually found to be below 300 lux, below recommended levels. The statistical analysis shows that the more adequate the furniture is for the user, the less the user complains of pain. These results indicate the need for investment in more suitable school furniture and structural reforms aimed at improving the lighting in the classrooms, which could suit the students' profile and reduce their complaints.
Testing independence of bivariate interval-censored data using modified Kendall's tau statistic.
Kim, Yuneung; Lim, Johan; Park, DoHwan
2015-11-01
In this paper, we study a nonparametric procedure to test the independence of bivariate interval-censored data, for both current status data (case 1 interval-censored data) and case 2 interval-censored data. To do this, we propose a score-based modification of the Kendall's tau statistic for bivariate interval-censored data. Our modification defines the Kendall's tau statistic using the expected numbers of concordant and discordant pairs of data. The performance of the modified approach is illustrated by simulation studies and application to an AIDS study. We compare our method to alternative approaches such as the two-stage estimation method by Sun et al. (Scandinavian Journal of Statistics, 2006) and the multiple imputation method by Betensky and Finkelstein (Statistics in Medicine, 1999b). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
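On fully observed (uncensored) pairs, the statistic the authors modify reduces to the ordinary Kendall's tau, which SciPy computes directly. The paired event times below are invented for illustration:

```python
from scipy import stats

# Illustrative fully observed paired event times; the paper's contribution is
# extending tau to interval-censored pairs via expected concordance counts.
x = [2.0, 3.5, 1.0, 4.2, 5.1, 2.8]
y = [1.8, 3.9, 1.2, 4.0, 5.5, 2.5]

# tau counts concordant minus discordant pairs, scaled to [-1, 1].
tau, p_value = stats.kendalltau(x, y)
```

These two series are perfectly concordant (every pair is ordered the same way in x and y), so tau equals 1.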
Harlander, Niklas; Rosenkranz, Tobias; Hohmann, Volker
2012-08-01
Single-channel noise reduction has been well investigated and seems to have reached its limits in terms of speech intelligibility improvement; however, the quality of such schemes can still be advanced. This study tests to what extent novel model-based processing schemes might improve performance, in particular for non-stationary noise conditions. Two prototype model-based algorithms, a speech-model-based and an auditory-model-based algorithm, were compared to a state-of-the-art non-parametric minimum statistics algorithm. A speech intelligibility test, preference rating, and listening effort scaling were performed. Additionally, three objective quality measures for the signal, background, and overall distortions were applied. For a better comparison of all algorithms, particular attention was given to the use of a similar Wiener-based gain rule. The perceptual investigation was performed with fourteen hearing-impaired subjects. The results revealed that the non-parametric algorithm and the auditory-model-based algorithm did not affect speech intelligibility, whereas the speech-model-based algorithm slightly decreased intelligibility. In terms of subjective quality, both model-based algorithms performed better than the unprocessed condition and the reference, in particular for highly non-stationary noise environments. The data support the hypothesis that model-based algorithms are promising for improving performance in non-stationary noise conditions.
Assaad, Houssein I; Choudhary, Pankaj K
2013-01-01
The L-statistics form an important class of estimators in nonparametric statistics. Members of this class include trimmed means and sample quantiles, and functions thereof. This article is devoted to the theory and applications of L-statistics for repeated measurements data, wherein the measurements on the same subject are dependent and the measurements from different subjects are independent. This article has three main goals: (a) show that L-statistics are asymptotically normal for repeated measurements data; (b) present three statistical applications of this result, namely, location estimation using trimmed means, quantile estimation and the construction of tolerance intervals; and (c) obtain a Bahadur representation for sample quantiles. These results are generalizations of similar results for independently and identically distributed data. The practical usefulness of these results is illustrated by analyzing a real data set involving measurements of systolic blood pressure. The properties of the proposed point and interval estimators are examined via simulation.
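Two of the L-statistics discussed, the trimmed mean and the sample quantile, are one-liners in Python. The blood-pressure readings below are invented, not the article's data:

```python
import numpy as np
from scipy import stats

# Hypothetical repeated systolic blood pressure readings (mmHg) for one
# subject; the 160 is a deliberate outlier to show the trimming effect.
sbp = np.array([118, 121, 119, 160, 117, 122, 120, 116])

# Trim the lowest and highest 12.5% (one observation each end here),
# then average the rest: a classic L-statistic robust to outliers.
tm = stats.trim_mean(sbp, proportiontocut=0.125)

# Sample quantiles are L-statistics too (linear interpolation by default).
q90 = np.quantile(sbp, 0.9)
```

The outlier at 160 is dropped by the trimming, so the trimmed mean (119.5) stays close to the bulk of the readings while the 90th percentile still reflects the upper tail.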
Problems of allometric scaling analysis: examples from mammalian reproductive biology.
Martin, Robert D; Genoud, Michel; Hemelrijk, Charlotte K
2005-05-01
Biological scaling analyses employing the widely used bivariate allometric model are beset by at least four interacting problems: (1) choice of an appropriate best-fit line with due attention to the influence of outliers; (2) objective recognition of divergent subsets in the data (allometric grades); (3) potential restrictions on statistical independence resulting from phylogenetic inertia; and (4) the need for extreme caution in inferring causation from correlation. A new non-parametric line-fitting technique has been developed that eliminates requirements for normality of distribution, greatly reduces the influence of outliers and permits objective recognition of grade shifts in substantial datasets. This technique is applied in scaling analyses of mammalian gestation periods and of neonatal body mass in primates. These analyses feed into a re-examination, conducted with partial correlation analysis, of the maternal energy hypothesis relating to mammalian brain evolution, which suggests links between body size and brain size in neonates and adults, gestation period and basal metabolic rate. Much has been made of the potential problem of phylogenetic inertia as a confounding factor in scaling analyses. However, this problem may be less severe than suspected earlier because nested analyses of variance conducted on residual variation (rather than on raw values) reveal that there is considerable variance at low taxonomic levels. In fact, limited divergence in body size between closely related species is one of the prime examples of phylogenetic inertia. One common approach to eliminating perceived problems of phylogenetic inertia in allometric analyses has been calculation of 'independent contrast values'. It is demonstrated that the reasoning behind this approach is flawed in several ways.
Calculation of contrast values for closely related species of similar body size is, in fact, highly questionable, particularly when there are major deviations from the best-fit line for the scaling relationship under scrutiny.
A Method for Assessing Change in Attitude: The McNemar Test.
ERIC Educational Resources Information Center
Ciechalski, Joseph C.; Pinkney, James W.; Weaver, Florence S.
This paper illustrates the use of the McNemar Test, using a hypothetical problem. The McNemar Test is a nonparametric statistical test that is a type of chi square test using dependent, rather than independent, samples to assess before-after designs in which each subject is used as his or her own control. Results of the McNemar test make it…
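For reference, the McNemar statistic depends only on the two discordant cell counts of the paired before-after table; with the usual continuity correction it is referred to a chi-square distribution with one degree of freedom. A sketch with hypothetical counts (not the paper's example):

```python
from scipy import stats

# Hypothetical paired before/after attitude counts:
# b = changed favorable -> unfavorable, c = changed unfavorable -> favorable.
b, c = 21, 9

# Continuity-corrected McNemar statistic, referred to chi-square with 1 df.
mcnemar_stat = (abs(b - c) - 1) ** 2 / (b + c)
p_value = stats.chi2.sf(mcnemar_stat, df=1)
```

Only the discordant pairs carry information about change, which is why the concordant cells never enter the statistic.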
Nonparametric Conditional Estimation
1987-02-01
…the data because the statistician has complete control over the method. It is especially reasonable when there is a bona fide loss function to which… For example, the sample mean is m(Fn). Most calculations that statisticians perform on a set of data can be expressed as statistical functionals on…
10th Conference on Bayesian Nonparametrics
2016-05-08
The findings from the conference were widely disseminated. The conference web site displays slides of the talks presented at the conference, and a special issue of the Electronic Journal of Statistics, consisting of about 20 papers read at the conference, is being published.
Shi, Yang; Chinnaiyan, Arul M; Jiang, Hui
2015-07-01
High-throughput sequencing of transcriptomes (RNA-Seq) has become a powerful tool to study gene expression. Here we present an R package, rSeqNP, which implements a non-parametric approach to test for differential expression and splicing from RNA-Seq data. rSeqNP uses permutation tests to assess statistical significance and can be applied to a variety of experimental designs. By combining information across isoforms, rSeqNP is able to detect more differentially expressed or spliced genes from RNA-Seq data. The R package with its source code and documentation is freely available at http://www-personal.umich.edu/∼jianghui/rseqnp/. jianghui@umich.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
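rSeqNP's permutation strategy in miniature: repeatedly reshuffle the condition labels and count how often the reshuffled statistic is at least as extreme as the observed one. The sketch below applies the idea to a single hypothetical gene's expression values (the real package works on isoform-level statistics across the whole transcriptome):

```python
import numpy as np

def permutation_test_mean_diff(x, y, n_perm=10_000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    observed = abs(x.mean() - y.mean())
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the samples
        diff = abs(pooled[:len(x)].mean() - pooled[len(x):].mean())
        count += diff >= observed
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)

# Hypothetical normalized expression values for one gene in two conditions.
cond_a = [5.1, 4.8, 5.5, 5.0, 4.9, 5.2]
cond_b = [6.3, 6.1, 6.8, 6.0, 6.5, 6.2]
p = permutation_test_mean_diff(cond_a, cond_b)
```

Because the two made-up groups are fully separated, only the two extreme label assignments match the observed difference, giving a p-value near 2/924.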
Assessing T cell clonal size distribution: a non-parametric approach.
Bolkhovskaya, Olesya V; Zorin, Daniil Yu; Ivanchenko, Mikhail V
2014-01-01
Clonal structure of the human peripheral T-cell repertoire is shaped by a number of homeostatic mechanisms, including antigen presentation and cytokine and cell regulation. Its accurate tuning leads to a remarkable ability to combat pathogens in all their variety, while systemic failures may lead to severe consequences like autoimmune diseases. Here we develop and make use of a non-parametric statistical approach to assess T cell clonal size distributions from recent next-generation sequencing data. For 41 healthy individuals and a patient with ankylosing spondylitis who underwent treatment, we invariably find power-law scaling over several decades and, for the first time, calculate quantitatively meaningful values of the decay exponent. It proved to be much the same among healthy donors, significantly different for the autoimmune patient before therapy, and converging towards a typical value afterwards. We discuss implications of the findings for theoretical understanding and mathematical modeling of adaptive immunity.
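Estimating a power-law decay exponent from clone sizes is commonly done with the continuous maximum-likelihood (Hill-type) estimator. The abstract does not specify the paper's estimator, so the sketch below shows one standard choice, applied to synthetic clone sizes drawn from a known power law:

```python
import numpy as np

def powerlaw_exponent_mle(sizes, x_min=1.0):
    """Continuous MLE (Hill-type estimator) for a power-law exponent alpha,
    for data with density proportional to x**(-alpha) above x_min."""
    x = np.asarray(sizes, dtype=float)
    x = x[x >= x_min]
    return 1.0 + len(x) / np.log(x / x_min).sum()

# Synthetic clone sizes from a power law with true exponent 2.5,
# generated by inverse-CDF sampling (illustrative data only).
rng = np.random.default_rng(1)
u = rng.random(5000)
sizes = (1.0 - u) ** (-1.0 / (2.5 - 1.0))
alpha = powerlaw_exponent_mle(sizes)
```

With 5000 draws the estimate lands close to the true exponent of 2.5; the standard error of this estimator scales as (alpha - 1)/sqrt(n).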
Using exogenous variables in testing for monotonic trends in hydrologic time series
Alley, William M.
1988-01-01
One approach that has been used in performing a nonparametric test for monotonic trend in a hydrologic time series consists of a two-stage analysis. First, a regression equation is estimated for the variable being tested as a function of an exogenous variable. A nonparametric trend test such as the Kendall test is then performed on the residuals from the equation. By analogy to stagewise regression and through Monte Carlo experiments, it is demonstrated that this approach will tend to underestimate the magnitude of the trend and to result in some loss in power as a result of ignoring the interaction between the exogenous variable and time. An alternative approach, referred to as the adjusted variable Kendall test, is demonstrated to generally have increased statistical power and to provide more reliable estimates of the trend slope. In addition, the utility of including an exogenous variable in a trend test is examined under selected conditions.
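The two-stage procedure described above can be sketched directly: regress the hydrologic variable on the exogenous variable, then compute Kendall's S statistic on the residuals against time. The series below is synthetic, with an invented trend and precipitation dependence, purely for illustration:

```python
import numpy as np

def mann_kendall_s(x):
    """Kendall's S statistic: sum of signs of all later-minus-earlier pairs.
    Positive S indicates an upward trend over the index order."""
    x = np.asarray(x, dtype=float)
    s = 0.0
    for i in range(len(x) - 1):
        s += np.sign(x[i + 1:] - x[i]).sum()
    return s

# Synthetic 30-year annual series: flow depends on precipitation (exogenous)
# plus a genuine upward time trend and noise. Parameters are invented.
rng = np.random.default_rng(0)
t = np.arange(30)
precip = rng.normal(100.0, 10.0, 30)
flow = 0.5 * precip + 0.3 * t + rng.normal(0.0, 1.0, 30)

# Stage 1: regress out the exogenous variable.
beta = np.polyfit(precip, flow, 1)
residuals = flow - np.polyval(beta, precip)

# Stage 2: Kendall trend statistic on the residuals (in time order).
s = mann_kendall_s(residuals)
```

The built-in trend of 0.3 per year survives the stage-1 regression (precipitation here is uncorrelated with time), so S comes out strongly positive; the article's point is that this two-stage version can still underestimate the trend relative to the adjusted variable Kendall test.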
Verbruggen, Heroen; Maggs, Christine A; Saunders, Gary W; Le Gall, Line; Yoon, Hwan Su; De Clerck, Olivier
2010-01-20
The assembly of the tree of life has seen significant progress in recent years but algae and protists have been largely overlooked in this effort. Many groups of algae and protists have ancient roots and it is unclear how much data will be required to resolve their phylogenetic relationships for incorporation in the tree of life. The red algae, a group of primary photosynthetic eukaryotes more than a billion years old, provide the earliest fossil evidence for eukaryotic multicellularity and sexual reproduction. Despite this evolutionary significance, their phylogenetic relationships are understudied. This study aims to infer a comprehensive red algal tree of life at the family level from a supermatrix containing data mined from GenBank. We aim to locate remaining regions of low support in the topology, evaluate their causes and estimate the amount of data required to resolve them. Phylogenetic analysis of a supermatrix of 14 loci and 98 red algal families yielded the most complete red algal tree of life to date. Visualization of statistical support showed the presence of five poorly supported regions. Causes for low support were identified with statistics about the age of the region, data availability and node density, showing that poor support has different origins in different parts of the tree. Parametric simulation experiments yielded optimistic estimates of how much data will be needed to resolve the poorly supported regions (ca. 10^3 to ca. 10^4 nucleotides for the different regions). Nonparametric simulations gave a markedly more pessimistic image, some regions requiring more than 2.8 × 10^5 nucleotides or not achieving the desired level of support at all. The discrepancies between parametric and nonparametric simulations are discussed in light of our dataset and known attributes of both approaches. Our study takes the red algae one step closer to meaningful inclusion in the tree of life.
In addition to the recovery of stable relationships, the recognition of five regions in need of further study is a significant outcome of this work. Based on our analyses of current availability and future requirements of data, we make clear recommendations for forthcoming research.
Irradiation-hyperthermia in canine hemangiopericytomas: large-animal model for therapeutic response.
Richardson, R C; Anderson, V L; Voorhees, W D; Blevins, W E; Inskeep, T K; Janas, W; Shupe, R E; Babbs, C F
1984-11-01
Results of irradiation-hyperthermia treatment in 11 dogs with naturally occurring hemangiopericytoma were reported. Similarities of canine and human hemangiopericytomas were described. Orthovoltage X-irradiation followed by microwave-induced hyperthermia resulted in a 91% objective response rate. A statistical procedure was given to evaluate quantitatively the clinical behavior of locally invasive, nonmetastatic tumors in dogs that were undergoing therapy for control of local disease. The procedure used a small sample size and demonstrated distribution of the data on a scaled response as well as transformation of the data through classical parametric and nonparametric statistical methods. These statistical methods set confidence limits on the population mean and placed tolerance limits on a population percentage. Application of the statistical methods to human and animal clinical trials was apparent.
NASA Astrophysics Data System (ADS)
Wu, Jingwen; Miao, Chiyuan; Tang, Xu; Duan, Qingyun; He, Xiaojia
2018-02-01
Drought is one of the world's most recurrent and destructive hazards, and the evolution of drought events has become increasingly complex against a background of climate change and changing human activities. Over the last five decades, there have been frequent droughts on the Loess Plateau in China. In this study, we used the nonparametric standardized runoff index (NSRI) to investigate the temporal characteristics of hydrological drought in 17 Loess Plateau catchments during the period 1961-2013. Furthermore, we used a cross-wavelet transform to reveal linkages between an El Niño-Southern Oscillation (ENSO) index and the NSRI series. The primary results indicated that the annual and seasonal NSRI series displayed statistically significant downward trends in all catchments, with the only exception being the winter NSRI series in Yanhe. Furthermore, our analyses showed downward trends persisting into the future in all 17 catchments except Yanhe. We also found that, overall, the risk of hydrological drought was high on the Loess Plateau, with the mean duration at the seasonal scale exceeding 4 months and the mean duration at the annual scale exceeding 12 months. Moreover, during recent years, the trend towards hydrological drought was greater in the spring than in other seasons. ENSO events were closely associated with annual and seasonal hydrological drought on the Loess Plateau, and the impact of ENSO events was stronger in the southeast of the plateau than the northwest at both seasonal and annual scales. These results may provide valuable information about the evolutionary characteristics of hydrological drought across the Loess Plateau and may also be useful for predicting and mitigating future hydrological drought on the plateau.
Zou, Kelly H; Resnic, Frederic S; Talos, Ion-Florin; Goldberg-Zimring, Daniel; Bhagwat, Jui G; Haker, Steven J; Kikinis, Ron; Jolesz, Ferenc A; Ohno-Machado, Lucila
2005-10-01
Medical classification accuracy studies often yield continuous data based on predictive models for treatment outcomes. A popular method for evaluating the performance of diagnostic tests is the receiver operating characteristic (ROC) curve analysis. The main objective was to develop a global statistical hypothesis test for assessing the goodness-of-fit (GOF) for parametric ROC curves via the bootstrap. A simple log (or logit) and a more flexible Box-Cox normality transformations were applied to untransformed or transformed data from two clinical studies to predict complications following percutaneous coronary interventions (PCIs) and for image-guided neurosurgical resection results predicted by tumor volume, respectively. We compared a non-parametric with a parametric binormal estimate of the underlying ROC curve. To construct such a GOF test, we used the non-parametric and parametric areas under the curve (AUCs) as the metrics, with a resulting p value reported. In the interventional cardiology example, logit and Box-Cox transformations of the predictive probabilities led to satisfactory AUCs (AUC=0.888; p=0.78, and AUC=0.888; p=0.73, respectively), while in the brain tumor resection example, log and Box-Cox transformations of the tumor size also led to satisfactory AUCs (AUC=0.898; p=0.61, and AUC=0.899; p=0.42, respectively). In contrast, significant departures from GOF were observed without applying any transformation prior to assuming a binormal model (AUC=0.766; p=0.004, and AUC=0.831; p=0.03), respectively. In both studies the p values suggested that transformations were important to consider before applying any binormal model to estimate the AUC. Our analyses also demonstrated and confirmed the predictive values of different classifiers for determining the interventional complications following PCIs and resection outcomes in image-guided neurosurgery.
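The non-parametric AUC used as one of the metrics here is the empirical probability that a randomly chosen positive case receives a higher predicted score than a negative one (the Mann-Whitney U statistic rescaled). A small sketch with made-up predicted probabilities:

```python
import numpy as np

def nonparametric_auc(pos, neg):
    """Empirical AUC: P(score_pos > score_neg), with ties counted as 1/2.
    Equivalent to the Mann-Whitney U statistic divided by n_pos * n_neg."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Made-up predicted complication probabilities, for illustration only.
complication = [0.9, 0.8, 0.7, 0.6]
no_complication = [0.5, 0.4, 0.65, 0.3]
auc = nonparametric_auc(complication, no_complication)
```

Here 15 of the 16 positive-negative pairs are correctly ordered, giving an AUC of 0.9375; a parametric binormal fit would instead estimate the AUC from the (possibly transformed) score distributions, which is exactly the comparison the GOF test formalizes.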
What Teachers Should Know About the Bootstrap: Resampling in the Undergraduate Statistics Curriculum
Hesterberg, Tim C.
2015-01-01
Bootstrapping has enormous potential in statistics education and practice, but there are subtle issues and ways to go wrong. For example, the common combination of nonparametric bootstrapping and bootstrap percentile confidence intervals is less accurate than using t-intervals for small samples, though more accurate for larger samples. My goals in this article are to provide a deeper understanding of bootstrap methods—how they work, when they work or not, and which methods work better—and to highlight pedagogical issues. Supplementary materials for this article are available online. [Received December 2014. Revised August 2015] PMID:27019512
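The comparison discussed in the article, a bootstrap percentile interval versus the classical t-interval, can be reproduced in a few lines. The skewed sample below is simulated, not from the article:

```python
import numpy as np
from scipy import stats

# Simulated skewed sample (exponential, small n) to make the comparison
# interesting; parameters are illustrative only.
rng = np.random.default_rng(42)
sample = rng.exponential(scale=2.0, size=30)

# Nonparametric bootstrap: resample with replacement, recompute the mean,
# and take the 2.5th and 97.5th percentiles of the bootstrap distribution.
boot_means = rng.choice(sample, size=(10_000, sample.size),
                        replace=True).mean(axis=1)
lo, hi = np.percentile(boot_means, [2.5, 97.5])

# Classical t-interval for the mean, for comparison.
t_lo, t_hi = stats.t.interval(0.95, sample.size - 1,
                              loc=sample.mean(), scale=stats.sem(sample))
```

For skewed data and small n, the two intervals differ noticeably; the article's point is that neither dominates, with percentile intervals undercovering at small samples.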
Lucijanic, Marko; Petrovecki, Mladen
2012-01-01
Analyzing events over time is often complicated by incomplete, or censored, observations. Special non-parametric statistical methods were developed to overcome the difficulties in summarizing and comparing censored data. The life-table (actuarial) method and the Kaplan-Meier method are described, with an explanation of survival curves. For didactic purposes, the authors prepared a workbook based on the most widely used Kaplan-Meier method. It should help the reader understand how the Kaplan-Meier method is conceptualized and how it can be used to obtain the statistics and survival curves needed to completely describe a sample of patients. The log-rank test and hazard ratio are also discussed.
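The Kaplan-Meier estimate itself is short to implement: at each distinct event time, multiply the running survival probability by one minus the ratio of events to subjects still at risk. A self-contained sketch with invented follow-up data:

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns a list of (time, survival) pairs at each event time."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    surv = 1.0
    curve = []
    for t in np.unique(times):
        d = events[times == t].sum()   # events occurring at time t
        n = (times >= t).sum()         # subjects at risk just before t
        if d > 0:
            surv *= 1 - d / n
            curve.append((float(t), surv))
    return curve

# Invented follow-up times (months); an event flag of 0 marks censoring.
times = [3, 5, 5, 8, 10, 12, 15]
events = [1, 1, 0, 1, 0, 1, 1]
km = kaplan_meier(times, events)
```

Censored subjects leave the risk set without triggering a drop in the curve, which is exactly how the method handles incomplete observations.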
Wedemeyer, Gary A.; Nelson, Nancy C.
1975-01-01
Gaussian and nonparametric (percentile estimate and tolerance interval) statistical methods were used to estimate normal ranges for blood chemistry (bicarbonate, bilirubin, calcium, hematocrit, hemoglobin, magnesium, mean cell hemoglobin concentration, osmolality, inorganic phosphorus, and pH) for juvenile rainbow trout (Salmo gairdneri, Shasta strain) held under defined environmental conditions. The percentile estimate and Gaussian methods gave similar normal ranges, whereas the tolerance interval method gave consistently wider ranges for all blood variables except hemoglobin. If the underlying frequency distribution is unknown, the percentile estimate procedure would be the method of choice.
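The percentile-estimate method amounts to reading the central 95% (or other coverage) directly off the empirical distribution, with no distributional assumption. A sketch on simulated hematocrit values; the distribution parameters are invented, not the study's:

```python
import numpy as np

# Simulated hematocrit values (%) for 200 juvenile trout; the mean and SD
# are made up for illustration.
rng = np.random.default_rng(7)
hematocrit = rng.normal(loc=32.0, scale=3.0, size=200)

# Percentile-estimate "normal range": the central 95% of observed values.
low, high = np.percentile(hematocrit, [2.5, 97.5])
```

Unlike the Gaussian method (mean ± 1.96 SD), this range is valid for skewed blood variables too, which is why the authors prefer it when the underlying distribution is unknown.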
Second-hand smoking and carboxyhemoglobin levels in children: a prospective observational study.
Yee, Branden E; Ahmed, Mohammed I; Brugge, Doug; Farrell, Maureen; Lozada, Gustavo; Idupaganthi, Raghu; Schumann, Roman
2010-01-01
To establish baseline noninvasive carboxyhemoglobin (COHb) levels in children and determine the influence of exposure to environmental sources of carbon monoxide (CO), especially environmental tobacco smoke, on such levels. Second-hand smoking may be a risk factor for adverse outcomes following anesthesia and surgery in children (1) and may potentially be preventable. Parents and their children between the ages of 1-12 were enrolled on the day of elective surgery. The preoperative COHb levels of the children were assessed noninvasively using a CO-Oximeter (Radical-7 Rainbow SET Pulse CO-Oximeter; Masimo, Irvine, CA, USA). The parents were asked to complete an environmental air-quality questionnaire. The COHb levels were tabulated and correlated with responses to the survey in aggregate analysis. Statistical analyses were performed using the nonparametric Mann-Whitney and Kruskal-Wallis tests. P < 0.05 was statistically significant. Two hundred children with their parents were enrolled. Children exposed to parental smoking had higher COHb levels than the children of nonsmoking controls. Higher COHb values were seen in the youngest children, ages 1-2, exposed to parental cigarette smoke. However, these trends did not reach statistical significance, and confidence intervals were wide. This study revealed interesting trends of COHb levels in children presenting for anesthesia and surgery. However, the COHb levels measured in our patients were close to the error margin of the device used in our study. An expected improvement in measurement technology may allow screening children for potential pulmonary perioperative risk factors in the future.
Dynamic characteristics of oxygen consumption.
Ye, Lin; Argha, Ahmadreza; Yu, Hairong; Celler, Branko G; Nguyen, Hung T; Su, Steven
2018-04-23
Previous studies have indicated that oxygen uptake (VO2) is one of the most accurate indices for assessing the cardiorespiratory response to exercise. In most existing studies, the VO2 response is roughly modelled as a first-order system, due to inadequate stimulation and a low signal-to-noise ratio. To overcome this difficulty, this paper proposes a novel nonparametric kernel-based method for the dynamic modelling of the VO2 response, to provide a more robust estimation. Twenty healthy non-athlete participants conducted treadmill exercises with monotonous stimulation (e.g., a single step function as input). During the exercise, VO2 was measured and recorded by a popular portable gas analyser ([Formula: see text], COSMED). Based on the recorded data, a kernel-based estimation method was proposed to perform the nonparametric modelling of VO2. For the proposed method, a properly selected kernel can represent prior modelling information and reduce the dependence on comprehensive stimulations. Furthermore, due to the special elastic net formed by the [Formula: see text] norm and the kernelised [Formula: see text] norm, the estimates are smooth and concise. Additionally, the finite impulse response based nonparametric model estimated by the proposed method can optimally select the order and fits better in terms of goodness-of-fit compared to classical methods. Several kernels were introduced for the kernel-based VO2 modelling method. The results clearly indicated that the stable spline (SS) kernel has the best performance for VO2 modelling. In particular, based on the experimental data from 20 participants, the estimated response from the proposed method with the SS kernel was significantly better than the results from the benchmark method [i.e., the prediction error method (PEM)] ([Formula: see text] vs [Formula: see text]).
The proposed nonparametric modelling method is an effective method for estimating the impulse response of the VO2-speed system. Furthermore, the identified average nonparametric model can dynamically predict the VO2 response with acceptable accuracy during treadmill exercise.
A hybrid method in combining treatment effects from matched and unmatched studies.
Byun, Jinyoung; Lai, Dejian; Luo, Sheng; Risser, Jan; Tung, Betty; Hardy, Robert J
2013-12-10
The most common data structures in biomedical studies have been matched or unmatched designs. Data structures resulting from a hybrid of the two may create challenges for statistical inference. The question may arise whether to use parametric or nonparametric methods on the hybrid data structure. The Early Treatment for Retinopathy of Prematurity study was a multicenter clinical trial sponsored by the National Eye Institute. The design produced data requiring a statistical method of a hybrid nature. An infant in this multicenter randomized clinical trial had high-risk prethreshold retinopathy of prematurity that was eligible for treatment in one or both eyes at entry into the trial. During follow-up, recognition visual acuity was assessed for both eyes. Data from both eyes (matched) and from only one eye (unmatched) were eligible to be used in the trial. The new hybrid nonparametric method is a meta-analysis based on combining the Hodges-Lehmann estimates of treatment effects from the Wilcoxon signed rank and rank sum tests. To compare the new method, we used the classic meta-analysis with the t-test method to combine estimates of treatment effects from the paired and two-sample t-tests. We used simulations to calculate the empirical size and power of the test statistics, as well as the bias, mean squared error, and confidence interval width of the corresponding estimators. The proposed method provides an effective tool to evaluate data from clinical trials and similar comparative studies. Copyright © 2013 John Wiley & Sons, Ltd.
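The two Hodges-Lehmann estimates being combined above can be written in a few lines of stdlib Python. This is a generic illustration of the estimators, not the authors' implementation, and the meta-analytic weighting step that combines them is omitted:

```python
from itertools import combinations
from statistics import median

def hl_paired(diffs):
    # Hodges-Lehmann estimate for paired data: the median of the Walsh
    # averages (pairwise averages of the within-pair differences, including
    # each difference with itself). This is the location estimator
    # associated with the Wilcoxon signed-rank test.
    walsh = [(a + b) / 2 for a, b in combinations(diffs, 2)] + list(diffs)
    return median(walsh)

def hl_two_sample(x, y):
    # Hodges-Lehmann shift estimate for two independent samples: the median
    # of all pairwise differences x_i - y_j, associated with the Wilcoxon
    # rank-sum test.
    return median(xi - yi for xi in x for yi in y)
```

A hybrid analysis would compute `hl_paired` on the matched (both-eyes) pairs and `hl_two_sample` on the unmatched eyes, then pool the two estimates, e.g. by inverse-variance weighting.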
Friedrich, Verena; Brügger, Adrian; Bauer, Georg F
2015-01-01
Evidence based public health requires knowledge about successful dissemination of public health measures. This study analyses (a) the changes in worksite tobacco prevention (TP) in the Canton of Zurich, Switzerland, between 2007 and 2009; (b1) the results of a multistep versus a "brochure only" dissemination strategy; (b2) the results of a monothematic versus a comprehensive dissemination strategy that aim to get companies to adopt TP measures; and (c) whether worksite TP is associated with health-related outcomes. A longitudinal design with randomized control groups was applied. Data on worksite TP and health-related outcomes were gathered by a written questionnaire (baseline n = 1627; follow-up n = 1452) and analysed using descriptive statistics, nonparametric procedures, and ordinal regression models. TP measures at worksites improved slightly between 2007 and 2009. The multistep dissemination was superior to the "brochure only" condition. No significant differences between the monothematic and the comprehensive dissemination strategies were observed. However, improvements in TP measures at worksites were associated with improvements in health-related outcomes. Although dissemination was approached at a mass scale, little change in the advocated adoption of TP measures was observed, suggesting the need for even more aggressive outreach or an acceptance that these channels do not seem to be sufficiently effective.
Ponciano, José Miguel
2017-11-22
Using a nonparametric Bayesian approach, Palacios and Minin (2013) dramatically improved the accuracy and precision of Bayesian inference of population size trajectories from gene genealogies. These authors proposed an extension of a Gaussian Process (GP) nonparametric inferential method for the intensity function of non-homogeneous Poisson processes. They found not only that the statistical properties of the estimators were improved with their method, but also that key aspects of the demographic histories were recovered. The authors' work represents the first Bayesian nonparametric solution to this inferential problem because they specify a convenient prior belief without a particular functional form on the population trajectory. Their approach works so well, and provides such a profound understanding of the biological process, that the question arises as to how truly "biology-free" their approach really is. Using well-known concepts of stochastic population dynamics, here I demonstrate that in fact Palacios and Minin's GP model can be cast as a parametric population growth model with density dependence and environmental stochasticity. Making this link between population genetics and stochastic population dynamics modeling provides novel insights into eliciting biologically meaningful priors for the trajectory of the effective population size. The results presented here also bring novel understanding of GPs as models for the evolution of a trait. Thus, the ecological-principles foundation of Palacios and Minin's (2013) prior adds to the conceptual and scientific value of these authors' inferential approach. I conclude this note by listing a series of insights brought about by this connection with ecology. Copyright © 2017 The Author. Published by Elsevier Inc. All rights reserved.
Robustness of S1 statistic with Hodges-Lehmann for skewed distributions
NASA Astrophysics Data System (ADS)
Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping
2016-10-01
Analysis of variance (ANOVA) is a commonly used parametric method for testing differences in means across more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When the assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator, and the default scale estimator with either the variance of Hodges-Lehmann or MADn, to produce two different test statistics for comparing groups. A bootstrap method was used for testing the hypotheses, since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA, and Kruskal-Wallis. The proposed procedures show improvement over the original statistic, especially under extremely skewed distributions.
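The bootstrap testing scheme described above can be sketched generically: centre each group at its own location so the null hypothesis holds, resample within groups, and compare the resampled statistics to the observed one. The `location_diff` statistic below is a hypothetical stand-in for the modified S1 statistic, whose definition (and the MADn / Hodges-Lehmann scale estimators) is omitted here.

```python
import random
from statistics import median

def bootstrap_pvalue(x, y, stat, n_boot=2000, seed=1):
    # Centre each group at its own median so the null of equal locations
    # holds, then resample with replacement within each group and count how
    # often the resampled statistic is at least as extreme as the observed.
    rng = random.Random(seed)
    xc = [v - median(x) for v in x]
    yc = [v - median(y) for v in y]
    observed = abs(stat(x, y))
    hits = 0
    for _ in range(n_boot):
        xb = [rng.choice(xc) for _ in x]
        yb = [rng.choice(yc) for _ in y]
        if abs(stat(xb, yb)) >= observed:
            hits += 1
    return hits / n_boot

# Hypothetical location statistic standing in for the modified S1 statistic.
def location_diff(x, y):
    return median(x) - median(y)
```

Because the reference distribution is built by resampling rather than assumed, the same machinery works for any plug-in statistic whose sampling distribution is unknown.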
NASA Astrophysics Data System (ADS)
Curceac, S.; Ternynck, C.; Ouarda, T.
2015-12-01
Over the past decades, a substantial amount of research has been conducted to model and forecast climatic variables. In this study, Nonparametric Functional Data Analysis (NPFDA) methods are applied to forecast air temperature and wind speed time series in Abu Dhabi, UAE. The dataset consists of hourly measurements recorded for a period of 29 years, 1982-2010. The novelty of the Functional Data Analysis approach lies in expressing the data as curves. In the present work, the focus is on daily forecasting, and the functional observations (curves) express the daily measurements of the above-mentioned variables. We apply a non-linear regression model with a functional non-parametric kernel estimator. The computation of the estimator is performed using an asymmetrical quadratic kernel function for local weighting, based on the bandwidth obtained by a cross-validation procedure. The proximities between functional objects are calculated by families of semi-metrics based on derivatives and Functional Principal Component Analysis (FPCA). Additionally, functional conditional mode and functional conditional median estimators are applied and the advantages of combining their results are analysed. A different approach employs a SARIMA model selected according to the minimum Akaike (AIC) and Bayesian (BIC) Information Criteria and based on the residuals of the model. The performance of the models is assessed by calculating error indices such as the root mean square error (RMSE), relative RMSE, BIAS and relative BIAS. The results indicate that the NPFDA models provide more accurate forecasts than the SARIMA models. Key words: Nonparametric functional data analysis, SARIMA, time series forecast, air temperature, wind speed
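At its core, such a kernel estimator is a Nadaraya-Watson-type weighted average. A minimal one-dimensional sketch with an asymmetrical quadratic (Epanechnikov-type) kernel on [0, 1] is shown below; the functional method in the study weights whole daily curves via semi-metrics rather than scalar distances, so this is only the scalar analogue:

```python
def quadratic_kernel(u):
    # Asymmetrical quadratic kernel: defined for nonnegative distances,
    # zero beyond u = 1.
    return 0.75 * (1.0 - u * u) if 0.0 <= u <= 1.0 else 0.0

def kernel_regression(x_train, y_train, x0, h):
    # Nadaraya-Watson estimate at x0: kernel-weighted average of the
    # responses, with bandwidth h controlling the local neighbourhood.
    w = [quadratic_kernel(abs(x - x0) / h) for x in x_train]
    total = sum(w)
    if total == 0.0:
        raise ValueError("bandwidth h too small: no training point in range")
    return sum(wi * yi for wi, yi in zip(w, y_train)) / total
```

In the functional setting, `abs(x - x0)` would be replaced by a semi-metric between curves (e.g. an FPCA- or derivative-based distance), and `h` would be chosen by cross-validation as the abstract describes.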
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like chi(sup 2) minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers with the use of sophisticated tools, and matched these tools with specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997.
It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Schloss, Patrick D; Handelsman, Jo
2006-10-01
The recent advent of tools enabling statistical inferences to be drawn from comparisons of microbial communities has enabled the focus of microbial ecology to move from characterizing biodiversity to describing the distribution of that biodiversity. Although statistical tools have been developed to compare community structures across a phylogenetic tree, we lack tools to compare the memberships and structures of two communities at a particular operational taxonomic unit (OTU) definition. Furthermore, current tests of community structure do not indicate the similarity of the communities but only report the probability of a statistical hypothesis. Here we present a computer program, SONS, which implements nonparametric estimators for the fraction and richness of OTUs shared between two communities.
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
Beasley, T. Mark
2013-01-01
Increasing the correlation between the independent variable and the mediator (the a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation due to increases in a at some point outweighs the increase in the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches, because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
1987-09-01
…long been recognized as powerful nonparametric statistical methods since the introduction of the principal ideas by R.A. Fisher in 1935.
Statistical Models and Inference Procedures for Structural and Materials Reliability
1990-12-01
…Some general stress-strength models were also developed and applied to the failure of systems subject to cyclic loading. Involved in the failure of…process control ideas and sequential design and analysis methods. Finally, smooth nonparametric quantile function estimators were studied. All of…
ERIC Educational Resources Information Center
Douglas, Pamela A.
2013-01-01
This quantitative, nonexperimental study used survey research design and nonparametric statistics to investigate Birnbaum's (1988) theory that there is a relationship between the constructs of leadership and organization, as depicted in his five higher education models of organizational functioning: bureaucratic, collegial, political,…
ERIC Educational Resources Information Center
Cela-Ranilla, Jose María; Esteve-Gonzalez, Vanessa; Esteve-Mon, Francesc; Gisbert-Cervera, Merce
2014-01-01
In this study we analyze how 57 Spanish university students of Education developed a learning process in a virtual world by conducting activities that involved the skill of self-management. The learning experience comprised a serious game designed in a 3D simulation environment. Descriptive statistics and non-parametric tests were used in the…
NASA Astrophysics Data System (ADS)
Lototzis, M.; Papadopoulos, G. K.; Droulia, F.; Tseliou, A.; Tsiros, I. X.
2018-04-01
There are several cases where a circular variable is associated with a linear one. A typical example is wind direction, which is often associated with linear quantities such as air temperature and air humidity. A statistical relationship of this kind can be tested by parametric and non-parametric methods, each of which has its own advantages and drawbacks. This work deals with correlation analysis using both the parametric and the non-parametric procedure on a small set of meteorological data of air temperature and wind direction during a summer period in a Mediterranean climate. Correlations were examined between hourly, daily and maximum-prevailing values, under typical and non-typical meteorological conditions. Both tests indicated a strong correlation between mean hourly wind directions and mean hourly air temperatures, whereas mean daily wind direction and mean daily air temperature do not seem to be correlated. In some cases, however, the two procedures were found to give quite dissimilar levels of significance on the rejection or not of the null hypothesis of no correlation. The simple statistical analysis presented in this study, appropriately extended to large sets of meteorological data, may be a useful tool for estimating effects of wind in local climate studies.
Matthews, Abigail G; Hoffman, Eric K; Zezza, Nicholas; Stiffler, Scott; Hill, Shirley Y
2007-09-01
The genes encoding the gamma-aminobutyric acid(A) (GABA(A)) receptor have been the focus of several recent studies investigating the genetic etiology of alcohol dependence. Analyses of multiplex families found a particular gene, GABRA2, to be highly associated with alcohol dependence, using within-family association tests and other methods. Results were confirmed in three case-control studies. The objective of this study was to investigate the GABRA2 gene in another collection of multiplex families. Analyses were based on phenotypic and genotypic data available for 330 individuals from 65 bigenerational pedigrees with a total of 232 alcohol-dependent subjects. A proband pair of same-sex siblings meeting Diagnostic and Statistical Manual of Mental Disorders, Third Edition, criteria for alcohol dependence was required for entry of a family into the study. One member of the proband pair was identified while in treatment for alcohol dependence. Linkage and association of GABRA2 and alcohol dependence were evaluated using SIBPAL (a nonparametric linkage package) and both the Pedigree Disequilibrium Test and the Family-Based Association Test, respectively. We find no evidence of a relationship between GABRA2 and alcohol dependence. Linkage analyses exhibited no linkage using affected/affected, affected/unaffected, and unaffected/unaffected sib pairs (all p's < .13). There was no evidence of a within-family association (all p's > .39). Comorbidity may explain why our results differ from those in the literature. The presence of primary drug dependence and/or other psychiatric disorders is minimal in our pedigrees, although several of the other previously published multiplex family analyses exhibit a greater degree of comorbidity.
NASA Astrophysics Data System (ADS)
Martin-Hernandez, Natalia; Vicente-Serrano, Sergio; Azorin-Molina, Cesar; Begueria-Portugues, Santiago; Reig-Gracia, Fergus; Zabalza-Martínez, Javier
2017-04-01
We have analysed trends in the Normalized Difference Vegetation Index (NDVI) in the Iberian Peninsula and the Balearic Islands over the period 1981-2015 using a new high-resolution dataset derived from the entire available NOAA-AVHRR imagery (the IBERIAN NDVI dataset). After complete processing, including geocoding, calibration, cloud removal, topographic correction and temporal filtering, we obtained bi-weekly time series. To assess the accuracy of the new IBERIAN NDVI time series, we compared its temporal variability and trends with those reported by the GIMMS 3g and MODIS (MOD13A3) NDVI datasets. In general, the IBERIAN NDVI showed high reliability against these two products, while offering higher spatial resolution than the GIMMS dataset and covering two more decades than the MODIS dataset. Using the IBERIAN NDVI dataset, we analysed NDVI trends by means of the non-parametric Mann-Kendall test and the Theil-Sen slope estimator. On average, vegetation activity in the study area has increased over the last decades. However, there are local spatial differences: the main increase has been recorded in the humid regions of the north of the Iberian Peninsula. The statistical techniques allow abrupt and gradual changes to be detected in different land cover types during the analysed period. These changes are related to human activity through land transformation (from dry to irrigated land), land abandonment and forest recovery.
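Both trend tools named above are simple to state. A minimal pure-Python sketch follows (no tie correction, variance, or significance computation for S, which a full Mann-Kendall test would add):

```python
from statistics import median

def mann_kendall_s(series):
    # Mann-Kendall S statistic: number of concordant pairs minus number of
    # discordant pairs, taken in time order. Positive S suggests an
    # increasing trend.
    n = len(series)
    return sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n) for j in range(i + 1, n)
    )

def theil_sen_slope(t, x):
    # Theil-Sen estimator: the median of all pairwise slopes, robust to
    # outliers compared with least-squares regression.
    slopes = [
        (x[j] - x[i]) / (t[j] - t[i])
        for i in range(len(t)) for j in range(i + 1, len(t)) if t[j] != t[i]
    ]
    return median(slopes)
```

Applied per pixel to a bi-weekly NDVI series, S (standardized against its null variance) flags significant monotonic trends, while the Theil-Sen slope gives a robust estimate of their magnitude.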
Villanueva, Pia; Newbury, Dianne F; Jara, Lilian; De Barbieri, Zulema; Mirza, Ghazala; Palomino, Hernán M; Fernández, María Angélica; Cazier, Jean-Baptiste; Monaco, Anthony P; Palomino, Hernán
2011-01-01
Specific language impairment (SLI) is an unexpected deficit in the acquisition of language skills and affects between 5 and 8% of pre-school children. Despite its prevalence and high heritability, our understanding of the aetiology of this disorder is only emerging. In this paper, we apply genome-wide techniques to investigate an isolated Chilean population who exhibit an increased frequency of SLI. Loss of heterozygosity (LOH) mapping and parametric and non-parametric linkage analyses indicate that complex genetic factors are likely to underlie susceptibility to SLI in this population. Across all analyses performed, the most consistently implicated locus was on chromosome 7q. This locus achieved highly significant linkage under all three non-parametric models (max NPL=6.73, P=4.0 × 10−11). In addition, it yielded a HLOD of 1.24 in the recessive parametric linkage analyses and contained a segment that was homozygous in two affected individuals. Further, investigation of this region identified a two-SNP haplotype that occurs at an increased frequency in language-impaired individuals (P=0.008). We hypothesise that the linkage regions identified here, in particular that on chromosome 7, may contain variants that underlie the high prevalence of SLI observed in this isolated population and may be of relevance to other populations affected by language impairments. PMID:21248734
New approaches to the analysis of population trends in land birds: Comment
Link, W.A.; Sauer, J.R.
1997-01-01
James et al. (1996, Ecology 77:13-27) used data from the North American Breeding Bird Survey (BBS) to examine geographic variability in patterns of population change for 26 species of wood warblers. They emphasized the importance of evaluating nonlinear patterns of change in bird populations, proposed LOESS-based non-parametric and semi-parametric analyses of BBS data, and contrasted their results with other analyses, including those of Robbins et al. (1989, Proceedings of the National Academy of Sciences 86: 7658-7662) and Peterjohn et al. (1995, Pages 3-39 in T. E. Martin and D. M. Finch, eds. Ecology and management of Neotropical migratory birds: a synthesis and review of critical issues. Oxford University Press, New York.). In this note, we briefly comment on some of the issues that arose from their analysis of BBS data, suggest a few aspects of the survey that should inspire caution in analysts, and review the differences between the LOESS-based procedures and other procedures (e.g., Link and Sauer 1994). We strongly discourage the use of James et al.'s completely non-parametric procedure, which fails to account for observer effects. Our comparisons of estimators add to the evidence already present in the literature of the bias associated with omitting observer information in analyses of BBS data. Bias resulting from change in observer abilities should be a consideration in any analysis of BBS data.
Robust Machine Learning Variable Importance Analyses of Medical Conditions for Health Care Spending.
Rose, Sherri
2018-03-11
To propose nonparametric double robust machine learning in variable importance analyses of medical conditions for health spending. 2011-2012 Truven MarketScan database. I evaluate how much more, on average, commercially insured enrollees with each of 26 of the most prevalent medical conditions cost per year after controlling for demographics and other medical conditions. This is accomplished within the nonparametric targeted learning framework, which incorporates ensemble machine learning. Previous literature studying the impact of medical conditions on health care spending has almost exclusively focused on parametric risk adjustment; thus, I compare my approach to parametric regression. My results demonstrate that multiple sclerosis, congestive heart failure, severe cancers, major depression and bipolar disorders, and chronic hepatitis are the most costly medical conditions on average per individual. These findings differed from those obtained using parametric regression. The literature may be underestimating the spending contributions of several medical conditions, which is a potentially critical oversight. If current methods are not capturing the true incremental effect of medical conditions, undesirable incentives related to care may remain. Further work is needed to directly study these issues in the context of federal formulas. © Health Research and Educational Trust.
Model Robust Calibration: Method and Application to Electronically-Scanned Pressure Transducers
NASA Technical Reports Server (NTRS)
Walker, Eric L.; Starnes, B. Alden; Birch, Jeffery B.; Mays, James E.
2010-01-01
This article presents the application of a recently developed statistical regression method to the controlled instrument calibration problem. The statistical method of Model Robust Regression (MRR), developed by Mays, Birch, and Starnes, is shown to improve instrument calibration by reducing the reliance of the calibration on a predetermined parametric (e.g. polynomial, exponential, logarithmic) model. This is accomplished by allowing fits from the predetermined parametric model to be augmented by a certain portion of a fit to the residuals from the initial regression using a nonparametric (locally parametric) regression technique. The method is demonstrated for the absolute scale calibration of silicon-based pressure transducers.
Invariance in the recurrence of large returns and the validation of models of price dynamics
NASA Astrophysics Data System (ADS)
Chang, Lo-Bin; Geman, Stuart; Hsieh, Fushing; Hwang, Chii-Ruey
2013-08-01
Starting from a robust, nonparametric definition of large returns (“excursions”), we study the statistics of their occurrences, focusing on the recurrence process. The empirical waiting-time distribution between excursions is remarkably invariant to year, stock, and scale (return interval). This invariance is related to self-similarity of the marginal distributions of returns, but the excursion waiting-time distribution is a function of the entire return process and not just its univariate probabilities. Generalized autoregressive conditional heteroskedasticity (GARCH) models, market-time transformations based on volume or trades, and generalized (Lévy) random-walk models all fail to fit the statistical structure of excursions.
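The recurrence process studied above can be illustrated with a toy computation. Here an "excursion" is simplified to any return whose magnitude exceeds a fixed threshold; the paper's actual definition is nonparametric and scale-adaptive, so this is only the basic mechanics of extracting waiting times:

```python
def excursion_waiting_times(returns, threshold):
    # Indices at which |return| meets or exceeds the threshold, and the
    # waiting times (gaps in periods) between successive excursions.
    times = [i for i, r in enumerate(returns) if abs(r) >= threshold]
    return [b - a for a, b in zip(times, times[1:])]
```

The empirical distribution of these gaps, pooled across stocks and years, is the invariant object the paper compares against GARCH, market-time, and Lévy random-walk simulations.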
Martina, R; Kay, R; van Maanen, R; Ridder, A
2015-01-01
Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non-parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
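The overdispersion issue described above is easy to check from data: compare the sample variance with the mean. The sketch below is a method-of-moments negative binomial fit, illustrative only; the trials discussed used likelihood-based Poisson and negative binomial regression models with covariates, not this simple marginal fit.

```python
from statistics import mean, variance

def negbin_moments(counts):
    # Method-of-moments negative binomial fit: mean m and dispersion r,
    # parameterized so that Var = m + m**2 / r. Valid only under
    # overdispersion (sample variance exceeding the mean), which is the
    # situation that motivates moving beyond a plain Poisson model.
    m, v = mean(counts), variance(counts)
    if v <= m:
        raise ValueError("no overdispersion: a Poisson model may suffice")
    return m, m * m / (v - m)
```

In a regression setting, treatment effects then come out as rate ratios (ratios of fitted mean counts), which is the clinically interpretable quantity the abstract advocates.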
Cosgarea, Raluca; Gasparik, Cristina; Dudea, Diana; Culic, Bogdan; Dannewitz, Bettina; Sculean, Anton
2015-05-01
To objectively determine the difference in colour between the peri-implant soft tissue at titanium and zirconia abutments. Eleven patients, each with two contralaterally inserted osseointegrated dental implants, were included in this study. The implants were restored either with titanium abutments and porcelain-fused-to-metal crowns, or with zirconia abutments and ceramic crowns. Before and after crown cementation, multi-spectral images of the peri-implant soft tissues and the gingiva of the neighbouring teeth were taken with a colorimeter. The colour parameters L*, a*, b*, c* and the colour differences ΔE were calculated. Descriptive statistics, including non-parametric tests and correlation coefficients, were used for statistical analyses of the data. Compared to the gingiva of the neighbouring teeth, the peri-implant soft tissue around titanium and zirconia (test group) showed distinguishable ΔE both before and after crown cementation. Colour differences around titanium were statistically significantly different from those around zirconia (P = 0.01) only at 1 mm prior to crown cementation. Compared to the gingiva of the neighbouring teeth, statistically significant (P < 0.01) differences were found for all colour parameters, both before and after crown cementation for both abutments; more significant differences were registered for titanium abutments. Tissue thickness correlated positively with c*-values for titanium at 1 mm and 2 mm from the gingival margin. Within their limits, the present data indicate that: (i) the peri-implant soft tissue around titanium and zirconia showed colour differences when compared to the soft tissue around natural teeth, and (ii) the peri-implant soft tissue around zirconia demonstrated a better colour match to the soft tissue at natural teeth than titanium. © 2014 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
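In its classic CIE76 form, the colour difference ΔE reported above is simply the Euclidean distance between two (L*, a*, b*) triplets. That form is assumed here for illustration (the study may have used a different ΔE formula), and the numeric readings in the demo are hypothetical:

```python
import math

def delta_e_cie76(lab1, lab2):
    # CIE76 colour difference between two CIELAB (L*, a*, b*) triplets:
    # sqrt((dL*)^2 + (da*)^2 + (db*)^2).
    return math.dist(lab1, lab2)

# Hypothetical colorimeter readings for a peri-implant site and a natural tooth
titanium_site = (62.0, 4.0, 11.0)
natural_tooth = (66.0, 1.0, 11.0)
dE = delta_e_cie76(titanium_site, natural_tooth)
```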
Out-of-Sample Extensions for Non-Parametric Kernel Methods.
Pan, Binbin; Chen, Wen-Sheng; Chen, Bo; Xu, Chen; Lai, Jianhuang
2017-02-01
Choosing suitable kernels plays an important role in the performance of kernel methods. Recently, a number of studies have been devoted to developing nonparametric kernels. Without assuming any parametric form of the target kernel, nonparametric kernel learning offers a flexible scheme to utilize the information in the data, which may potentially characterize the data similarity better. Kernel methods using nonparametric kernels are referred to as nonparametric kernel methods. However, many nonparametric kernel methods are restricted to transductive learning, where the prediction function is defined only over the data points given beforehand. They have no straightforward extension to out-of-sample data points and thus cannot be applied to inductive learning. In this paper, we show how to make nonparametric kernel methods applicable to inductive learning. The key problem of out-of-sample extension is how to extend the nonparametric kernel matrix to the corresponding kernel function. A regression approach in the hyper-reproducing kernel Hilbert space is proposed to solve this problem. Empirical results indicate that the out-of-sample performance is comparable to the in-sample performance in most cases. Experiments on face recognition demonstrate the superiority of our nonparametric kernel method over state-of-the-art parametric kernel methods.
NASA Astrophysics Data System (ADS)
Thomas, M. A.
2016-12-01
The Waste Isolation Pilot Plant (WIPP) is the only deep geological repository for transuranic waste in the United States. As the Science Advisor for the WIPP, Sandia National Laboratories annually evaluates site data against trigger values (TVs), metrics whose violation is indicative of conditions that may impact long-term repository performance. This study focuses on a groundwater-quality dataset used to redesign a TV for the Culebra Dolomite Member (Culebra) of the Permian-age Rustler Formation. Prior to this study, a TV violation occurred if the concentration of a major ion fell outside a range defined as the mean +/- two standard deviations. The ranges were thought to denote conditions that 95% of future values would fall within. Groundwater-quality data used in evaluating compliance, however, are rarely normally distributed. To create a more robust Culebra groundwater-quality TV, this study employed the randomization test, a non-parametric permutation method. Recent groundwater compositions considered TV violations under the original ion concentration ranges are now interpreted as false positives in light of the insignificant p-values calculated with the randomization test. This work highlights that the normality assumption can weaken as the size of a groundwater-quality dataset grows over time. Non-parametric permutation methods are an attractive option because no assumption about the statistical distribution is required and calculating all combinations of the data is an increasingly tractable problem with modern workstations. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. This research is funded by WIPP programs administered by the Office of Environmental Management (EM) of the U.S. Department of Energy. SAND2016-7306A
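The randomization test used to rebuild the trigger value can be sketched in a few lines. The version below is a generic Monte Carlo two-sample permutation test on the difference in means; the function name and the choice of test statistic are illustrative, not Sandia's exact procedure.

```python
import numpy as np

def randomization_test(x, y, n_perm=10000, seed=0):
    """Two-sample randomization (permutation) test on the mean difference.

    Returns a two-sided p-value estimated by repeatedly reassigning the
    pooled observations to the two groups, with no distributional assumption.
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled = np.concatenate([x, y])
    observed = abs(x.mean() - y.mean())
    n_x = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)              # random relabelling of the pool
        diff = abs(pooled[:n_x].mean() - pooled[n_x:].mean())
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)    # add-one Monte Carlo correction
```

As the abstract notes, enumerating or sampling relabellings is cheap on modern workstations, which is what makes the approach attractive for growing monitoring datasets.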
Kobayashi, Toshiki; Orendurff, Michael S.; Singer, Madeline L.; Gao, Fan; Daly, Wayne K.; Foreman, K. Bo
2016-01-01
Background Genu recurvatum (knee hyperextension) is a common issue for individuals post stroke. Ankle-foot orthoses (AFOs) are used to improve genu recurvatum, but evidence is limited concerning their effectiveness. Therefore, the aim of this study was to investigate the effect of changing the plantarflexion resistance of an articulated ankle-foot orthosis on genu recurvatum in patients post stroke. Methods Gait analysis was performed on 6 individuals post stroke with genu recurvatum using an articulated ankle-foot orthosis whose plantarflexion resistance was adjustable at four levels. Gait data were collected using a Bertec split-belt instrumented treadmill in a 3-dimensional motion analysis laboratory. Gait parameters were extracted and plotted for each subject under the four plantarflexion resistance conditions of the ankle-foot orthosis. Gait parameters included: a) peak ankle plantarflexion angle, b) peak ankle dorsiflexion moment, c) peak knee extension angle and d) peak knee flexion moment. A non-parametric Friedman test was performed followed by post-hoc Wilcoxon signed-rank tests for statistical analyses. Findings All the gait parameters demonstrated statistically significant differences among the four resistance conditions of the AFO. Increasing the amount of plantarflexion resistance of the ankle-foot orthosis generally reduced genu recurvatum in all subjects. However, individual analyses showed that the responses to the changes in the plantarflexion resistance of the AFO were not necessarily linear, and appear unique to each subject. Interpretation The plantarflexion resistance of an articulated AFO should be adjusted to improve genu recurvatum in patients post stroke. Future studies should investigate what clinical factors would influence the individual differences. PMID:27136122
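The Friedman-plus-Wilcoxon workflow described here is available directly in SciPy. The sketch below runs it on simulated repeated-measures data standing in for six subjects under four resistance conditions; the gait values are invented for illustration, not the study's measurements.

```python
import numpy as np
from scipy import stats

# Hypothetical peak knee extension angles (degrees): 6 subjects (rows)
# under 4 plantarflexion-resistance conditions (columns), with higher
# resistance shifting the angle down by a constant amount per condition.
rng = np.random.default_rng(1)
base = rng.normal(10.0, 2.0, size=6)
data = np.column_stack([base + shift for shift in (0.0, -1.5, -3.0, -4.5)])

# Friedman test across the four repeated-measures conditions
stat, p = stats.friedmanchisquare(*data.T)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")

# Post-hoc pairwise Wilcoxon signed-rank tests with a Bonferroni factor
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
for i, j in pairs:
    w, p_ij = stats.wilcoxon(data[:, i], data[:, j])
    print(f"condition {i} vs {j}: corrected p = {min(p_ij * len(pairs), 1.0):.3f}")
```

With only six subjects the exact Wilcoxon two-sided p cannot fall below 2/64, which is why small repeated-measures studies like this one often report the omnibus Friedman result alongside per-subject plots.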
Roten, Linda Tømmerdal; Thomsen, Liv Cecilie Vestrheim; Gundersen, Astrid Solberg; Fenstad, Mona Høysæter; Odland, Maria Lisa; Strand, Kristin Melheim; Solberg, Per; Tappert, Christian; Araya, Elisabeth; Bærheim, Gunhild; Lyslo, Ingvill; Tollaksen, Kjersti; Bjørge, Line; Austgulen, Rigmor
2015-12-01
Preeclampsia is a major pregnancy complication without curative treatment available. A Norwegian Preeclampsia Family Cohort was established to provide a new resource for genetic and molecular studies aiming to improve the understanding of the complex pathophysiology of preeclampsia. Participants were recruited from five Norwegian hospitals after diagnoses of preeclampsia registered in the Medical Birth Registry of Norway were verified according to the study's inclusion criteria. Detailed obstetric information and information on personal and family disease history focusing on cardiovascular health was collected. At attendance, anthropometric measurements were registered and blood samples were drawn. The software package SPSS 19.0 for Windows was used to compute descriptive statistics such as mean and SD. P-values were computed based on t-test statistics for normally distributed variables. Nonparametric methods (chi-square) were used for categorical variables. A cohort consisting of 496 participants (355 females and 141 males) representing 137 families with increased occurrence of preeclampsia has been established, and blood samples are available for 477 participants. Descriptive analyses showed that about 60% of the index women's pregnancies with birth data registered were preeclamptic according to modern diagnosis criteria. We also found that about 41% of the index women experienced more than one preeclamptic pregnancy. In addition, the descriptive analyses confirmed that preeclamptic pregnancies are more often accompanied by delivery complications. The data and biological samples collected in this Norwegian Preeclampsia Family Cohort will provide an important basis for future research. Identification of preeclampsia susceptibility genes and new biomarkers may contribute to more efficient strategies to identify mothers "at risk" and contribute to the development of novel preventative therapies.
Bayesian Nonparametric Prediction and Statistical Inference
1989-09-07
Kadane, J. (1980), "Bayesian decision theory and the simplification of models," in Evaluation of Econometric Models, J. Kmenta and J. Ramsey, eds...the random model and weighted least squares regression," in Evaluation of Econometric Models, ed. by J. Kmenta and J. Ramsey, Academic Press, 197-217...likelihood function. On the other hand, H. Jeffreys's theory of hypothesis testing covers the most important situations in which the prior is not diffuse. See
Some New Approaches to Multivariate Probability Distributions.
1986-12-01
Krishnaiah (1977). The following example may serve as an illustration of this point. EXAMPLE 2. (Fréchet's bivariate continuous distribution...the error in the theorem of Prakasa Rao (1974) and to Dr. P.R. Krishnaiah for his valuable comments on the initial draft, his monumental patience and...M. and Proschan, F. (1984). Nonparametric Concepts and Methods in Reliability, Handbook of Statistics, 4, 613-655, (eds. P.R. Krishnaiah and P.K
ERIC Educational Resources Information Center
Monahan, Patrick O.; Ankenmann, Robert D.
2010-01-01
When the matching score is either less than perfectly reliable or not a sufficient statistic for determining latent proficiency in data conforming to item response theory (IRT) models, Type I error (TIE) inflation may occur for the Mantel-Haenszel (MH) procedure or any differential item functioning (DIF) procedure that matches on summed-item…
Landscape metrics for assessment of landscape destruction and rehabilitation.
Herzog, F; Lausch, A; Müller, E; Thulke, H H; Steinhardt, U; Lehmann, S
2001-01-01
This investigation tested the usefulness of geometry-based landscape metrics for monitoring landscapes in a heavily disturbed environment. Research was carried out in a 75 sq km study area in Saxony, eastern Germany, where the landscape has been affected by surface mining and agricultural intensification. Landscape metrics were calculated from digital maps (1912, 1944, 1973, 1989) for the entire study area and for subregions (river valleys, plains), which were defined using the original geology and topography of the region. Correlation and factor analyses were used to select a set of landscape metrics suitable for landscape monitoring. Little land-use change occurred in the first half of the century, but political decisions and technological developments led to considerable change later. Metrics showed a similar pattern with almost no change between 1912 and 1944, but dramatic changes after 1944. Nonparametric statistical methods were used to test whether metrics differed between river valleys and plains. Significant differences in the metrics for these regions were found in the early maps (1912, 1944), but these differences were not significant in 1973 or 1989. These findings indicate that anthropogenic influences created a more homogeneous landscape.
The precuneus may encode irrationality in human gambling.
Sacre, P; Kerr, M S D; Subramanian, S; Kahn, K; Gonzalez-Martinez, J; Johnson, M A; Sarma, S V; Gale, J T
2016-08-01
Humans often make irrational decisions, especially psychiatric patients who have dysfunctional cognitive and emotional circuitry. Understanding the neural basis of decision-making is therefore essential for patient management, yet current studies suffer from several limitations. Functional magnetic resonance imaging (fMRI) studies in humans have dominated decision-making neuroscience, but have poor temporal resolution, and the blood oxygenation level-dependent signal is only a proxy for neural activity. On the other hand, lesion studies in humans used to infer functionality in decision-making lack characterization of neural activity altogether. Using a combination of local field potential recordings in human subjects performing a financial decision-making task, spectral analyses, and non-parametric cluster statistics, we analyzed the activity in the precuneus. In nine subjects, the neural activity modulated significantly between rational and irrational trials in the precuneus (p < 0.001). In particular, high-frequency activity (70-100 Hz) increased when irrational decisions were made. Although preliminary, these results suggest suppression of gamma rhythms via electrical stimulation in the precuneus as a therapeutic intervention for pathological decision-making.
Mesgouez, C; Rilliard, F; Matossian, L; Nassiri, K; Mandel, E
2003-03-01
The aim of this study was to determine the influence of operator experience on the time needed for canal preparation when using a rotary nickel-titanium (Ni-Ti) system. A total of 100 simulated curved canals in resin blocks were used. Four operators prepared a total of 25 canals each. The operators included practitioners with prior experience of the preparation technique, and practitioners with no experience. The working length for each instrument was precisely predetermined. All canals were instrumented with rotary Ni-Ti ProFile Variable Taper Series 29 engine-driven instruments using a high-torque handpiece (Maillefer, Ballaigues, Switzerland). The time taken to prepare each canal was recorded. Significant differences between the operators were analysed using Student's t-test and the Kruskal-Wallis and Dunn nonparametric tests. Comparison of canal preparation times demonstrated a statistically significant difference between the four operators (P < 0.001). In the inexperienced group, a significant linear regression between canal number and preparation time occurred. Time required for canal preparation was inversely related to operator experience.
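The Kruskal-Wallis-plus-Dunn analysis reported here can be sketched with SciPy. The preparation times below are invented, and the Dunn-style follow-up is written out by hand as z statistics on the joint mean ranks with a normal approximation; this is the textbook form, not necessarily the exact variant the study's software used.

```python
import numpy as np
from scipy import stats

# Hypothetical canal-preparation times (seconds): four operators, 25
# simulated canals each, with experienced operators faster on average.
rng = np.random.default_rng(2)
times = [rng.normal(loc, 15.0, size=25) for loc in (120.0, 130.0, 180.0, 200.0)]

h, p = stats.kruskal(*times)
print(f"Kruskal-Wallis H = {h:.1f}, p = {p:.2e}")

# Dunn-style pairwise comparisons on the joint ranks (apply a
# multiplicity correction such as Bonferroni in practice).
pooled = np.concatenate(times)
ranks = stats.rankdata(pooled)
labels = np.repeat(np.arange(4), 25)
N = len(pooled)
mean_ranks = [ranks[labels == g].mean() for g in range(4)]
se = np.sqrt(N * (N + 1) / 12.0 * (1.0 / 25 + 1.0 / 25))
for i in range(4):
    for j in range(i + 1, 4):
        z = (mean_ranks[i] - mean_ranks[j]) / se
        print(f"operator {i + 1} vs {j + 1}: z = {z:+.2f}")
```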
Some analysis on the diurnal variation of rainfall over the Atlantic Ocean
NASA Technical Reports Server (NTRS)
Gill, T.; Perng, S.; Hughes, A.
1981-01-01
Data collected from the GARP Atlantic Tropical Experiment (GATE) were examined. The data were collected from 10,000 grid points arranged as a 100 x 100 array; each grid cell covered a 4 square km area. The amount of rainfall was measured every 15 minutes during the experiment periods using C-band radars. Two types of analyses were performed on the data: analysis of diurnal variation was done on each of the grid points based on the rainfall averages at noon and at midnight, and time series analysis on selected grid points based on the hourly averages of rainfall. Since there is no known distribution model which best describes the rainfall amount, nonparametric methods were used to examine the diurnal variation. The Kolmogorov-Smirnov test was used to test whether the rainfalls at noon and at midnight have the same statistical distribution. The Wilcoxon signed-rank test was used to test whether the noon rainfall is heavier than, equal to, or lighter than the midnight rainfall. These tests were done on each of the 10,000 grid points at which the data are available.
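Both tests named above are available in SciPy; the sketch below applies them to simulated rainfall for a single grid point. The gamma-distributed values and sample size are illustrative assumptions, not GATE data.

```python
import numpy as np
from scipy import stats

# Hypothetical paired rainfall averages (mm) at one grid point over 60
# experiment days; noon rain is simulated heavier than midnight rain.
rng = np.random.default_rng(3)
noon = rng.gamma(2.0, 2.0, size=60)
midnight = rng.gamma(2.0, 1.0, size=60)

# Kolmogorov-Smirnov: do noon and midnight rainfall share a distribution?
ks_stat, ks_p = stats.ks_2samp(noon, midnight)

# Wilcoxon signed-rank (one-sided): is noon rainfall systematically heavier?
w_stat, w_p = stats.wilcoxon(noon - midnight, alternative="greater")
print(f"KS p = {ks_p:.4f}, Wilcoxon one-sided p = {w_p:.4f}")
```

In the study this pair of tests would be repeated at each of the 10,000 grid points, which is exactly the kind of mass-testing setting where distribution-free methods are convenient.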
NASA Astrophysics Data System (ADS)
Yang, Yang; Peng, Zhike; Dong, Xingjian; Zhang, Wenming; Clifton, David A.
2018-03-01
A challenge in analysing non-stationary multi-component signals is to isolate nonlinearly time-varying signals, especially when they overlap in the time-frequency plane. In this paper, a framework integrating time-frequency analysis-based demodulation and a non-parametric Gaussian latent feature model is proposed to isolate and recover components of such signals. The former aims to remove high-order frequency modulation (FM) such that the latter is able to infer demodulated components while simultaneously discovering the number of the target components. The proposed method is effective in isolating multiple components that have the same FM behavior. In addition, the results show that the proposed method is superior to the generalised demodulation with singular-value decomposition-based method, the parametric time-frequency analysis with filter-based method and the empirical mode decomposition-based method, in recovering the amplitude and phase of superimposed components.
Yu, Jihnhee; Yang, Luge; Vexler, Albert; Hutson, Alan D
2016-06-15
The receiver operating characteristic (ROC) curve is a popular technique with applications, for example, in investigating the accuracy of a biomarker to delineate between disease and non-disease groups. A common measure of accuracy of a given diagnostic marker is the area under the ROC curve (AUC). In contrast with the AUC, the partial area under the ROC curve (pAUC) looks into the area with certain specificities (i.e., true negative rate) only, and it can often be clinically more relevant than examining the entire ROC curve. The pAUC is commonly estimated based on a U-statistic with the plug-in sample quantile, making the estimator a non-traditional U-statistic. In this article, we propose an accurate and easy method to obtain the variance of the nonparametric pAUC estimator. The proposed method is easy to implement for both a one-biomarker test and the comparison of two correlated biomarkers because it simply adapts the existing variance estimator of U-statistics. We show the accuracy and other advantages of the proposed variance estimation method by broadly comparing it with previously existing methods. Further, we develop an empirical likelihood inference method based on the proposed variance estimator through a simple implementation. In an application, we demonstrate that, depending on the inferences by either the AUC or pAUC, we can make a different decision on the prognostic ability of the same set of biomarkers. Copyright © 2016 John Wiley & Sons, Ltd.
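The pAUC itself is straightforward to estimate nonparametrically. The sketch below builds the empirical ROC from two samples and integrates sensitivity over a high-specificity band; it is a plug-in illustration of the quantity being studied, and does not reproduce the paper's U-statistic form or its variance estimator.

```python
import numpy as np

def partial_auc(disease, healthy, spec_range=(0.8, 1.0)):
    """Nonparametric pAUC over a clinically relevant specificity band.

    Builds the empirical ROC curve from the two score samples and
    integrates sensitivity over false-positive rates in
    [1 - spec_hi, 1 - spec_lo] with the trapezoidal rule.
    """
    disease = np.asarray(disease, float)
    healthy = np.asarray(healthy, float)
    thresholds = np.sort(np.concatenate([disease, healthy]))[::-1]
    fpr = np.array([np.mean(healthy >= t) for t in thresholds])  # non-decreasing
    tpr = np.array([np.mean(disease >= t) for t in thresholds])
    lo, hi = 1.0 - spec_range[1], 1.0 - spec_range[0]
    grid = np.linspace(lo, hi, 201)
    sens = np.interp(grid, fpr, tpr)
    return float(np.sum(0.5 * (sens[1:] + sens[:-1]) * np.diff(grid)))
```

A perfectly separating marker attains the maximum pAUC, which equals the width of the false-positive-rate band (0.2 for specificities 0.8 to 1.0).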
Orhan, Ozlem; Sagir, Mehmet; Zorba, Erdal
2013-06-01
This study compared the somatotype values of football players according to their playing positions. The study aimed to determine the physical profiles of players and to analyze the relationships between somatotypes and playing positions. Study participants were members of two teams in the Turkey Professional Football League, Gençlerbirligi Sports Team (GB) (N = 24) and Gençlerbirligi Oftas Sports Team (GBO) (N = 24). Anthropometric measurements of the players were performed according to techniques suggested by the Anthropometric Standardization Reference Manual (ASRM) and International Biological Program (IBP). In somatotype calculations, triceps, subscapular, supraspinale and calf skinfold thickness, humerus bicondylar, femur bicondylar, biceps circumference, calf circumference and body weight and height were used. Statistical analysis of the data was performed using GraphPad Prism Version 5.00 for Windows (GraphPad Software, San Diego, California, USA); somatotype calculations and analyses used the Somatotype 1.1 program and graphical representations of the results were produced. Non-parametric Mann-Whitney U tests (two independent samples) of the player data showed that there were no statistically significant differences between the two teams. The measurements indicated that, when all of the GB and GBO players were evaluated collectively, their average somatotypes were balanced mesomorph. The somatotypes of GBO goalkeepers were generally ectomorphic mesomorph; GB goalkeepers were balanced mesomorph, although they were slightly endomorphic.
Han, Hyemin; Glenn, Andrea L
2018-06-01
In fMRI research, the goal of correcting for multiple comparisons is to identify areas of activity that reflect true effects, and thus would be expected to replicate in future studies. Finding an appropriate balance between trying to minimize false positives (Type I error) while not being too stringent and omitting true effects (Type II error) can be challenging. Furthermore, the advantages and disadvantages of these types of errors may differ for different areas of study. In many areas of social neuroscience that involve complex processes and considerable individual differences, such as the study of moral judgment, effects are typically smaller and statistical power weaker, leading to the suggestion that less stringent corrections that allow for more sensitivity may be beneficial and also result in more false positives. Using moral judgment fMRI data, we evaluated four commonly used methods for multiple comparison correction implemented in Statistical Parametric Mapping 12 by examining which method produced the most precise overlap with results from a meta-analysis of relevant studies and with results from nonparametric permutation analyses. We found that voxelwise thresholding with familywise error correction based on Random Field Theory provides a more precise overlap (i.e., without omitting too few regions or encompassing too many additional regions) than either clusterwise thresholding, Bonferroni correction, or false discovery rate correction methods.
The long-term use of benzodiazepines: patients' views, accounts and experiences.
Barter, G; Cormack, M
1996-12-01
Although a decrease in new prescribing has occurred for anxiolytic benzodiazepines, concerns have been raised that a 'core' of long-term users has been left behind. Typically, elderly people represent this 'core', using the benzodiazepines as hypnotics. The present study focuses on the reasons why hypnotic benzodiazepines are used for protracted lengths of time. By examining patient experiences and cognitions, a deeper understanding may be gained of why patients continue to use benzodiazepines. Elderly, long-term users of benzodiazepine hypnotics were interviewed using a semi-structured interview procedure. A comparison group of non-users of the drugs were given a brief interview to collect comparative data. Interview data were analysed from transcripts using qualitative methodology; statistical comparisons between the groups were made using non-parametric statistics. The long-term users had significantly fewer hours of sleep per night than the non-users. There was some evidence of tolerance and a suggestion that symptoms of withdrawal were maintaining continued use. None of the long-term users had clear knowledge of what their doctors thought of their use of benzodiazepines. The data suggest that the power of the doctor may not be utilized to its full potential in the prevention of long-term use, that at least 50% of elderly benzodiazepine users would like to discontinue use, and that patients need information and advice on how to discontinue these drugs.
Building integral projection models: a user's guide.
Rees, Mark; Childs, Dylan Z; Ellner, Stephen P
2014-05-01
In order to understand how changes in individual performance (growth, survival or reproduction) influence population dynamics and evolution, ecologists are increasingly using parameterized mathematical models. For continuously structured populations, where some continuous measure of individual state influences growth, survival or reproduction, integral projection models (IPMs) are commonly used. We provide a detailed description of the steps involved in constructing an IPM, explaining how to: (i) translate your study system into an IPM; (ii) implement your IPM; and (iii) diagnose potential problems with your IPM. We emphasize how the study organism's life cycle, and the timing of censuses, together determine the structure of the IPM kernel and important aspects of the statistical analysis used to parameterize an IPM using data on marked individuals. An IPM based on population studies of Soay sheep is used to illustrate the complete process of constructing, implementing and evaluating an IPM fitted to sample data. We then look at very general approaches to parameterizing an IPM, using a wide range of statistical techniques (e.g. maximum likelihood methods, generalized additive models, nonparametric kernel density estimators). Methods for selecting models for parameterizing IPMs are briefly discussed. We conclude with key recommendations and a brief overview of applications that extend the basic model. The online Supporting Information provides commented R code for all our analyses. © 2014 The Authors. Journal of Animal Ecology published by John Wiley & Sons Ltd on behalf of British Ecological Society.
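The core of an IPM is a kernel K(z', z) combining survival-growth and fecundity, discretised (commonly by the midpoint rule) into a matrix whose dominant eigenvalue gives the asymptotic population growth rate. The sketch below uses invented vital-rate functions purely for illustration, not the Soay sheep rates estimated in the paper.

```python
import numpy as np
from scipy import stats

# Hypothetical vital rates on a size variable z (illustrative only)
def survival(z):
    return 1.0 / (1.0 + np.exp(-(z - 2.0)))            # logistic survival

def growth_pdf(z1, z):
    # density of size z1 next year given current size z
    return stats.norm.pdf(z1, loc=0.6 * z + 1.0, scale=0.5)

def fecundity(z1, z):
    # expected offspring count (0.5 * z) times offspring-size density
    return 0.5 * z * stats.norm.pdf(z1, loc=1.0, scale=0.3)

# Midpoint-rule discretisation of K(z1, z) = P(z1, z) + F(z1, z)
L, U, m = 0.0, 8.0, 100
edges = np.linspace(L, U, m + 1)
z = 0.5 * (edges[:-1] + edges[1:])                     # mesh midpoints
h = z[1] - z[0]

P = h * growth_pdf(z[:, None], z[None, :]) * survival(z[None, :])
F = h * fecundity(z[:, None], z[None, :])
K = P + F

lam = np.max(np.abs(np.linalg.eigvals(K)))             # asymptotic growth rate
print(f"lambda = {lam:.3f}")
```

A standard diagnostic from the paper's checklist applies here: widen [L, U] and refine the mesh until lambda stabilises, so that probability mass is not "evicted" at the size boundaries.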
Applications of quantum entropy to statistics
NASA Astrophysics Data System (ADS)
Silver, R. N.; Martz, H. F.
This paper develops two generalizations of the maximum entropy (ME) principle. First, Shannon classical entropy is replaced by von Neumann quantum entropy to yield a broader class of information divergences (or penalty functions) for statistics applications. Negative relative quantum entropy enforces convexity, positivity, non-local extensivity and prior correlations such as smoothness. This enables the extension of ME methods from their traditional domain of ill-posed inverse problems to new applications such as non-parametric density estimation. Second, given a choice of information divergence, a combination of ME and Bayes rule is used to assign both prior and posterior probabilities. Hyperparameters are interpreted as Lagrange multipliers enforcing constraints. Conservation principles, such as conservation of information and smoothness, are proposed to set statistical regularization and other hyperparameters. ME provides an alternative to hierarchical Bayes methods.
Adolescent cigarette smoking and health risk behavior.
Busen, N H; Modeland, V; Kouzekanani, K
2001-06-01
During the past 30 years, tobacco use among adolescents has substantially increased, resulting in major health problems associated with tobacco consumption. The purpose of this study was to identify adolescent smoking behaviors and to determine the relationship among smoking, specific demographic variables, and health risk behaviors. The sample consisted of 93 self-selecting adolescents. An ex post facto design was used for this study and data were analyzed by using nonparametric statistics. Findings included a statistically significant relationship between lifetime cigarette use and ethnicity. Statistically significant relationships were also found among current cigarette use and ethnicity, alcohol use, marijuana use, suicidal thoughts, and age at first sexual intercourse. Nurses and other providers must recognize that cigarette smoking may indicate other risk behaviors common among adolescents. Copyright 2001 by W.B. Saunders Company
Zornoza-Moreno, Matilde; Fuentes-Hernández, Silvia; Sánchez-Solis, Manuel; Rol, María Ángeles; Larqué, Elvira; Madrid, Juan Antonio
2011-05-01
The authors developed a method useful for home measurement of temperature, activity, and sleep rhythms in infants under normal-living conditions during their first 6 mos of life. In addition, parametric and nonparametric tests for assessing circadian system maturation in these infants were compared. Anthropometric parameters plus ankle skin temperature and activity were evaluated in 10 infants by means of two data loggers, Thermochron iButton (DS1291H, Maxim Integrated Products, Sunnyvale, CA) for temperature and HOBO Pendant G (Hobo Pendant G Acceleration, UA-004-64, Onset Computer Corporation, Bourne, MA) for motor activity, located in special baby socks specifically designed for the study. Skin temperature and motor activity were recorded over 3 consecutive days at 15 days, 1, 3, and 6 mos of age. Circadian rhythms of skin temperature and motor activity appeared at 3 mos in most babies. Mean skin temperature decreased significantly by 3 mos of life relative to previous measurements (p = .0001), whereas mean activity continued to increase during the first 6 mos. For most of the parameters analyzed, statistically significant changes occurred at 3-6 mos relative to 0.5-1 mo of age. Major differences were found using nonparametric tests. Intradaily variability in motor activity decreased significantly at 6 mos of age relative to previous measurements, and followed a similar trend for temperature; interdaily stability increased significantly at 6 mos of age relative to previous measurements for both variables; relative amplitude increased significantly at 6 mos for temperature and at 3 mos for activity, both with respect to previous measurements. A high degree of correlation was found between chronobiological parametric and nonparametric tests for mean and mesor and also for relative amplitude versus the cosinor-derived amplitude.
However, the correlation between parametric and nonparametric equivalent indices (acrophase and midpoint of M5, interdaily stability and Rayleigh test, or intradaily variability and P(1)/P(ultradian)), despite being significant, was lower for both temperature and activity. The circadian function index (CFI), based on the integrated variable temperature-activity, increased gradually with age and was statistically significant at 6 mos of age. At 6 mos, 90% of the infants' rest period coincided with the standard sleep period of their parents, defined from 23:00 to 07:00 h (dichotomic index I < O; when I < O = 100%, there is a complete coincidence between the infant nocturnal rest period and the standard rest period), whereas at 15 days of life the coincidence was only 75%. The combination of thermometry and actimetry using data loggers placed in infants' socks is a reliable method for assessing both variables and also sleep rhythms in infants under ambulatory conditions, with minimal disturbance. Using this methodological approach, circadian rhythms of skin temperature and motor activity appeared by 3 mos in most babies. Nonparametric tests provided more reliable information than cosinor analysis for circadian rhythm assessment in infants.
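Two of the nonparametric indices compared here, interdaily stability (IS) and intradaily variability (IV), have compact standard definitions from the actigraphy literature. The sketch below implements those commonly used forms on hourly data; it is a generic illustration, not necessarily the exact variant the authors computed.

```python
import numpy as np

def interdaily_stability(x, period=24):
    """IS: variance of the mean 24-h profile over total variance (0 to 1).

    Assumes x holds hourly values spanning a whole number of periods.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    profile = x.reshape(-1, period).mean(axis=0)       # mean daily profile
    num = n * np.sum((profile - x.mean()) ** 2)
    den = period * np.sum((x - x.mean()) ** 2)
    return num / den

def intradaily_variability(x):
    """IV: first-difference variance over total variance.

    Near 0 for a smooth rhythm; near 2 for uncorrelated noise.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    num = n * np.sum(np.diff(x) ** 2)
    den = (n - 1) * np.sum((x - x.mean()) ** 2)
    return num / den
```

A perfectly repeated daily profile gives IS = 1 and a small IV, while white noise drives IS toward 0 and IV toward 2, which is what makes the pair useful for tracking rhythm maturation.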
Dwivedi, Alok Kumar; Mallawaarachchi, Indika; Alvarado, Luis A
2017-06-30
Experimental studies in biomedical research frequently pose analytical problems related to small sample size. In such studies, there are conflicting findings regarding the choice of parametric and nonparametric analysis, especially with non-normal data. In such instances, some methodologists questioned the validity of parametric tests and suggested nonparametric tests. In contrast, other methodologists found nonparametric tests to be too conservative and less powerful and thus preferred using parametric tests. Some researchers have recommended using a bootstrap test; however, this method also has limitations with small sample sizes. We used a pooled method in the nonparametric bootstrap test that may overcome the problems related to small samples in hypothesis testing. The present study compared the nonparametric bootstrap test with pooled resampling method to the corresponding parametric, nonparametric, and permutation tests through extensive simulations under various conditions and using real data examples. The nonparametric pooled bootstrap t-test provided equal or greater power for comparing two means as compared with the unpaired t-test, Welch t-test, Wilcoxon rank sum test, and permutation test while maintaining the type I error probability under all conditions except for Cauchy and extreme variable lognormal distributions. In such cases, we suggest using an exact Wilcoxon rank sum test. The nonparametric bootstrap paired t-test also provided better performance than other alternatives. The nonparametric bootstrap test provided a benefit over the exact Kruskal-Wallis test. We suggest using the nonparametric bootstrap test with pooled resampling method for comparing paired or unpaired means and for validating the one-way analysis of variance test results for non-normal data in small sample size studies. Copyright © 2017 John Wiley & Sons, Ltd.
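The general idea of a pooled-resampling bootstrap test can be sketched briefly: resample both groups with replacement from the pooled data (imposing the null of a common distribution) and compare bootstrap t statistics with the observed one. This is a generic sketch of that idea, not the authors' exact algorithm, and the function name is hypothetical.

```python
import numpy as np

def pooled_bootstrap_t_test(x, y, n_boot=10000, seed=0):
    """Two-sample bootstrap t-test with pooled resampling (a sketch).

    Both bootstrap groups are drawn with replacement from the pooled
    sample, so resampling happens under the null of a shared distribution.
    """
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, float), np.asarray(y, float)

    def t_stat(a, b):  # Welch-type t statistic
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        return (a.mean() - b.mean()) / se

    observed = t_stat(x, y)
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_boot):
        bx = rng.choice(pooled, size=len(x), replace=True)
        by = rng.choice(pooled, size=len(y), replace=True)
        if abs(t_stat(bx, by)) >= abs(observed):
            count += 1
    return (count + 1) / (n_boot + 1)
```

Resampling with replacement from the pool (rather than permuting labels) is what distinguishes this from the permutation test the paper also benchmarks.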
Dugué, Audrey Emmanuelle; Pulido, Marina; Chabaud, Sylvie; Belin, Lisa; Gal, Jocelyn
2016-12-01
We describe how to estimate progression-free survival while dealing with interval-censored data in the setting of clinical trials in oncology. Three procedures with SAS and R statistical software are described: one allowing for a nonparametric maximum likelihood estimation of the survival curve using the EM-ICM (Expectation and Maximization-Iterative Convex Minorant) algorithm as described by Wellner and Zhan in 1997; a sensitivity analysis procedure in which the progression time is assigned (i) at the midpoint, (ii) at the upper limit (reflecting the standard analysis, where the progression time is assigned at the first radiologic exam showing progressive disease), or (iii) at the lower limit of the censoring interval; and finally, two multiple-imputation approaches, considering either a uniform distribution or the nonparametric maximum likelihood estimate (NPMLE). Clin Cancer Res; 22(23); 5629-35. ©2016 AACR.
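The sensitivity-analysis step is the simplest of the three procedures to illustrate: each progression time is only known to lie between two radiologic exams, and the analysis is repeated with the time assigned at the midpoint, upper limit, or lower limit of that interval. The intervals below are hypothetical.

```python
import numpy as np

# Each row is a censoring interval (left, right] in months between the
# last progression-free exam and the first exam showing progression.
intervals = np.array([[2.0, 4.0], [4.0, 6.0], [0.0, 2.0], [6.0, 8.0]])

assignments = {
    "midpoint": intervals.mean(axis=1),
    "upper":    intervals[:, 1],  # standard analysis: first exam showing PD
    "lower":    intervals[:, 0],
}
for name, t in assignments.items():
    print(f"{name:8s} median PFS = {np.median(t):.1f} months")
```

Comparing the three summaries shows how much the reported median progression-free survival depends on where inside the exam interval the event is placed.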
Pérez-Rodríguez, Paulino; Gianola, Daniel; González-Camacho, Juan Manuel; Crossa, José; Manès, Yann; Dreisigacker, Susanne
2012-01-01
In genome-enabled prediction, parametric, semi-parametric, and non-parametric regression models have been used. This study assessed the predictive ability of linear and non-linear models using dense molecular markers. The linear models were linear on marker effects and included the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B. The non-linear models (this refers to non-linearity on markers) were reproducing kernel Hilbert space (RKHS) regression, Bayesian regularized neural networks (BRNN), and radial basis function neural networks (RBFNN). These statistical models were compared using 306 elite wheat lines from CIMMYT genotyped with 1717 diversity array technology (DArT) markers and two traits, days to heading (DTH) and grain yield (GY), measured in each of 12 environments. It was found that the three non-linear models had better overall prediction accuracy than the linear regression specification. Results showed a consistent superiority of RKHS and RBFNN over the Bayesian LASSO, Bayesian ridge regression, Bayes A, and Bayes B models. PMID:23275882
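RKHS regression in this genome-enabled setting amounts to kernel ridge regression on a marker-derived kernel. The sketch below uses simulated binary markers and phenotypes as stand-ins for the CIMMYT wheat lines and DArT markers, with a Gaussian kernel and a fixed ridge penalty as assumptions.

```python
import numpy as np

# Simulated stand-ins: 120 lines, 300 binary markers, sparse additive effects
rng = np.random.default_rng(4)
n, p = 120, 300
X = rng.integers(0, 2, size=(n, p)).astype(float)
beta = rng.normal(0.0, 0.3, size=p) * (rng.random(p) < 0.1)
y = X @ beta + rng.normal(0.0, 0.3, size=n)              # phenotype

# Gaussian kernel over marker profiles, bandwidth from the median heuristic
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / np.median(D2))

train, test = np.arange(90), np.arange(90, 120)
lam = 1.0                                                # ridge penalty
alpha = np.linalg.solve(K[np.ix_(train, train)] + lam * np.eye(90), y[train])
pred = K[np.ix_(test, train)] @ alpha
corr = np.corrcoef(pred, y[test])[0, 1]
print(f"predictive correlation = {corr:.2f}")
```

In practice the bandwidth and penalty would be chosen by cross-validation (or given priors, as in the Bayesian treatments the paper compares), and predictive correlation would be averaged over many train/test partitions rather than the single split shown here.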
Deriving health utilities from the MacNew Heart Disease Quality of Life Questionnaire.
Chen, Gang; McKie, John; Khan, Munir A; Richardson, Jeff R
2015-10-01
Quality of life is included in the economic evaluation of health services by measuring the preference for health states, i.e. health state utilities. However, most intervention studies include a disease-specific, not a utility, instrument. Consequently, there has been increasing use of statistical mapping algorithms which permit utilities to be estimated from a disease-specific instrument. The present paper provides such algorithms between the MacNew Heart Disease Quality of Life Questionnaire (MacNew) instrument and six multi-attribute utility (MAU) instruments: the EuroQol (EQ-5D), the Short Form 6D (SF-6D), the Health Utilities Index (HUI) 3, the Quality of Well-Being (QWB), the 15D (15 Dimension) and the Assessment of Quality of Life (AQoL-8D). Heart disease patients and members of the healthy public were recruited from six countries. Non-parametric rank tests were used to compare subgroup utilities and MacNew scores. Mapping algorithms were estimated using three separate statistical techniques. Mapping algorithms achieved a high degree of precision. Based on the mean absolute error and the intraclass correlation, the preferred mapping is MacNew into SF-6D or 15D. Using the R-squared statistic, the preferred mapping is MacNew into AQoL-8D. The algorithms reported in this paper enable MacNew data to be mapped into utilities predicted from any of six instruments. This permits studies which have included the MacNew to be used in cost utility analyses which, in turn, allows the comparison of services with interventions across the health system. © The European Society of Cardiology 2014.
NASA Astrophysics Data System (ADS)
Desrini, Sufi; Ghiffary, Hifzhan Maulana
2018-04-01
Muntingia calabura L., also known locally as Talok or Kersen, is a plant which has been widely used as traditional medicine in Indonesia. In this study, we evaluated the antibacterial activity of ethanolic and n-hexane extracts of Muntingia calabura L. leaves against Propionibacterium acnes. Antibacterial activity of the extracts was determined using the agar well diffusion method. Each extract (2 mg/mL, 8 mg/mL, 20 mg/mL, 30 mg/mL, and 40 mg/mL) was tested against Propionibacterium acnes. The zones of inhibition of the ethanolic and n-hexane extracts were measured, compared, and analyzed using a statistical programme. Phytochemical analyses of the plants were carried out using thin layer chromatography (TLC). The average diameter of the zone of inhibition at the concentration of 2 mg/mL was 9.97 mm for the ethanolic extract, while the n-hexane extract at the same concentration showed 0 mm. The statistical analysis was non-parametric: a Kruskal-Wallis test, followed by Mann-Whitney tests to gauge the magnitude of the differences between concentrations among groups. The Kruskal-Wallis test revealed a significant result (p < 0.001). Based on the post hoc Mann-Whitney tests, there were statistically significant differences between each concentration of the ethanolic extract, the n-hexane extract, and the positive control group (p-value < 0.05). Both extracts have antibacterial activity against P. acnes. However, the ethanolic extract of Muntingia calabura L. inhibited the growth of Propionibacterium acnes better than the n-hexane extract.
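The Kruskal-Wallis step used above can be sketched in plain Python. The inhibition-zone values below are hypothetical, and the H statistic is computed with midranks for ties but without the tie correction factor:

```python
def kruskal_wallis_H(*groups):
    """Kruskal-Wallis H statistic (midranks for ties; no tie correction)."""
    data = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(data)
    # assign average (mid) ranks, 1-based
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and data[j + 1][0] == data[i][0]:
            j += 1
        avg = (i + j + 2) / 2          # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    # sum ranks within each group
    rank_sums = [0.0] * len(groups)
    for (v, gi), r in zip(data, ranks):
        rank_sums[gi] += r
    return 12.0 / (n * (n + 1)) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups)) - 3 * (n + 1)

ethanol = [9.9, 10.1, 12.4]   # hypothetical inhibition zones, mm
hexane = [0.0, 0.0, 0.1]
print(kruskal_wallis_H(ethanol, hexane))
```

Under the null, H is approximately chi-squared with (number of groups − 1) degrees of freedom; a significant H is then followed up with pairwise Mann-Whitney tests, as in the study.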
Nayak, Gurudutt; Singh, Inderpreet; Shetty, Shashit; Dahiya, Surya
2014-05-01
Apical extrusion of debris and irrigants during cleaning and shaping of the root canal is one of the main causes of periapical inflammation and postoperative flare-ups. The purpose of this study was to quantitatively measure the amount of debris and irrigants extruded apically in single-rooted canals using two reciprocating and one rotary single-file nickel-titanium instrumentation systems. Sixty human mandibular premolars, randomly assigned to three groups (n = 20), were instrumented using two reciprocating (Reciproc and WaveOne) and one rotary (OneShape) single-file nickel-titanium systems. Bidistilled water was used as irrigant with a traditional needle irrigation delivery system. Eppendorf tubes were used as the test apparatus for collection of debris and irrigant. The volume of extruded irrigant was collected and quantified via the 0.1-mL increment measure supplied on a disposable plastic insulin syringe. The liquid inside the tubes was dried and the mean weight of debris was assessed using an electronic microbalance. The data were statistically analysed using the Kruskal-Wallis nonparametric test and the Mann-Whitney U test with Bonferroni adjustment. P-values less than 0.05 were considered significant. The Reciproc file system produced significantly more debris compared with the OneShape file system (P<0.05), but no statistically significant difference was obtained between the two reciprocating instruments (P>0.05). Extrusion of irrigant was statistically insignificant irrespective of the instrument or instrumentation technique used (P>0.05). Although all systems caused apical extrusion of debris and irrigant, continuous rotary instrumentation was associated with less extrusion as compared with the use of reciprocating file systems.
Detecting trends in raptor counts: power and type I error rates of various statistical tests
Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.
1996-01-01
We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
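The core of the simulation above, exponential counts with log-normal sampling error and a trend recovered by linear regression on the log scale, can be sketched as follows; the sample sizes and noise level mirror the abstract, but the constants are illustrative:

```python
import math
import random

def log_trend_slope(counts):
    """OLS slope of log(count) on year; exp(slope) - 1 is the % change/year."""
    years = list(range(len(counts)))
    logs = [math.log(c) for c in counts]
    n = len(counts)
    my, ml = sum(years) / n, sum(logs) / n
    sxy = sum((y - my) * (l - ml) for y, l in zip(years, logs))
    sxx = sum((y - my) ** 2 for y in years)
    return sxy / sxx

random.seed(1)
# simulate 50 years of counts with a 5%/year trend and log-normal sampling
# error (sd on the log scale chosen roughly to mimic a CV near 40%)
counts = [100 * 1.05 ** t * math.exp(random.gauss(0, 0.4)) for t in range(50)]
slope = log_trend_slope(counts)
print(f"estimated change/year: {math.exp(slope) - 1:.1%}")
```

Repeating this over many replicates, and testing the slope against zero at a chosen alpha, gives the empirical power and type I error rates the study tabulates; adding autocorrelation to the noise reproduces the alpha inflation the authors report.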
A Nonparametric Statistical Approach to the Validation of Computer Simulation Models
1985-11-01
Ballistic Research Laboratory, the Experimental Design and Analysis Branch of the Systems Engineering and Concepts Analysis Division was funded to... Winter, E. M., Wisemiller, D. P., and Ujihara, J. K., "Verification and Validation of Engineering Simulations with Minimal Data," Proceedings of the 1976 Summer... used by numerous authors. Law has augmented their approach with specific suggestions for each of the three stages: 1. develop high face-validity
Neural network representation and learning of mappings and their derivatives
NASA Technical Reports Server (NTRS)
White, Halbert; Hornik, Kurt; Stinchcombe, Maxwell; Gallant, A. Ronald
1991-01-01
Discussed here are recent theorems proving that artificial neural networks are capable of approximating an arbitrary mapping and its derivatives as accurately as desired. This fact forms the basis for further results establishing the learnability of the desired approximations, using results from non-parametric statistics. These results have potential applications in robotics, chaotic dynamics, control, and sensitivity analysis. An example involving learning the transfer function and its derivatives for a chaotic map is discussed.
Dong, Qi; Elliott, Michael R; Raghunathan, Trivellore E
2014-06-01
Outside of the survey sampling literature, samples are often assumed to be generated by a simple random sampling process that produces independent and identically distributed (IID) samples. Many statistical methods are developed largely in this IID world. Application of these methods to data from complex sample surveys without making allowance for the survey design features can lead to erroneous inferences. Hence, much time and effort have been devoted to developing statistical methods that analyze complex survey data and account for the sample design. This issue is particularly important when generating synthetic populations using finite population Bayesian inference, as is often done in missing data or disclosure risk settings, or when combining data from multiple surveys. By extending previous work in the finite population Bayesian bootstrap literature, we propose a method to generate synthetic populations from a posterior predictive distribution in a fashion that inverts the complex sampling design features and generates simple random samples from a superpopulation point of view, adjusting the complex data so that they can be analyzed as simple random samples. We consider a simulation study with a stratified, clustered, unequal-probability of selection sample design, and use the proposed nonparametric method to generate synthetic populations for the 2006 National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS), which are stratified, clustered, unequal-probability of selection sample designs.
Kernel-based whole-genome prediction of complex traits: a review.
Morota, Gota; Gianola, Daniel
2014-01-01
Prediction of genetic values has been a focus of applied quantitative genetics since the beginning of the 20th century, with renewed interest following the advent of the era of whole genome-enabled prediction. Opportunities offered by the emergence of high-dimensional genomic data fueled by post-Sanger sequencing technologies, especially molecular markers, have driven researchers to extend Ronald Fisher and Sewall Wright's models to confront new challenges. In particular, kernel methods are gaining consideration as a regression method of choice for genome-enabled prediction. Complex traits are presumably influenced by many genomic regions working in concert with others (clearly so when considering pathways), thus generating interactions. Motivated by this view, a growing number of statistical approaches based on kernels attempt to capture non-additive effects, either parametrically or non-parametrically. This review centers on whole-genome regression using kernel methods applied to a wide range of quantitative traits of agricultural importance in animals and plants. We discuss various kernel-based approaches tailored to capturing total genetic variation, with the aim of arriving at an enhanced predictive performance in the light of available genome annotation information. Connections between prediction machines born in animal breeding, statistics, and machine learning are revisited, and their empirical prediction performance is discussed. Overall, while some encouraging results have been obtained with non-parametric kernels, recovering non-additive genetic variation in a validation dataset remains a challenge in quantitative genetics.
Cabrieto, Jedelyn; Tuerlinckx, Francis; Kuppens, Peter; Grassmann, Mariel; Ceulemans, Eva
2017-06-01
Change point detection in multivariate time series is a complex task since, next to the mean, the correlation structure of the monitored variables may also alter when change occurs. DeCon was recently developed to detect such changes in mean and/or correlation by combining a moving windows approach and robust PCA. However, in the literature, several other methods have been proposed that employ other non-parametric tools: E-divisive, Multirank, and KCP. Since these methods use different statistical approaches, two issues need to be tackled. First, applied researchers may find it hard to appraise the differences between the methods. Second, a direct comparison of the relative performance of all these methods for capturing change points signaling correlation changes is still lacking. Therefore, we present the basic principles behind DeCon, E-divisive, Multirank, and KCP and the corresponding algorithms, to make them more accessible to readers. We further compared their performance through extensive simulations using the settings of Bulteel et al. (Biological Psychology, 98 (1), 29-42, 2014), implying changes in mean and in correlation structure, and those of Matteson and James (Journal of the American Statistical Association, 109 (505), 334-345, 2014), implying different numbers of (noise) variables. KCP emerged as the best method in almost all settings. However, in the case of more than two noise variables, only DeCon performed adequately in detecting correlation changes.
Observed changes in relative humidity and dew point temperature in coastal regions of Iran
NASA Astrophysics Data System (ADS)
Hosseinzadeh Talaee, P.; Sabziparvar, A. A.; Tabari, Hossein
2012-12-01
The analysis of trends in hydroclimatic parameters and assessment of their statistical significance have recently received much attention in efforts to clarify whether or not there is an obvious climate change. In the current study, the parametric linear regression and nonparametric Mann-Kendall tests were applied to detect annual and seasonal trends in the relative humidity (RH) and dew point temperature (Tdew) time series at ten coastal weather stations in Iran during 1966-2005. The serial structure of the data was considered, and significant serial correlations were eliminated using the trend-free pre-whitening method. The results showed that annual RH increased by 1.03 and 0.28 %/decade at the northern and southern coastal regions of the country, respectively, while annual Tdew increased by 0.29 and 0.15 °C/decade at the northern and southern regions, respectively. Significant trends were frequent in the Tdew series, but were observed in only 2 of the 50 RH series. The difference between the results of the parametric and nonparametric tests was small, although the parametric test detected larger significant trends in the RH and Tdew time series. Furthermore, the differences between the results of the trend tests were not related to the normality of the statistical distribution.
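The nonparametric Mann-Kendall test used above reduces to a sign statistic over all pairs of observations; a minimal sketch (using the no-ties variance formula, with no tie correction) is:

```python
def mann_kendall(x):
    """Mann-Kendall trend test: S statistic and normal-approximation Z."""
    n = len(x)
    # S counts concordant minus discordant pairs over all i < j
    S = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var = n * (n - 1) * (2 * n + 5) / 18.0   # variance of S assuming no ties
    if S > 0:
        z = (S - 1) / var ** 0.5             # continuity correction
    elif S < 0:
        z = (S + 1) / var ** 0.5
    else:
        z = 0.0
    return S, z
```

For a series with a monotonic upward trend, Z exceeds the 1.96 threshold of a two-sided 5% test; real applications on autocorrelated climate series would first apply pre-whitening, as the study does.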
Medical literature searches: a comparison of PubMed and Google Scholar.
Nourbakhsh, Eva; Nugent, Rebecca; Wang, Helen; Cevik, Cihan; Nugent, Kenneth
2012-09-01
Medical literature searches provide critical information for clinicians. However, the best strategy for identifying relevant high-quality literature is unknown. We compared search results using PubMed and Google Scholar on four clinical questions and analysed these results with respect to article relevance and quality. Abstracts from the first 20 citations for each search were classified into three relevance categories. We used the weighted kappa statistic to analyse reviewer agreement and nonparametric rank tests to compare the number of citations for each article and the corresponding journals' impact factors. Reviewers ranked 67.6% of PubMed articles and 80% of Google Scholar articles as at least possibly relevant (P = 0.116) with high agreement (all kappa P-values < 0.01). Google Scholar articles had a higher median number of citations (34 vs. 1.5, P < 0.0001) and came from higher impact factor journals (5.17 vs. 3.55, P = 0.036). PubMed searches and Google Scholar searches often identify different articles. In this study, Google Scholar articles were more likely to be classified as relevant, had higher numbers of citations and were published in higher impact factor journals. The identification of frequently cited articles using Google Scholar for searches probably has value for initial literature searches. © 2012 The authors. Health Information and Libraries Journal © 2012 Health Libraries Group.
Plante, David T; Landsness, Eric C; Peterson, Michael J; Goldstein, Michael R; Riedner, Brady A; Wanger, Timothy; Guokas, Jeffrey J; Tononi, Giulio; Benca, Ruth M
2012-09-18
Sleep disturbance plays an important role in major depressive disorder (MDD). Prior investigations have demonstrated that slow wave activity (SWA) during sleep is altered in MDD; however, results have not been consistent across studies, which may be due in part to sex-related differences in SWA and/or limited spatial resolution of spectral analyses. This study sought to characterize SWA in MDD utilizing high-density electroencephalography (hdEEG) to examine the topography of SWA across the cortex in MDD, as well as sex-related variation in SWA topography in the disorder. All-night recordings with 256 channel hdEEG were collected in 30 unipolar MDD subjects (19 women) and 30 age and sex-matched control subjects. Spectral analyses of SWA were performed to determine group differences. SWA was compared between MDD and controls, including analyses stratified by sex, using statistical non-parametric mapping to correct for multiple comparisons of topographic data. As a group, MDD subjects demonstrated significant increases in all-night SWA primarily in bilateral prefrontal channels. When stratified by sex, MDD women demonstrated global increases in SWA relative to age-matched controls that were most consistent in bilateral prefrontal regions; however, MDD men showed no significant differences relative to age-matched controls. Further analyses demonstrated increased SWA in MDD women was most prominent in the first portion of the night. Women, but not men with MDD demonstrate significant increases in SWA in multiple cortical areas relative to control subjects. Further research is warranted to investigate the role of SWA in MDD, and to clarify how increased SWA in women with MDD is related to the pathophysiology of the disorder.
Rodríguez-Entrena, Macario; Schuberth, Florian; Gelhard, Carsten
2018-01-01
Structural equation modeling using partial least squares (PLS-SEM) has become a mainstream modeling approach in various disciplines. Nevertheless, the prior literature still lacks practical guidance on how to properly test for differences between parameter estimates. Whereas existing techniques, such as parametric and non-parametric approaches in PLS multi-group analysis, only allow assessing differences between parameters estimated for different subpopulations, the study at hand introduces a technique that also allows assessing whether two parameter estimates derived from the same sample are statistically different. To illustrate this advancement to PLS-SEM, we refer to a reduced version of the well-established technology acceptance model.
Statistical methods for astronomical data with upper limits. II - Correlation and regression
NASA Technical Reports Server (NTRS)
Isobe, T.; Feigelson, E. D.; Nelson, P. I.
1986-01-01
Statistical methods for calculating correlations and regressions in bivariate censored data where the dependent variable can have upper or lower limits are presented. Cox's regression and the generalization of Kendall's rank correlation coefficient provide significant levels of correlations, and the EM algorithm, under the assumption of normally distributed errors, and its nonparametric analog using the Kaplan-Meier estimator, give estimates for the slope of a regression line. Monte Carlo simulations demonstrate that survival analysis is reliable in determining correlations between luminosities at different bands. Survival analysis is applied to CO emission in infrared galaxies, X-ray emission in radio galaxies, H-alpha emission in cooling cluster cores, and radio emission in Seyfert galaxies.
Examination of influential observations in penalized spline regression
NASA Astrophysics Data System (ADS)
Türkan, Semra
2013-10-01
In parametric or nonparametric regression models, the results of regression analysis are affected by anomalous observations in the data set. Thus, detection of these observations is one of the major steps in regression analysis. Such observations can be detected precisely by well-known influence measures, one of which is Pena's statistic. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. Real and artificial data are used to illustrate the effectiveness of Pena's statistic relative to Cook's distance in detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's distance for detecting these observations in large data sets.
Chiu, Chun-Huo; Wang, Yi-Ting; Walther, Bruno A; Chao, Anne
2014-09-01
It is difficult to accurately estimate species richness if there are many almost undetectable species in a hyper-diverse community. Practically, an accurate lower bound for species richness is preferable to an inaccurate point estimator. The traditional nonparametric lower bound developed by Chao (1984, Scandinavian Journal of Statistics 11, 265-270) for individual-based abundance data uses only the information on the rarest species (the numbers of singletons and doubletons) to estimate the number of undetected species in samples. Applying a modified Good-Turing frequency formula, we derive an approximate formula for the first-order bias of this traditional lower bound. The approximate bias is estimated by using additional information (namely, the numbers of tripletons and quadrupletons). This approximate bias can be corrected, and an improved lower bound is thus obtained. The proposed lower bound is nonparametric in the sense that it is universally valid for any species abundance distribution. A similar type of improved lower bound can be derived for incidence data. We test our proposed lower bounds on simulated data sets generated from various species abundance models. Simulation results show that the proposed lower bounds always reduce bias over the traditional lower bounds and improve accuracy (as measured by mean squared error) when the heterogeneity of species abundances is relatively high. We also apply the proposed new lower bounds to real data for illustration and for comparisons with previously developed estimators. © 2014, The International Biometric Society.
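The traditional Chao (1984) lower bound, and the improved bound that additionally uses tripletons and quadrupletons, can be sketched as follows. The iChao1 expression below is my reading of the estimator described in the abstract, so treat it as an assumption rather than the paper's exact formula:

```python
from collections import Counter

def chao1_bounds(abundances):
    """Chao1 lower bound and improved (iChao1-style) bound for species richness.
    abundances: observed count for each detected species."""
    freq = Counter(abundances)                # f[k] = number of species seen k times
    s_obs = len(abundances)
    f1, f2, f3, f4 = freq[1], freq[2], freq[3], freq[4]
    # traditional lower bound: uses only singletons and doubletons
    if f2 > 0:
        chao1 = s_obs + f1 * f1 / (2 * f2)
    else:
        chao1 = s_obs + f1 * (f1 - 1) / 2     # bias-corrected form when f2 = 0
    # improved bound: adds a first-order bias correction from f3 and f4
    ichao1 = chao1
    if f4 > 0:
        ichao1 += (f3 / (4 * f4)) * max(f1 - f2 * f3 / (2 * f4), 0)
    return chao1, ichao1
```

As the abstract notes, the improved bound is never below the traditional one, since the added correction term is non-negative.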
Rights, Jason D; Sterba, Sonya K
2016-11-01
Multilevel data structures are common in the social sciences. Often, such nested data are analysed with multilevel models (MLMs) in which heterogeneity between clusters is modelled by continuously distributed random intercepts and/or slopes. Alternatively, the non-parametric multilevel regression mixture model (NPMM) can accommodate the same nested data structures through discrete latent class variation. The purpose of this article is to delineate analytic relationships between NPMM and MLM parameters that are useful for understanding the indirect interpretation of the NPMM as a non-parametric approximation of the MLM, with relaxed distributional assumptions. We define how seven standard and non-standard MLM specifications can be indirectly approximated by particular NPMM specifications. We provide formulas showing how the NPMM can serve as an approximation of the MLM in terms of intraclass correlation, random coefficient means and (co)variances, heteroscedasticity of residuals at level 1, and heteroscedasticity of residuals at level 2. Further, we discuss how these relationships can be useful in practice. The specific relationships are illustrated with simulated graphical demonstrations, and direct and indirect interpretations of NPMM classes are contrasted. We provide an R function to aid in implementing and visualizing an indirect interpretation of NPMM classes. An empirical example is presented and future directions are discussed. © 2016 The British Psychological Society.
A Comparison of Japan and U.K. SF-6D Health-State Valuations Using a Non-Parametric Bayesian Method.
Kharroubi, Samer A
2015-08-01
There is interest in the extent to which valuations of health may differ between different countries and cultures, but few studies have compared preference values of health states obtained in different countries. We sought to estimate and compare two directly elicited valuations for SF-6D health states between the Japan and U.K. general adult populations using Bayesian methods. We analysed data from two SF-6D valuation studies where, using similar standard gamble protocols, values for 241 and 249 states were elicited from representative samples of the Japan and U.K. general adult populations, respectively. We estimate a function applicable across both countries that explicitly accounts for the differences between them, and is estimated using data from both countries. The results suggest that differences in SF-6D health-state valuations between the Japan and U.K. general populations are potentially important. The magnitude of these country-specific differences in health-state valuation depended, however, in a complex way on the levels of individual dimensions. The new Bayesian non-parametric method is a powerful approach for analysing data from multiple nationalities or ethnic groups, to understand the differences between them and potentially to estimate the underlying utility functions more efficiently.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
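The repeated partitioning step at the heart of CART can be made concrete with a minimal regression-tree sketch: for a single continuous predictor, choose the cut point that minimizes the summed squared error of the two child means. The data and function name are illustrative:

```python
def best_split(x, y):
    """One CART regression-tree step: the cut point on x minimizing the
    summed squared error around the two child means."""
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    pairs = sorted(zip(x, y))
    best_err, best_cut = float("inf"), None
    for i in range(1, len(pairs)):
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2   # midpoint between neighbours
        left = [yy for xx, yy in pairs if xx <= cut]
        right = [yy for xx, yy in pairs if xx > cut]
        err = sse(left) + sse(right)
        if err < best_err:
            best_err, best_cut = err, cut
    return best_cut
```

A full CART implementation applies this search over all predictors, recursively within each child, and then prunes; for categorical outcomes, a classification tree replaces squared error with an impurity measure such as the Gini index.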
Advanced statistical methods for improved data analysis of NASA astrophysics missions
NASA Technical Reports Server (NTRS)
Feigelson, Eric D.
1992-01-01
The investigators under this grant studied ways to improve the statistical analysis of astronomical data. They looked at existing techniques, the development of new techniques, and the production and distribution of specialized software to the astronomical community. Abstracts of nine papers that were produced are included, as well as brief descriptions of four software packages. The articles that are abstracted discuss analytical and Monte Carlo comparisons of six different linear least squares fits, a (second) paper on linear regression in astronomy, two reviews of public domain software for the astronomer, subsample and half-sample methods for estimating sampling distributions, a nonparametric estimation of survival functions under dependent competing risks, censoring in astronomical data due to nondetections, an astronomy survival analysis computer package called ASURV, and improving the statistical methodology of astronomical data analysis.
Does bad inference drive out good?
Marozzi, Marco
2015-07-01
The (mis)use of statistics in practice is widely debated, and a field where the debate is particularly active is medicine. Many scholars emphasize that a large proportion of published medical research contains statistical errors. It has been noted that top class journals like Nature Medicine and The New England Journal of Medicine publish a considerable proportion of papers that contain statistical errors and poorly document the application of statistical methods. This paper joins the debate on the (mis)use of statistics in the medical literature. Even though the validation process of a statistical result may be quite elusive, a careful assessment of underlying assumptions is central in medicine as well as in other fields where a statistical method is applied. Unfortunately, a careful assessment of underlying assumptions is missing in many papers, including those published in top class journals. In this paper, it is shown that nonparametric methods are good alternatives to parametric methods when the assumptions for the latter ones are not satisfied. A key point to solve the problem of the misuse of statistics in the medical literature is that all journals have their own statisticians to review the statistical method/analysis section in each submitted paper. © 2015 Wiley Publishing Asia Pty Ltd.
Comparison of Salmonella enteritidis phage types isolated from layers and humans in Belgium in 2005.
Welby, Sarah; Imberechts, Hein; Riocreux, Flavien; Bertrand, Sophie; Dierick, Katelijne; Wildemauwe, Christa; Hooyberghs, Jozef; Van der Stede, Yves
2011-08-01
The aim of this study was to investigate the available results for Belgium of the European Union coordinated monitoring program (2004/665 EC) on Salmonella in layers in 2005, as well as the results of the monthly outbreak reports of Salmonella Enteritidis in humans in 2005, to identify possible statistically significant trends in both populations. Separate descriptive statistics and univariate analyses were carried out, and parametric and/or non-parametric hypothesis tests were conducted. A time cluster analysis was performed for all Salmonella Enteritidis phage types (PTs) isolated. The proportions of each Salmonella Enteritidis PT in layers and in humans were compared and the monthly distribution of the most common PT, isolated in both populations, was evaluated. The time cluster analysis revealed significant clusters during the months of May and June for layers and May, July, August, and September for humans. PT21, the most frequently isolated PT in both populations in 2005, seemed to be responsible for these significant clusters. PT4 was the second most frequently isolated PT. No significant difference was found in the monthly trend evolution of either PT in the two populations based on parametric and non-parametric methods. A similar monthly trend of PT distribution in humans and layers during the year 2005 was observed. The time cluster analysis and the statistical significance testing confirmed these results. Moreover, the time cluster analysis showed significant clusters during the summer, slightly delayed in time (humans after layers). These results suggest a common link between the prevalence of Salmonella Enteritidis in layers and the occurrence of the pathogen in humans. Phage typing was confirmed to be a useful tool for identifying temporal trends.
Cignini, Pietro; Giorlandino, Maurizio; Brutti, Pierpaolo; Mangiafico, Lucia; Aloisi, Alessia; Giorlandino, Claudio
2016-01-01
Objective: To establish reference charts for fetal cerebellar vermis height in an unselected population. Methods: A prospective cross-sectional study between September 2009 and December 2014 was carried out at ALTAMEDICA Fetal–Maternal Medical Centre, Rome, Italy. Of 25203 fetal biometric measurements, 12167 (48%) measurements of the cerebellar vermis were available. After excluding 1562 (12.8%) measurements, a total of 10605 (87.2%) fetuses were considered and analyzed once only. Parametric and nonparametric quantile regression models were used for the statistical analysis. In order to evaluate the robustness of the proposed reference charts under various distributional assumptions on the ultrasound measurements at hand, we compared the gestational age-specific reference curves produced by the two statistical methods. Normal mean height based on parametric and nonparametric methods was defined for each week of gestation, and the regression equation expressing the height of the cerebellar vermis as a function of gestational age was calculated. Finally, the correlation between dimension and gestational age was measured. Results: The mean height of the cerebellar vermis was 12.7 mm (SD, 1.6 mm; 95% confidence interval, 12.7–12.8 mm). The regression equation expressing the height of the cerebellar vermis as a function of gestational age was: height (mm) = -4.85 + 0.78 × gestational age (weeks). The correlation between dimension and gestational age was expressed by the coefficient r = 0.87. Conclusion: This is the first prospective cross-sectional study on fetal cerebellar vermis biometry with such a large sample size reported in the literature. It is a detailed statistical survey and contains new centile-based reference charts for fetal cerebellar vermis height measurements. PMID:26812238
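The reported regression equation can be evaluated directly; the helper name below is my own, not from the paper.

```python
def vermis_height_mm(ga_weeks):
    """Cerebellar vermis height predicted by the study's regression:
    height (mm) = -4.85 + 0.78 * gestational age (weeks)."""
    return -4.85 + 0.78 * ga_weeks
```

At roughly 22.5 weeks of gestation the equation reproduces the cohort's reported mean height of 12.7 mm.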
On the Mean Squared Error of Nonparametric Quantile Estimators under Random Right-Censorship.
1986-09-01
[Scanned report-documentation page; mostly illegible security-classification boilerplate (UNCLASSIFIED, distribution unlimited). The only recoverable abstract fragment reads: "... in Section 3, and the result for the kernel estimator Qn is derived in Section 4. It should be mentioned that the order statistic methods used by ..."]
H2(15)O or 13NH3 PET and electromagnetic tomography (LORETA) during partial status epilepticus.
Zumsteg, D; Wennberg, R A; Treyer, V; Buck, A; Wieser, H G
2005-11-22
The authors evaluated the feasibility and source localization utility of H2(15)O or 13NH3 PET and low-resolution electromagnetic tomography (LORETA) in three patients with partial status epilepticus (SE). Results were correlated with findings from intraoperative electrocorticographic recordings and surgical outcomes. PET studies of cerebral blood flow and noninvasive source modeling with LORETA using statistical nonparametric mapping provided useful information for localizing the ictal activity in patients with partial SE.
Estimation of variance in Cox's regression model with shared gamma frailties.
Andersen, P K; Klein, J P; Knudsen, K M; Tabanera y Palacios, R
1997-12-01
The Cox regression model with a shared frailty factor allows for unobserved heterogeneity or for statistical dependence between the observed survival times. Estimation in this model when the frailties are assumed to follow a gamma distribution is reviewed, and we address the problem of obtaining variance estimates for regression coefficients, frailty parameter, and cumulative baseline hazards using the observed nonparametric information matrix. A number of examples are given comparing this approach with fully parametric inference in models with piecewise constant baseline hazards.
2014-10-02
defined by Eqs. (3)–(4) (Greenwell & Finch, 2004; Kar & Mohanty, 2006). The p value provides the metric for novelty scoring: p = Q_KS(z) = 2 Σ_{j=1}^∞ (−1)^{j−1} exp(−2j²z²) ... provides early detection of degradation and the ability to score its significance in order to inform maintenance planning and consequently reduce disruption ... actionable information; signals are typically processed from raw measurements into a reduced-dimension novelty summary value that may be more easily ...
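The truncated excerpt appears to use the standard Kolmogorov-Smirnov asymptotic significance series; a reconstruction of that function (standard KS asymptotics, not code taken from the paper) is:

```python
import math

def q_ks(z, terms=100):
    """Kolmogorov-Smirnov significance function:
    Q_KS(z) = 2 * sum_{j=1..inf} (-1)^(j-1) * exp(-2 * j^2 * z^2).
    A small Q_KS(z) flags a statistically significant (novel) deviation.
    """
    if z <= 0:
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * z * z)
    return 2.0 * total
```

At z ≈ 1.36, the familiar 5% critical point of the KS statistic, the series returns ≈ 0.049.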
Binquet, C; Abrahamowicz, M; Mahboubi, A; Jooste, V; Faivre, J; Bonithon-Kopp, C; Quantin, C
2008-12-30
Flexible survival models, which avoid assumptions about proportionality of hazards (PH) or linearity of continuous covariates' effects, bring the issues of model selection to a new level of complexity. Each 'candidate covariate' requires inter-dependent decisions regarding (i) its inclusion in the model, and representation of its effects on the log hazard as (ii) either constant over time or time-dependent (TD) and, for continuous covariates, (iii) either loglinear or non-loglinear (NL). Moreover, 'optimal' decisions for one covariate depend on the decisions regarding others. Thus, some efficient model-building strategy is necessary. We carried out an empirical study of the impact of the model selection strategy on the estimates obtained in flexible multivariable survival analyses of prognostic factors for mortality in 273 gastric cancer patients. We used 10 different strategies to select alternative multivariable parametric as well as spline-based models, allowing flexible modeling of non-parametric (TD and/or NL) effects. We employed 5-fold cross-validation to compare the predictive ability of alternative models. All flexible models indicated significant non-linearity and changes over time in the effect of age at diagnosis. Conventional 'parametric' models suggested the lack of a period effect, whereas more flexible strategies indicated a significant NL effect. Cross-validation confirmed that flexible models predicted mortality better. The resulting differences in the 'final model' selected by the various strategies also had an impact on risk prediction for individual subjects. Overall, our analyses underline (a) the importance of accounting for significant non-parametric effects of covariates and (b) the need for developing accurate model selection strategies for flexible survival analyses. Copyright 2008 John Wiley & Sons, Ltd.
Lal, Devyani; Keim, Paul; Delisle, Josie; Barker, Bridget; Rank, Matthew A; Chia, Nicholas; Schupp, James M; Gillece, John D; Cope, Emily K
2017-06-01
The role of microbiota in sinonasal inflammation can be further understood by targeted sampling of healthy and diseased subjects. We compared the microbiota of the middle meatus (MM) and inferior meatus (IM) in healthy, allergic rhinitis (AR), and chronic rhinosinusitis (CRS) subjects to characterize intrasubject, intersubject, and intergroup differences. Subjects were recruited in the office and classified into healthy, AR, and CRS groups. Endoscopically-guided swab samples were obtained from the MM and IM bilaterally. Bacterial microbiota were characterized by sequencing the V3-V4 region of the 16S ribosomal RNA (rRNA) gene. Intersubject microbiome analyses were conducted in 65 subjects: 8 healthy, 11 AR, and 46 CRS (25 CRS with nasal polyps [CRSwNP]; 21 CRS without nasal polyps [CRSsNP]). Intrasubject analyses were conducted for 48 individuals (4 controls, 11 AR, 8 CRSwNP, and 15 CRSsNP). There was considerable intersubject microbiota variability, but intrasubject profiles were similar (p = 0.001, nonparametric t test). Intrasubject bacterial diversity was significantly reduced in the MM of CRSsNP subjects compared to IM samples (p = 0.022, nonparametric t test). CRSsNP MM samples were enriched in Streptococcus, Haemophilus, and Fusobacterium spp. but exhibited loss of diversity compared to healthy, CRSwNP, and AR subject samples (p < 0.05; nonparametric t test). CRSwNP patients were enriched in Staphylococcus, Alloiococcus, and Corynebacterium spp. This study presents the sinonasal microbiome profile in one of the larger populations of non-CRS and CRS subjects, and is the first office-based cohort in the literature. In contrast to healthy, AR, and CRSwNP subjects, CRSsNP MM samples exhibited decreased microbiome diversity and anaerobic enrichment. CRSsNP MM samples had reduced diversity compared to same-subject IM samples, a novel finding. © 2017 ARS-AAOA, LLC.
Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend; Dørup, Jens; Lous, Jørgen
2016-01-01
We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore whether the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care centers. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also associated with the parents' and day care teachers' reports on the children's vocabulary, sentence construction, and pronunciation. Type B tympanograms were associated with a mean hearing level 5-6 dB below that of type A, C1, or C2. In the graphic analysis, Galker scores were closely and significantly related to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding of spoken words in noise, and it correlated well with the day care teachers' reports and less well with the parents' reports. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
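Goodman-Kruskal's gamma, the ordinal association measure used throughout the study above, can be sketched as follows (my own minimal illustration, not the authors' code; tied pairs are simply ignored, as in the basic definition):

```python
def goodman_kruskal_gamma(x, y):
    """Goodman-Kruskal's gamma: (C - D) / (C + D), where C and D count
    concordant and discordant pairs; pairs tied on either variable
    contribute to neither count."""
    c = d = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                c += 1      # pair ordered the same way on both variables
            elif prod < 0:
                d += 1      # pair ordered oppositely
    if c + d == 0:
        return 0.0
    return (c - d) / (c + d)
```

Gamma ranges from -1 (perfect disagreement in ordering) through 0 (no ordinal association) to +1 (perfect agreement), which makes values such as the reported G=0.35 directly interpretable.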
Touw, D J; Vinks, A A; Neef, C
1997-06-01
The availability of personal computer programs for individualizing drug dosage regimens has stimulated interest in modelling population pharmacokinetics. Data from 82 adolescent and adult patients with cystic fibrosis (CF) who were treated with intravenous tobramycin because of an exacerbation of their pulmonary infection were analysed with a non-parametric expectation maximization (NPEM) algorithm. This algorithm estimates the entire discrete joint probability density of the pharmacokinetic parameters. It also provides traditional parametric statistics such as means, standard deviations, medians, covariances and correlations among the various parameters, as well as graphic 2- and 3-dimensional representations of the marginal densities of the parameters investigated. Several models for intravenous tobramycin in adolescent and adult patients with CF were compared. Covariates were total body weight (for the volume of distribution) and creatinine clearance (for the total body clearance and elimination rate). Because of a lack of data on patients with poor renal function, restricted models with the non-renal clearance and the non-renal elimination rate constant fixed at literature values of 0.15 L/h and 0.01 h-1 were also included. In this population, intravenous tobramycin was best described by a median (+/-dispersion factor) volume of distribution per unit of total body weight of 0.28 +/- 0.05 L/kg, an elimination rate constant of 0.25 +/- 0.10 h-1 and an elimination rate constant per unit of creatinine clearance of 0.0008 +/- 0.0009 h-1/(ml/min/1.73 m2). Analysis of populations of increasing size showed that, using a restricted model with the non-renal elimination rate constant fixed at 0.01 h-1, a model based on a population of only 10 to 20 patients contained parameter values similar to those of the entire population, whereas, using the full model, a larger population (at least 40 patients) was needed.
Gender Wage Disparities among the Highly Educated.
Black, Dan A; Haviland, Amelia; Sanders, Seth G; Taylor, Lowell J
2008-01-01
In the U.S. college-educated women earn approximately 30 percent less than their non-Hispanic white male counterparts. We conduct an empirical examination of this wage disparity for four groups of women (non-Hispanic white, black, Hispanic, and Asian) using the National Survey of College Graduates, a large data set that provides unusually detailed information on higher-level education. Nonparametric matching analysis indicates that among men and women who speak English at home, between 44 and 73 percent of the gender wage gaps are accounted for by such pre-market factors as highest degree and major. When we restrict attention further to women who have "high labor force attachment" (i.e., work experience that is similar to male comparables) we account for 54 to 99 percent of gender wage gaps. Our nonparametric approach differs from familiar regression-based decompositions, so for the sake of comparison we conduct parametric analyses as well. Inferences drawn from these latter decompositions can be quite misleading.
NASA Astrophysics Data System (ADS)
Houssein, Hend A. A.; Jaafar, M. S.; Ramli, R. M.; Ismail, N. E.; Ahmad, A. L.; Bermakai, M. Y.
2010-07-01
In this study, the subpopulations of human blood parameters including lymphocytes, the mid-cell fractions (eosinophils, basophils, and monocytes), and granulocytes were determined by electronic sizing in the Health Centre of Universiti Sains Malaysia. These parameters were correlated with human blood characteristics such as age, gender, ethnicity, and blood type, before and after irradiation with a 0.95 mW He-Ne laser (λ = 632.8 nm). The correlations were obtained by pattern finding, paired non-parametric tests, and independent non-parametric tests using SPSS version 11.5, together with centroid and peak positions and flux variations. The findings show that the centroid and peak positions, flux peak, and total flux were strongly correlated and can serve as significant indicators for blood analyses. Furthermore, the encircled flux analysis demonstrated good future prospects in blood research, leading the way as a vibrant diagnostic tool for clarifying diseases associated with blood.
DNN-state identification of 2D distributed parameter systems
NASA Astrophysics Data System (ADS)
Chairez, I.; Fuentes, R.; Poznyak, A.; Poznyak, T.; Escudero, M.; Viana, L.
2012-02-01
There are many examples in science and engineering which are reduced to a set of partial differential equations (PDEs) through a process of mathematical modelling. Nevertheless there exist many sources of uncertainties around the aforementioned mathematical representation. Moreover, to find exact solutions of those PDEs is not a trivial task especially if the PDE is described in two or more dimensions. It is well known that neural networks can approximate a large set of continuous functions defined on a compact set to an arbitrary accuracy. In this article, a strategy based on the differential neural network (DNN) for the non-parametric identification of a mathematical model described by a class of two-dimensional (2D) PDEs is proposed. The adaptive laws for weights ensure the 'practical stability' of the DNN-trajectories to the parabolic 2D-PDE states. To verify the qualitative behaviour of the suggested methodology, here a non-parametric modelling problem for a distributed parameter plant is analysed.
Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS).
Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M; Khan, Ajmal
2016-01-01
This article compares the study designs and statistical methods used in the 2005, 2010 and 2015 volumes of the Pakistan Journal of Medical Sciences (PJMS). Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods was found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and Student's t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over the time period: t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. This study shows that a diverse variety of statistical methods has been used in the research articles of PJMS and that their frequency increased from 2005 to 2015. Descriptive statistics was the most frequent method of statistical analysis in the published articles, while the cross-sectional design was the most common study design.
Developing appropriate methods for cost-effectiveness analysis of cluster randomized trials.
Gomes, Manuel; Ng, Edmond S-W; Grieve, Richard; Nixon, Richard; Carpenter, James; Thompson, Simon G
2012-01-01
Cost-effectiveness analyses (CEAs) may use data from cluster randomized trials (CRTs), where the unit of randomization is the cluster, not the individual. However, most studies use analytical methods that ignore clustering. This article compares alternative statistical methods for accommodating clustering in CEAs of CRTs. Our simulation study compared the performance of statistical methods for CEAs of CRTs with 2 treatment arms. The study considered a method that ignored clustering, seemingly unrelated regression (SUR) without a robust standard error (SE), and 4 methods that recognized clustering: SUR and generalized estimating equations (GEEs), both with robust SE, a "2-stage" nonparametric bootstrap (TSB) with shrinkage correction, and a multilevel model (MLM). The base case assumed CRTs with moderate numbers of balanced clusters (20 per arm) and normally distributed costs. Other scenarios included CRTs with few clusters, imbalanced cluster sizes, and skewed costs. Performance was reported as bias, root mean squared error (rMSE), and confidence interval (CI) coverage for estimating incremental net benefits (INBs). We also compared the methods in a case study. Each method reported low levels of bias. Without the robust SE, SUR gave poor CI coverage (base case: 0.89 v. nominal level: 0.95). The MLM and TSB performed well in each scenario (CI coverage, 0.92-0.95). With few clusters, the GEE and SUR (with robust SE) had coverage below 0.90. In the case study, the mean INBs were similar across all methods, but ignoring clustering underestimated statistical uncertainty and the value of further research. MLMs and the TSB are appropriate analytical methods for CEAs of CRTs with the characteristics described. SUR and GEE are not recommended for studies with few clusters.
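The resampling scheme behind the "2-stage" nonparametric bootstrap can be sketched as below. This is a bare-bones illustration of cluster-then-individual resampling for a mean cost (names are mine), and it omits the shrinkage correction the authors apply.

```python
import random

def two_stage_bootstrap(clusters, n_boot=1000, seed=0):
    """Percentile 95% CI for the overall mean via a two-stage bootstrap:
    stage 1 resamples whole clusters with replacement, stage 2 resamples
    individuals within each sampled cluster. The shrinkage correction
    described in the paper is omitted in this sketch."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_boot):
        resample = []
        for _ in range(len(clusters)):
            cluster = rng.choice(clusters)           # stage 1: clusters
            resample.extend(rng.choice(cluster)      # stage 2: members
                            for _ in range(len(cluster)))
        means.append(sum(resample) / len(resample))
    means.sort()
    return means[int(0.025 * n_boot)], means[int(0.975 * n_boot)]
```

Resampling clusters first is what propagates between-cluster variation into the interval; resampling individuals alone, as if the data were independent, is exactly the mistake the paper warns against.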
Nayak, Gurudutt; Singh, Inderpreet; Shetty, Shashit; Dahiya, Surya
2014-01-01
Objective: Apical extrusion of debris and irrigants during cleaning and shaping of the root canal is one of the main causes of periapical inflammation and postoperative flare-ups. The purpose of this study was to quantitatively measure the amount of debris and irrigant extruded apically in single-rooted canals using two reciprocating and one rotary single-file nickel-titanium instrumentation systems. Materials and Methods: Sixty human mandibular premolars, randomly assigned to three groups (n = 20), were instrumented using two reciprocating (Reciproc and Wave One) and one rotary (One Shape) single-file nickel-titanium systems. Bidistilled water was used as the irrigant with a traditional needle irrigation delivery system. Eppendorf tubes were used as the test apparatus for collection of debris and irrigant. The volume of extruded irrigant was collected and quantified via the 0.1-mL increment measure supplied on a disposable plastic insulin syringe. The liquid inside the tubes was dried and the mean weight of debris was assessed using an electronic microbalance. The data were statistically analysed using the Kruskal-Wallis nonparametric test and the Mann-Whitney U test with Bonferroni adjustment. P-values less than 0.05 were considered significant. Results: The Reciproc file system produced significantly more debris compared with the One Shape file system (P<0.05), but no statistically significant difference was obtained between the two reciprocating instruments (P>0.05). Extrusion of irrigant was statistically insignificant irrespective of the instrument or instrumentation technique used (P>0.05). Conclusions: Although all systems caused apical extrusion of debris and irrigant, continuous rotary instrumentation was associated with less extrusion than the reciprocating file systems. PMID:25628665
Beloeil, Helene; Slim, Karem
2018-02-15
Sustainability of enhanced recovery programmes (ERPs) is a challenge, and data on the subject are scarce. The aim of this study was to assess whether the application of enhanced recovery elements through the Francophone Group of Enhanced Recovery after Surgery (Grace) in anaesthesia management was sustained 2 years after implementation. We conducted a retrospective analysis of the prospective Grace database between October 2014 and October 2016. The evolution of each recommendation item over time was analysed using the non-parametric Spearman correlation coefficient. A total of 67 and 43 centres, corresponding to 2067 and 3022 patients, participated in the Grace audit in colorectal and orthopaedic surgery, respectively. Colorectal surgery: mean length of stay was 5 (±4) days and the readmission rate was 6.6%. Application of most items did not change statistically. It worsened over time for PONV prophylaxis (P=0.01) and prevention of intraoperative hypothermia (P=0.02), and improved for NSAID administration (P=0.01). Orthopaedic surgery: mean length of stay was 3 (±2) days and the readmission rate was 1.7%. There was a trend towards improvement for most items. It reached statistical significance for PONV prophylaxis (P=0.001) and limited preoperative fasting (P=0.01). While the use of a perineural catheter decreased over time (P=0.001), infiltration of the surgical site statistically increased (P=0.05). This study shows on a large scale a trend towards less application of all ERP items over time. Continuous audits should be encouraged if further improvements are to be achieved. Copyright © 2018 Société française d'anesthésie et de réanimation (Sfar). Published by Elsevier Masson SAS. All rights reserved.
Bergamo, Ana Zn; Nelson-Filho, Paulo; Romano, Fábio L; da Silva, Raquel Ab; Saraiva, Maria Cp; da Silva, Lea Ab; Matsumoto, Mirian An
2016-12-01
The aim of this study was to evaluate the alterations in plaque index (PI), gingival index (GI), gingival bleeding index (GBI), and gingival crevicular fluid (GCF) volume after use of three different bracket types for 60 days. Setting and participants: The sample comprised 20 patients of both sexes aged 11-15 years (mean age: 13.3 years), with permanent dentition, adequate oral hygiene, and mild tooth crowding, overjet, and overbite. A conventional metallic bracket, Gemini™, and two different brands of self-ligating brackets, In-Ovation® R and SmartClip™, were bonded to the maxillary incisors and canines. PI, GI, and GBI scores and GCF volume were measured before and 30 and 60 days after bonding of the brackets. Data were analysed statistically using non-parametric tests at a 5% significance level. There was no statistically significant correlation (P > 0.05) between tooth crowding, overjet, and overbite and the PI, GI, and GBI scores and GCF volume before bonding, indicating no influence of malocclusion on the clinical parameters. Regardless of bracket design, no statistically significant difference (P > 0.05) was found for GI and GBI scores. PI and GCF volume showed a significant difference among the brackets in different periods. In pairwise comparisons, a significant difference was observed between before bonding and 60 days after bonding for the teeth bonded with the SmartClip™ self-ligating bracket (PI, P = 0.009; GCF volume, P = 0.001). There was an increase in PI score and GCF volume 60 days after bonding of SmartClip™ self-ligating brackets, indicating the influence of bracket design on these clinical parameters.
Lindholm, C; Gustavsson, A; Jönsson, L; Wimo, A
2013-05-01
Because the prevalence of many brain disorders rises with age, and brain disorders are costly, the economic burden of brain disorders will increase markedly during the next decades. The purpose of this study is to analyze how the costs to society vary with different levels of functioning and with the presence of a brain disorder. Resource utilization and costs from a societal viewpoint were analyzed versus cognition, activities of daily living (ADL), instrumental activities of daily living (IADL), brain disorder diagnosis and age in a population-based cohort of people aged 65 years and older in Nordanstig in Northern Sweden. Descriptive statistics, non-parametric bootstrapping and a generalized linear model (GLM) were used for the statistical analyses. Most people were zero users of care. Societal costs of dementia were by far the highest, ranging from SEK 262,000 (mild) to SEK 519,000 per year (severe dementia). In univariate analysis, all measures of functioning were significantly related to costs. When controlling for ADL and IADL in the multivariate GLM, cognition did not have a statistically significant effect on total cost. The presence of a brain disorder did not impact total cost when controlling for function. The greatest shift in costs was seen when comparing no dependency in ADL with dependency in one basic ADL function. It is the level of functioning, rather than the presence of a brain disorder diagnosis, that predicts costs. ADLs are better explanatory variables of costs than the Mini-Mental State Examination. Most people in a population-based cohort are zero users of care. Copyright © 2012 John Wiley & Sons, Ltd.
How to Evaluate Phase Differences between Trial Groups in Ongoing Electrophysiological Signals
VanRullen, Rufin
2016-01-01
A growing number of studies endeavor to reveal periodicities in sensory and cognitive functions, by comparing the distribution of ongoing (pre-stimulus) oscillatory phases between two (or more) trial groups reflecting distinct experimental outcomes. A systematic relation between the phase of spontaneous electrophysiological signals, before a stimulus is even presented, and the eventual result of sensory or cognitive processing for that stimulus, would be indicative of an intrinsic periodicity in the underlying neural process. Prior studies of phase-dependent perception have used a variety of analytical methods to measure and evaluate phase differences, and there is currently no established standard practice in this field. The present report intends to address this need, by systematically comparing the statistical power of various measures of "phase opposition" between two trial groups, in a number of real and simulated experimental situations. Seven measures were evaluated: one parametric test (circular Watson-Williams test), and three distinct measures of phase opposition (phase bifurcation index, phase opposition sum, and phase opposition product) combined with two procedures for non-parametric statistical testing (permutation, or a combination of z-score and permutation). While these are obviously not the only existing or conceivable measures, they have all been used in recent studies. All tested methods performed adequately on a previously published dataset (Busch et al., 2009). On a variety of artificially constructed datasets, no single measure was found to surpass all others; instead, the suitability of each measure was contingent on several experimental factors: the time, frequency, and depth of oscillatory phase modulation; the absolute and relative amplitudes of post-stimulus event-related potentials for the two trial groups; the absolute and relative trial numbers for the two groups; and the number of permutations used for non-parametric testing. 
The concurrent use of two phase opposition measures, the parametric Watson-Williams test and a non-parametric test based on summing inter-trial coherence values for the two trial groups, appears to provide the most satisfactory outcome in all situations tested. Matlab code is provided to automatically compute these phase opposition measures. PMID:27683543
Design of a sediment data-collection program in Kansas as affected by time trends
Jordan, P.R.
1985-01-01
Data collection programs need to be re-examined periodically in order to ensure their usefulness, efficiency, and applicability. The possibility of time trends in sediment concentration, in particular, makes examination with new statistical techniques desirable. After adjusting sediment concentrations for their relation to streamflow rates and by using a seasonal adaptation of Kendall's nonparametric statistical test, time trends of flow-adjusted concentrations were detected for 11 of the 38 sediment records tested that were not affected by large reservoirs. Ten of the 11 trends were toward smaller concentrations; only 1 was toward larger concentrations. Of the apparent trends that were not statistically significant (0.05 level) using available data, nearly all were toward smaller concentrations. Because the reason for the lack of statistical significance of an apparent trend may be inadequacy of data rather than absence of trend, and because of the prevalence of apparent trends in one direction, the assumption was made that a time trend may be present at any station. This assumption can significantly affect the design of a sediment data collection program. Sudden decreases (step trends) in flow-adjusted sediment concentrations were found at all stations that were short distances downstream from large reservoirs and that had adequate data for a seasonal adaptation of Wilcoxon's nonparametric statistical test. Examination of sediment records in the 1984 data collection program of the Kansas Water Office indicated 13 stations that can be discontinued temporarily because data are now adequate. Data collection could be resumed in 1992, when new data may be needed because of possible time trends. New data are needed at eight previously operated stations where existing data may be inadequate or misleading because of time trends. Operational changes may be needed at some stations, such as hiring contract observers or installing automatic pumping samplers. 
Implementing the changes in the program can provide a substantial increase in the quantity of useful information on stream sediment at the 1984 funding level. (Author's abstract)
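The seasonal Kendall test used above sums the Mann-Kendall S statistic over seasons (e.g., calendar months) so that seasonal cycles do not masquerade as trends. A simplified sketch, omitting the tie and serial-correlation corrections used in full implementations:

```python
import math

def mann_kendall_s(x):
    # Mann-Kendall S: concordant minus discordant pairs in a time series
    s = 0
    for i in range(len(x) - 1):
        for j in range(i + 1, len(x)):
            s += (x[j] > x[i]) - (x[j] < x[i])
    return s

def seasonal_kendall_z(seasons):
    # Sum S and Var(S) across seasons (no tie correction: a simplification).
    # seasons: list of per-season series, e.g. one value per year per month.
    S = sum(mann_kendall_s(x) for x in seasons)
    var = sum(n * (n - 1) * (2 * n + 5) / 18.0 for n in map(len, seasons))
    if S > 0:
        return (S - 1) / math.sqrt(var)
    if S < 0:
        return (S + 1) / math.sqrt(var)
    return 0.0
```

A |z| above 1.96 corresponds to significance at the 0.05 level used in the study.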
Violations of Gutenberg-Richter Relation in Anthropogenic Seismicity
NASA Astrophysics Data System (ADS)
Urban, Pawel; Lasocki, Stanislaw; Blascheck, Patrick; do Nascimento, Aderson Farias; Van Giang, Nguyen; Kwiatek, Grzegorz
2016-05-01
Anthropogenic seismicity (AS) is the undesired dynamic rockmass response to technological processes. AS environments are shallow; hence their heterogeneities have an important impact on AS. Moreover, AS is controlled by complex and changeable technological factors. This complicated origin explains why models used for tectonic seismicity may not be suitable for AS. We study here four cases of AS, testing statistically whether the magnitudes follow the Gutenberg-Richter relation. The considered cases include data from the Mponeng gold mine in South Africa, data observed during stimulation of the geothermal well Basel 1 in Switzerland, data from the Acu water reservoir region in Brazil, and data from the Song Tranh 2 hydropower plant region in Vietnam. The cases differ in inducing technologies, in the duration of the recording periods, and in the ranges of magnitudes. In all four cases the observed frequency-magnitude distributions differ statistically significantly from the Gutenberg-Richter relation. Although in all cases the Gutenberg-Richter b value changed in time, this factor turns out not to be responsible for the discovered deviations from the exponential distribution model implied by the Gutenberg-Richter relation. Though the deviations from the Gutenberg-Richter law are not large, they substantially diminish the accuracy of assessments of seismic hazard parameters. It is demonstrated that the use of non-parametric kernel estimators of the magnitude distribution functions significantly improves the accuracy of hazard estimates; these estimators are therefore recommended for probabilistic analyses of seismic hazard caused by AS.
King, Christopher R
2016-11-01
To date, neither the optimal radiotherapy dose nor the existence of a dose-response has been established for salvage radiotherapy (SRT). A systematic review from 1996 to 2015 and a meta-analysis were performed to identify the pathologic, clinical, and treatment factors associated with relapse-free survival (RFS) after SRT (uniformly defined as a PSA > 0.2 ng/mL or rising above the post-SRT nadir). A sigmoidal dose-response curve was objectively fitted, and a non-parametric statistical test was used to determine significance. 71 studies (10,034 patients) satisfied the meta-analysis criteria. SRT dose (p=0.0001), PSA prior to SRT (p=0.0009), ECE+ (p=0.039), and SV+ (p=0.046) had significant associations with RFS. Statistical analyses confirmed the independence of the SRT dose-response. Omission of series with ADT did not alter the results. The dose-response is well fit by a sigmoidal curve (p=0.0001) with a TCD50 of 65.8 Gy, with a dose of 70 Gy achieving 58.4% RFS vs. 38.5% for 60 Gy. A 2.0% [95% CI 1.1-3.2] improvement in RFS is achieved for each Gy. The SRT dose-response remarkably parallels that for definitive RT of localized disease. This study provides level 2a evidence for dose-escalated SRT > 70 Gy. The presence of an SRT dose-response for microscopic disease supports the hypothesis that prostate cancer is inherently radioresistant. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
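The reported figures are consistent with a logistic dose-response: with TCD50 = 65.8 Gy, a slope of roughly 0.08 per Gy reproduces approximately 58.4% RFS at 70 Gy and approximately 38.6% at 60 Gy. A crude grid-search least-squares fit can sketch how such a curve is recovered from (dose, RFS) points; the slope value and the fitting procedure here are illustrative assumptions, not taken from the paper.

```python
import math

def logistic_rfs(dose, tcd50, k):
    # sigmoidal dose-response: relapse-free survival as a function of dose
    return 1.0 / (1.0 + math.exp(-k * (dose - tcd50)))

def fit_dose_response(points, tcd_grid, k_grid):
    # crude least-squares grid search over (TCD50, slope) candidates
    best = None
    for t in tcd_grid:
        for k in k_grid:
            sse = sum((r - logistic_rfs(d, t, k)) ** 2 for d, r in points)
            if best is None or sse < best[0]:
                best = (sse, t, k)
    return best[1], best[2]
```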
McGinty, S M; Cicero, M C; Cicero, J M; Schultz-Janney, L; Williams-Shipman, K L
2001-06-01
In 1997, only 22% of licensed physical therapists living in California were members of the American Physical Therapy Association (APTA). This 1998 study was designed to identify the reason(s) why most licensed physical therapists in California choose not to belong to their profession's national association and to examine the demographics of nonmembers. The subjects were a random sample of 400 California licensed physical therapists who were not members of APTA. The survey instrument included a demographic questionnaire and statements for response using a 5-point Likert-type scale. Frequency distributions were calculated for responses and demographic data. Nonparametric analyses were used to determine statistical significance. Chi-square analysis was used to compare responses to statements by gender and by full-time versus part-time work status. Spearman rank correlation coefficients were used to determine any relationships between demographic data (eg, gender and work status). The Mann-Whitney U test was used to determine any differences in responses to specific representation questions by those respondents who worked in those environments. All statistical tests were 2-tailed tests conducted at the P ≤ .05 level, unless otherwise indicated. Means, standard deviations, and ranges were used where appropriate. There was a 67% response rate. Eighty-nine percent of the respondents had been members of APTA. Eighty-eight percent of the respondents believed that APTA national dues were too high, and 90% thought California Chapter dues were too high. Cost was the primary reason given for APTA nonmembership in California.
Gea-Caballero, Vicente; Castro-Sánchez, Enrique; Júarez-Vela, Raúl; Díaz-Herrera, Miguel Ángel; de Miguel-Montoya, Isabel; Martínez-Riera, José Ramón
Nursing work environments are key determinants of care quality. Our study aimed to evaluate the characteristics of nursing environments in primary care settings in the Canary Islands and to identify crucial components of such environments to improve quality. We conducted a cross-sectional study in primary care organisations using the Practice Environment Scale - Nursing Work Index tool. We collected sociodemographic variables and scores, and selected the essential items conducive to optimal care. Appropriate parametric and non-parametric statistical tests were used to analyse relations between variables (CI = 95%, error = 5%). One hundred and forty-four nurses participated. The mean total score was 81.6. The mean scores for the five dimensions included in the Practice Environment Scale - Nursing Work Index ranged from 2.25 to 2.92. Twelve key items for quality of care were selected; six were positive in the Canary Islands, two were mixed, and four were negative. Seven of the 12 items belonged to Dimension 2 (fundamentals of nursing). Being a manager was statistically associated with higher scores (p < .001). Years of experience was inversely associated with scores on the 12 items (p < .021). Nursing work environments in primary care settings in the Canary Islands are comparable to others previously studied in Spain. Areas to improve were human resources and the participation of nurses in management decisions. Nurse managers must be knowledgeable about their working environments so they can focus on improvements in key dimensions. Copyright © 2017 Elsevier España, S.L.U. All rights reserved.
Seroprevalence of retrovirus in North American captive macropodidae.
Georoff, Timothy A; Joyner, Priscilla H; Hoover, John P; Payton, Mark E; Pogranichniy, Roman M
2008-09-01
Laboratory records of serology results from captive macropodidae sampled between 1997 and 2005 were reviewed to assess the seroprevalence of retrovirus exposure. Serum samples from 269 individuals (136 males, 133 females) representing 10 species of macropods housed in 31 North American captive collections were analyzed for retrovirus antibody using an indirect immunofluorescent assay. The prevalence of positive antibody titers in males versus females, between species, between age groups, and among animals with identified parentage was examined by nonparametric statistical analyses. Median age of animals at time of sample collection was 36 mo (range 2-201 mo). The total percentage seropositive was 20.4%. Serum antibody was detected in 31 of 47 (66.0%) tammar wallaby (Macropus eugenii), nine of 24 (37.5%) yellow-footed rock wallaby (Petrogale xanthopus), four of 11 (36.4%) swamp wallaby (Wallabia bicolor), 10 of 80 (12.5%) red-necked wallaby (Macropus rufogriseus), and one of 54 (1.9%) parma wallaby (Macropus parma). No individuals of western gray kangaroo (n=3) (Macropus fuliginosus), eastern gray kangaroo (n=19) (Macropus giganteus), common wallaroo (n=6) (Macropus robustus), red kangaroo (n=11) (Macropus rufus), or Matschie's tree kangaroo (n=14) (Dendrolagus matschiei) were positive for retrovirus antibody. These results demonstrate that five species of captive macropods have a history of exposure to retrovirus, with the highest percentage seropositive and highest statistical correlation in M. eugenii (pair-wise Fisher's exact test, alpha = 0.05). Additionally, one wild-caught M. eugenii was confirmed seropositive during the quarantine period, indicating that retrovirus exposure may exist in wild populations.
VARIABLE SELECTION IN NONPARAMETRIC ADDITIVE MODELS
Huang, Jian; Horowitz, Joel L.; Wei, Fengrong
2010-01-01
We consider a nonparametric additive model of a conditional mean function in which the number of variables and additive components may be larger than the sample size but the number of nonzero additive components is “small” relative to the sample size. The statistical problem is to determine which additive components are nonzero. The additive components are approximated by truncated series expansions with B-spline bases. With this approximation, the problem of component selection becomes that of selecting the groups of coefficients in the expansion. We apply the adaptive group Lasso to select nonzero components, using the group Lasso to obtain an initial estimator and reduce the dimension of the problem. We give conditions under which the group Lasso selects a model whose number of components is comparable with the underlying model, and the adaptive group Lasso selects the nonzero components correctly with probability approaching one as the sample size increases and achieves the optimal rate of convergence. The results of Monte Carlo experiments show that the adaptive group Lasso procedure works well with samples of moderate size. A data example is used to illustrate the application of the proposed method. PMID:21127739
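The group Lasso's defining operation in this setting is the blockwise soft-threshold: each B-spline coefficient group, corresponding to one additive component, is shrunk as a unit and zeroed out entirely when its norm falls below the penalty, which is exactly how whole components are deselected. A minimal sketch of that proximal operator (the adaptive variant simply reweights the penalty per group using an initial estimator):

```python
import math

def group_soft_threshold(group, lam):
    # proximal operator of the group-lasso penalty lam * ||g||_2:
    # shrink the whole coefficient group toward zero, zeroing it out
    # when its Euclidean norm is at most lam (removes the component)
    norm = math.sqrt(sum(v * v for v in group))
    if norm <= lam:
        return [0.0] * len(group)
    scale = 1.0 - lam / norm
    return [scale * v for v in group]
```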
Tan, Ziwen; Qin, Guoyou; Zhou, Haibo
2016-01-01
Outcome-dependent sampling (ODS) designs have been well recognized as a cost-effective way to enhance study efficiency in both the statistical literature and biomedical and epidemiologic studies. A partially linear additive model (PLAM) is widely applied in real problems because it allows for a flexible specification of the dependence of the response on some covariates in a linear fashion and on other covariates in a nonlinear, non-parametric fashion. Motivated by an epidemiological study investigating the effect of prenatal polychlorinated biphenyl exposure on children's intelligence quotient (IQ) at age 7 years, we propose a PLAM in this article to investigate a more flexible non-parametric inference on the relationships among the response and covariates under the ODS scheme. We propose the estimation method and establish the asymptotic properties of the proposed estimator. Simulation studies are conducted to show the improved efficiency of the proposed ODS estimator for the PLAM compared with that from a traditional simple random sampling design with the same sample size. Data from the above-mentioned study are analyzed to illustrate the proposed method. PMID:27006375
A Non-parametric Cutout Index for Robust Evaluation of Identified Proteins*
Serang, Oliver; Paulo, Joao; Steen, Hanno; Steen, Judith A.
2013-01-01
This paper proposes a novel, automated method for evaluating sets of proteins identified using mass spectrometry. The remaining peptide-spectrum match score distributions of protein sets are compared to an empirical absent peptide-spectrum match score distribution, and a Bayesian non-parametric method reminiscent of the Dirichlet process is presented to accurately perform this comparison. Thus, for a given protein set, the process computes the likelihood that the proteins identified are correctly identified. First, the method is used to evaluate protein sets chosen using different protein-level false discovery rate (FDR) thresholds, assigning each protein set a likelihood. The protein set assigned the highest likelihood is used to choose a non-arbitrary protein-level FDR threshold. Because the method can be used to evaluate any protein identification strategy (and is not limited to mere comparisons of different FDR thresholds), we subsequently use the method to compare and evaluate multiple simple methods for merging peptide evidence over replicate experiments. The general statistical approach can be applied to other types of data (e.g. RNA sequencing) and generalizes to multivariate problems. PMID:23292186
Multi-object segmentation using coupled nonparametric shape and relative pose priors
NASA Astrophysics Data System (ADS)
Uzunbas, Mustafa Gökhan; Soldea, Octavian; Çetin, Müjdat; Ünal, Gözde; Erçil, Aytül; Unay, Devrim; Ekin, Ahmet; Firat, Zeynep
2009-02-01
We present a new method for multi-object segmentation in a maximum a posteriori estimation framework. Our method is motivated by the observation that neighboring or coupling objects in images generate configurations and co-dependencies which could potentially aid in segmentation if properly exploited. Our approach employs coupled shape and inter-shape pose priors that are computed using training images in a nonparametric multi-variate kernel density estimation framework. The coupled shape prior is obtained by estimating the joint shape distribution of multiple objects and the inter-shape pose priors are modeled via standard moments. Based on such statistical models, we formulate an optimization problem for segmentation, which we solve by an algorithm based on active contours. Our technique provides significant improvements in the segmentation of weakly contrasted objects in a number of applications. In particular for medical image analysis, we use our method to extract brain Basal Ganglia structures, which are members of a complex multi-object system posing a challenging segmentation problem. We also apply our technique to the problem of handwritten character segmentation. Finally, we use our method to segment cars in urban scenes.
Uncertainty in determining extreme precipitation thresholds
NASA Astrophysics Data System (ADS)
Liu, Bingjun; Chen, Junfan; Chen, Xiaohong; Lian, Yanqing; Wu, Lili
2013-10-01
Extreme precipitation events are rare and occur mostly on a relatively small, local scale, which makes it difficult to set thresholds for extreme precipitation in a large basin. Based on long-term daily precipitation data from 62 observation stations in the Pearl River Basin, this study has assessed the applicability of the non-parametric, parametric, and detrended fluctuation analysis (DFA) methods in determining the extreme precipitation threshold (EPT), and the certainty of the EPTs from each method. Analyses from this study show that the non-parametric absolute-critical-value method is easy to use but unable to reflect differences in the spatial distribution of rainfall. The non-parametric percentile method can account for the spatial distribution of precipitation, but its threshold value is sensitive to the size of the rainfall data series and to the selection of a percentile, which makes it difficult to determine reasonable threshold values for a large basin. The parametric method can provide the most apt description of extreme precipitation by fitting extreme precipitation distributions with probability distribution functions; however, the selection of probability distribution functions, the goodness-of-fit tests, and the size of the rainfall data series can greatly affect the fitting accuracy. In contrast to the non-parametric and parametric methods, which are unable to attach certainty to the EPTs, the DFA method, although it involves complicated computational processes, has proven to be the most appropriate method, able to provide a unique set of EPTs for a large basin with uneven spatio-temporal precipitation distribution. 
The consistency of the spatial distribution of DFA-based thresholds with the annual average precipitation, the coefficient of variation (CV), and the coefficient of skewness (CS) of daily precipitation further shows that EPTs determined by the DFA method are more reasonable and applicable for the Pearl River Basin.
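The percentile method evaluated above amounts to taking a high quantile of wet-day totals at each station. A sketch, with the wet-day cutoff and quantile as illustrative choices:

```python
def percentile_threshold(daily_mm, q=0.95, wet_day_mm=0.1):
    # EPT as the q-th quantile of wet-day precipitation at one station,
    # with linear interpolation between closest ranks
    wet = sorted(v for v in daily_mm if v >= wet_day_mm)
    if not wet:
        return None
    pos = q * (len(wet) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(wet) - 1)
    return wet[lo] + (pos - lo) * (wet[hi] - wet[lo])
```

The sensitivity of the result to `q` and to the length of the record is precisely the drawback of this method that the study points out.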
Kappa statistic for clustered matched-pair data.
Yang, Zhao; Zhou, Ming
2014-07-10
The kappa statistic is widely used to assess the agreement between two procedures for independent matched-pair data. For matched-pair data collected in clusters, on the basis of the delta method and sampling techniques, we propose a nonparametric variance estimator for the kappa statistic that requires neither a within-cluster correlation structure nor distributional assumptions. The results of an extensive Monte Carlo simulation study demonstrate that the proposed kappa statistic provides consistent estimation, and the proposed variance estimator behaves reasonably well for at least a moderately large number of clusters (e.g., K ≥ 50). Compared with a variance estimator that ignores dependence within a cluster, the proposed variance estimator performs better in maintaining the nominal coverage probability when the intra-cluster correlation is fair (ρ ≥ 0.3), with more pronounced improvement as ρ increases further. To illustrate the practical application of the proposed estimator, we analyze two real data examples of clustered matched-pair data. Copyright © 2014 John Wiley & Sons, Ltd.
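The paper derives a delta-method variance estimator; as a hedged illustration only, the sketch below computes kappa for binary matched pairs and a cluster bootstrap variance, which also respects within-cluster correlation by resampling whole clusters (this is an alternative device, not the authors' estimator).

```python
import random

def kappa_2x2(pairs):
    # Cohen's kappa for binary matched-pair ratings [(a, b), ...]
    n = len(pairs)
    po = sum(a == b for a, b in pairs) / n          # observed agreement
    p1a = sum(a for a, _ in pairs) / n
    p1b = sum(b for _, b in pairs) / n
    pe = p1a * p1b + (1 - p1a) * (1 - p1b)          # chance agreement
    return (po - pe) / (1 - pe)

def cluster_bootstrap_var(clusters, reps=500, seed=0):
    # resample whole clusters with replacement, preserving the
    # within-cluster dependence structure of the data
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        sample = [rng.choice(clusters) for _ in clusters]
        pairs = [p for c in sample for p in c]
        stats.append(kappa_2x2(pairs))
    m = sum(stats) / len(stats)
    return sum((s - m) ** 2 for s in stats) / (len(stats) - 1)
```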
Nonparametric entropy estimation using kernel densities.
Lake, Douglas E
2009-01-01
The entropy of experimental data from the biological and medical sciences provides additional information over summary statistics. Calculating entropy involves estimating probability density functions, which can be effectively accomplished using kernel density methods. Kernel density estimation has been widely studied, and a univariate implementation is readily available in MATLAB. The traditional Shannon entropy is part of a larger family of statistics, the Renyi entropies, which are useful in applications that require a measure of the Gaussianity of data. Of particular note is the quadratic entropy, which is related to the Friedman-Tukey (FT) index, a widely used measure in the statistical community. One application where quadratic entropy is very useful is the detection of abnormal cardiac rhythms, such as atrial fibrillation (AF). Asymptotic and exact small-sample results for optimal bandwidth and kernel selection to estimate the FT index are presented and lead to improved methods for entropy estimation.
Ewertzon, M; Lützén, K; Svensson, E; Andershed, B
2010-06-01
The involvement of family members in psychiatric care is important for the recovery of persons with psychotic disorders and subsequently reduces the burden on the family. Earlier qualitative studies suggest that the participation of family members can be limited by how they experience the professionals' approach, which suggests a connection to the concept of alienation. Thus, the aim of this study was to investigate, in a national sample, family members' experiences of psychiatric health care professionals' approach. Data were collected with the Family Involvement and Alienation Questionnaire. The median level and quartiles were used to describe the distributions, and data were analysed with non-parametric statistical methods. Seventy family members of persons receiving psychiatric care participated in the study. The results indicate that a majority of the participants reported experiencing a negative approach from the professionals, indicating a lack of confirmation and cooperation. The results also indicate that a majority of the participants felt powerlessness and social isolation in the care being provided, indicating feelings of alienation. A significant but weak association was found between the family members' experiences of the professionals' approach and their feelings of alienation.
Shahid, S Adam; Schoenly, Kenneth; Haskell, Neal H; Hall, Robert D; Zhang, Wenjun
2003-07-01
In a test of an arthropod saturation hypothesis, we asked if the 30-yr history of carcass enrichment at the Anthropology Research Facility, Knoxville, TN, has altered carcass decay rates or community structure of sarcosaprophagous arthropods, compared with three local nonenriched sites. Over a 12-d period in 1998, using pitfall traps and sweep nets, we sampled a total of 81,000 invertebrates from freshly euthanized pigs (Sus scrofa L.) placed in these sites. From this number, we sorted 69,286 forensically important (sarcosaprophagous) arthropods. The community structure of these organisms, as measured by species and individuals accumulation curves, rarefaction, and nonparametric correlation, was comparable in all four sites in taxonomic similarity, colonization rates, aerial species richness, and ranked abundances of forensically important taxa on a per carcass basis. Measures of carcass decay rate, remaining carcass weight (%) and periodic weight loss, also were similar. In most cases, carcass surface temperatures and maggot mass temperatures were also statistically indistinguishable. Probability-based results and post hoc power analyses of these variables led us to conclude that the sarcosaprophagous arthropod community of the Anthropology Research Facility is representative of surrounding sites.
Abranches, Andrea D; Soares, Fernanda V M; Junior, Saint-Clair G; Moreira, Maria Elisabeth L
2014-01-01
To analyze changes in human milk macronutrients (fat, protein, and lactose) in natural (raw), frozen, and thawed human milk after simulated administration by gavage and continuous infusion. An experimental study was performed with 34 human milk samples. Infrared spectrophotometry with the MilkoScan Minor® analyzer (Foss, Denmark) was used to analyze the macronutrients in human milk during the study phases. The analyses were performed on natural (raw) samples and after freezing and fast thawing, following two steps: gavage and continuous infusion. The non-parametric Wilcoxon test for paired samples was used for the statistical analysis. Fat content was significantly reduced after administration by continuous infusion (p<0.001) for both raw and thawed samples. No changes in protein and lactose content were observed between the two forms of infusion. However, the thawing process significantly increased the measured levels of lactose and milk protein. The route of administration by continuous infusion showed the greatest influence on fat loss among all the processes required for human milk administration. Copyright © 2014 Sociedade Brasileira de Pediatria. Published by Elsevier Editora Ltda. All rights reserved.
Lima, A H R A; Soares, A H G; Cucato, G G; Leicht, A S; Franco, F G M; Wolosker, N; Ritti-Dias, R M
2016-07-01
The aim was to investigate the association between walking capacity and heart rate variability (HRV) in patients with symptomatic peripheral artery disease (PAD). This was a cross-sectional study. Ninety-five patients were recruited. Patients rested in a supine position for 20 minutes, with the final 10 minutes used to examine resting HRV. Time-domain, frequency-domain, and non-linear indices were evaluated. A maximal treadmill test (Gardner protocol) was performed to assess maximal walking distance (MWD) and claudication distance (CD) in groups of PAD patients based upon their walking ability (low, moderate, high). Differences between PAD patient groups were examined using non-parametric analyses, and Spearman rank correlations identified the relationships between MWD and CD and HRV parameters. Symptomatic PAD patients with high MWD exhibited significantly greater HRV than patients with low MWD. Furthermore, MWD was positively associated with time-domain and non-linear indices of HRV (all p < .05). However, no statistically significant correlations were observed between CD and HRV parameters, and no further differences between PAD groups. A greater walking capacity is associated with better HRV in symptomatic PAD patients. Copyright © 2016 European Society for Vascular Surgery. Published by Elsevier Ltd. All rights reserved.
Job satisfaction in relation to change to all-RN staffing.
Lundgren, Solveig M; Nordholm, Lena; Segesten, Kerstin
2005-07-01
A university hospital clinic changed from mixed staffing to all-registered-nurse staffing, to reduce staff numbers and to encourage a philosophy of patient-centred care. The aim was to maintain the same level of service and quality of care at a lower cost. The main purpose of the study was to examine job satisfaction in relation to the change from mixed to all-registered-nurse staffing and the reduction in the number of staff. Data were collected by an established questionnaire measuring job satisfaction. Non-parametric statistics were used to analyse the data. The questionnaire was distributed to 22 nurses on the ward on three occasions, covering a period of 3 years. The experience of having time to plan patient care changed during the investigation period, from 'sometimes' to 'most often' having time. Nurses with longer work experience gave more verbal information to patients and perceived less stress. Information about job performance was more important to newcomers on the ward and became less important with time. However, quite a few nurses had regrets over their choice of work and had considered non-caring work; nevertheless, the results show no significant changes in overall job satisfaction.
Soblosky, J S; Colgin, L L; Chorney-Lane, D; Davidson, J F; Carey, M E
1997-12-30
Hindlimb and forelimb deficits in rats caused by sensorimotor cortex lesions are frequently tested using the narrow flat beam (hindlimb), the narrow pegged beam (hindlimb and forelimb), or the grid-walking (forelimb) test. Although these are excellent tests, the narrow flat beam generates non-parametric data, so more powerful parametric statistical analyses cannot be used. All these tests can be difficult to score if the rat is moving rapidly. Foot misplacements, especially on the grid-walking test, are indicative of an ongoing deficit, but have not previously been reliably and accurately described and quantified. In this paper we present an easy-to-construct and easy-to-use horizontal ladder-beam with a camera system on rails which can be used to evaluate both hindlimb and forelimb deficits in a single test. By slow-motion videotape playback we were able to quantify and demonstrate foot misplacements which persist beyond the recovery period usually seen with more conventional measures (i.e., footslips and footfaults). This convenient system provides a rapid and reliable method for recording and evaluating rat performance on any type of beam and may be useful for measuring sensorimotor recovery following brain injury.
NASA Technical Reports Server (NTRS)
Lau, William K. M. (Technical Monitor); Bell, Thomas L.; Steiner, Matthias; Zhang, Yu; Wood, Eric F.
2002-01-01
The uncertainty of rainfall estimated from averages of discrete samples collected by a satellite is assessed using a multi-year radar data set covering a large portion of the United States. The sampling-related uncertainty of rainfall estimates is evaluated for all combinations of 100 km, 200 km, and 500 km space domains, 1 day, 5 day, and 30 day rainfall accumulations, and regular sampling time intervals of 1 h, 3 h, 6 h, 8 h, and 12 h. These extensive analyses are combined to characterize the sampling uncertainty as a function of space and time domain, sampling frequency, and rainfall characteristics by means of a simple scaling law. Moreover, it is shown that both parametric and non-parametric statistical techniques of estimating the sampling uncertainty produce comparable results. Sampling uncertainty estimates, however, do depend on the choice of technique for obtaining them. They can also vary considerably from case to case, reflecting the great variability of natural rainfall, and should therefore be expressed in probabilistic terms. Rainfall calibration errors are shown to affect comparison of results obtained by studies based on data from different climate regions and/or observation platforms.
Constructing Taxonomies to Identify Distinctive Forms of Primary Healthcare Organizations
Borgès Da Silva, Roxane; Pineault, Raynald; Hamel, Marjolaine; Levesque, Jean-Frédéric; Roberge, Danièle; Lamarche, Paul
2013-01-01
Background. Primary healthcare (PHC) renewal gives rise to important challenges for policy makers, managers, and researchers in most countries. Evaluating new emerging forms of organizations is therefore of prime importance in assessing the impact of these policies. This paper presents a set of methods related to the configurational approach and an organizational taxonomy derived from our analysis. Methods. In 2005, we carried out a study on PHC in two health and social services regions of Quebec that included urban, suburban, and rural areas. An organizational survey was conducted in 473 PHC practices. We used multidimensional nonparametric statistical methods, namely, multiple correspondence and principal component analyses, and an ascending hierarchical classification method to construct a taxonomy of organizations. Results. PHC organizations were classified into five distinct models: four professional and one community. Study findings indicate that the professional integrated coordination and the community model have great potential for organizational development since they are closest to the ideal type promoted by current reforms. Conclusion. Results showed that the configurational approach is useful to assess complex phenomena such as the organization of PHC. The analysis highlights the most promising organizational models. Our study enhances our understanding of organizational change in health services organizations. PMID:24959575
Psychiatric rehabilitation in community-based day centres: motivation and satisfaction.
Eklund, Mona; Tjörnstrand, Carina
2013-11-01
This study investigated attendees' motivation and motives for participation in day centres and their satisfaction with the rehabilitation, while also addressing the influence of day centre orientation (work- or meeting-place orientation), gender and age. Ninety-three Swedish day centre attendees participated in a cross-sectional study and completed questionnaires about motivation, motives, and satisfaction with the rehabilitation. Data were analysed with non-parametric statistics. The participants were highly motivated for going to the day centre and set clear goals for their rehabilitation. Female gender, but not age, was associated with stronger motivation. The strongest motives for going to the day centre were getting structure to the day and socializing. Attendees at work-oriented day centres more often expressed that they went there to get structure to the day and gain social status. Satisfaction with the rehabilitation was high, and the most common wishes for further opportunities concerned earning money and learning new things. The rehabilitation largely seemed to meet the attendees' needs, but the findings indicated that further developments were desired, such as participation in work on the open market and more work-like occupations in the day centre, accompanied by some kind of remuneration.
Haveraaen, L; Brouwers, E P M; Sveen, U; Skarpaas, L S; Sagvaag, H; Aas, R W
2017-12-01
Background and objective Despite large activity worldwide in building and implementing new return-to-work (RTW) services, few studies have focused on how such implementation processes develop. The aim of this study was to examine the development in patient and service characteristics the first six years of implementing a RTW service for persons with acquired brain injury (ABI). Methods The study was designed as a cohort study (n=189). Data were collected by questionnaires, filled out by the service providers. The material was divided into, and analyzed with, two implementation phases. Non-parametric statistical methods and hierarchical regression analyses were applied to the material. Results The number of patients increased significantly, and the patient group became more homogeneous. Both the duration of the service, and the number of consultations and group session days were significantly reduced. Conclusion The patient group became more homogenous, but also significantly larger during the first six years of building the RTW service. At the same time, the duration of the service decreased. This study therefore questions whether there is a lack of consensus on the intensity of work rehabilitation for this group.
Empowerment and occupational engagement among people with psychiatric disabilities.
Hultqvist, Jenny; Eklund, Mona; Leufstadius, Christel
2015-01-01
Empowerment is essential in the rehabilitation process for people with psychiatric disabilities and knowledge about factors that may play a key role within this process would be valuable for further development of the day centre services. The present study investigates day centre attendees' perceptions of empowerment. The aim was to investigate which factors show the strongest relationships to empowerment when considering occupational engagement, client satisfaction with day centres, and health-related and socio-demographic factors as correlates. 123 Swedish day centre attendees participated in a cross-sectional study by completing questionnaires regarding empowerment and the targeted correlates. Data were analysed with non-parametric statistics. Empowerment was shown to be significantly correlated with occupational engagement and client satisfaction and also with self-rated health and symptoms rated by a research assistant. The strongest indicator for belonging to the group with the highest ratings on empowerment was self-rated health, followed by occupational engagement and symptom severity. Occupational engagement added to the beneficial influence of self-rated health on empowerment. Enabling occupational engagement in meaningful activities and providing occupations that can generate client satisfaction is an important focus for day centres in order to assist the attendees' rehabilitation process so that it promotes empowerment.
Bergh, Anne-Louise; Karlsson, Jan; Persson, Eva; Friberg, Febe
2012-09-01
To describe nurses' perceptions of conditions for patient education, focusing on organisational, environmental and professional cooperation aspects, and to determine any differences between primary, municipal and hospital care. Although patient education is an important part of daily nursing practice, the conditions for this work are unclear and require clarification. A stratified random sample of 701 (83%) nurses working in primary, municipal and hospital care completed a 60-item questionnaire. The study is part of a larger project. The study items relating to organisation, environment and professional cooperation were analysed using descriptive statistics, non-parametric tests and content analysis. Conditions for patient education differ. Nurses in primary care had better conditions and more managerial support, for example in the allocation of undisturbed time. Conditions related to organisation, environment and cooperation need to be developed further. In this process, managerial support is important, and nurses must ask for better conditions in order to carry through patient education. Managerial support for the development of visible patient education routines (e.g. allocation of time, place and guidelines) is required. One recommendation is to designate a person to oversee educational work. © 2012 Blackwell Publishing Ltd.
Using a Planetarium Software Program to Promote Conceptual Change with Young Children
NASA Astrophysics Data System (ADS)
Hobson, Sally M.; Trundle, Kathy Cabe; Saçkes, Mesut
2010-04-01
This study explored young children’s understandings of targeted lunar concepts, including when the moon can be observed, observable lunar phase shapes, predictable lunar patterns, and the cause of lunar phases. Twenty-one children (ages 7-9 years) from a multi-aged, self-contained classroom participated in this study. The instructional intervention included lunar data gathering, recording, and sharing, which integrated Starry Night planetarium software and an inquiry-based instruction on moon phases. Data were gathered using semi-structured interviews, student drawings, and a card sorting activity before and after instruction. Students’ lunar calendars and written responses, participant observer field notes, and videotaped class sessions also provided data throughout the study. Data were analyzed using constant comparative analysis. Nonparametric statistical analyses were also performed to support the qualitative findings. Results reflected a positive change in children’s conceptual understanding of all targeted concepts including the cause of moon phases, which is remarkable considering the complexity and abstractness of this spatial task. Results provided evidence that computer simulations may reduce the burden on children’s cognitive capacity and facilitate their learning of complex scientific concepts that would not be possible to learn on their own.
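The nonparametric analyses that supported the qualitative findings are not specified in detail here; one common choice for paired pre/post data of this kind is an exact sign test. A minimal sketch with invented scores, not the study's data:

```python
# Exact two-sided sign test for paired pre/post scores: a simple
# nonparametric check of whether understanding improved after instruction.
# The scores below are invented for illustration.
from math import comb

pre  = [1, 2, 1, 0, 2, 1, 1, 3, 2, 1, 0, 2]
post = [3, 4, 2, 2, 4, 3, 1, 4, 3, 3, 2, 4]

diffs = [b - a for a, b in zip(pre, post) if b != a]  # drop ties
n_pos = sum(d > 0 for d in diffs)
n = len(diffs)

# Two-sided exact binomial p-value under H0: P(improvement) = 0.5
k = max(n_pos, n - n_pos)
p_value = min(1.0, sum(comb(n, i) for i in range(k, n + 1)) / 2 ** (n - 1))

print(f"{n_pos}/{n} children improved, p = {p_value:.4f}")
```

The sign test needs only the direction of each change, which suits ordinal interview-based scores better than a paired t test would.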
NASA Astrophysics Data System (ADS)
Protasov, Konstantin T.; Pushkareva, Tatyana Y.; Artamonov, Evgeny S.
2002-02-01
The problem of cloud field recognition from NOAA satellite data is urgent not only for meteorological problems but also for resource-ecological monitoring of the Earth's underlying surface, associated with the detection of thunderstorm clouds, estimation of the liquid water content of clouds and the moisture of the soil, the degree of fire hazard, etc. To solve these problems, we used AVHRR/NOAA video data that regularly displayed the situation in the territory. The complexity and extremely nonstationary character of the problems to be solved call for the use of information from all spectral channels, the mathematical apparatus of testing statistical hypotheses, and methods of pattern recognition and identification of the informative parameters. For a class of detection and pattern recognition problems, the average risk functional is a natural criterion for the quality and the information content of the synthesized decision rules. In this case, to solve efficiently the problem of identifying cloud field types, the informative parameters must be determined by minimization of this functional. Since the conditional probability density functions, representing mathematical models of stochastic patterns, are unknown, the problem of nonparametric reconstruction of distributions from the learning samples arises. To this end, we used nonparametric estimates of distributions with the modified Epanechnikov kernel. The unknown parameters of these distributions were determined by minimization of the risk functional, which for the learning sample was substituted by the empirical risk. After the conditional probability density functions had been reconstructed for the examined hypotheses, a cloudiness type was identified using the Bayes decision rule.
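The core of the decision rule described above, nonparametric class-conditional densities plugged into a Bayes classifier, can be sketched in one dimension. This is a simplified illustration with a plain Epanechnikov kernel, synthetic samples, and equal priors; the study itself used multichannel AVHRR data and a modified kernel:

```python
# Epanechnikov kernel density estimates per class, combined with a Bayes
# decision rule under equal priors. Samples and class names are invented.

def epanechnikov_kde(x, sample, h):
    """Kernel density estimate at x with bandwidth h."""
    total = 0.0
    for xi in sample:
        u = (x - xi) / h
        if abs(u) < 1.0:
            total += 0.75 * (1.0 - u * u)
    return total / (len(sample) * h)

def bayes_classify(x, classes, h=1.0):
    """Return the class label with the highest estimated density at x."""
    return max(classes, key=lambda label: epanechnikov_kde(x, classes[label], h))

# Synthetic learning samples for two hypothetical cloud types
classes = {
    "stratus":      [2.0, 2.4, 2.1, 1.8, 2.6, 2.2],
    "cumulonimbus": [5.1, 4.8, 5.5, 5.0, 4.6, 5.3],
}
print(bayes_classify(2.3, classes))   # near the stratus sample
print(bayes_classify(5.0, classes))   # near the cumulonimbus sample
```

In the paper the kernel bandwidths play the role of the "unknown parameters" chosen by minimizing the empirical risk; here the bandwidth is simply fixed.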
Location tests for biomarker studies: a comparison using simulations for the two-sample case.
Scheinhardt, M O; Ziegler, A
2013-01-01
Gene, protein, or metabolite expression levels are often non-normally distributed, heavy tailed and contain outliers. Standard statistical approaches may fail as location tests in this situation. In three Monte-Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O'Gorman, Can J Stat 1997; 25: 269-279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267-293; Szymczak et al., Stat Med 2013; 32: 524-537] for a wide range of distributions. We simulated two-sample scenarios using the g-and-k-distribution family to systematically vary tail length and skewness with identical and varying variability between groups. All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. The standard U-test was a powerful and robust location test for most of the simulated scenarios except for very heavy tailed or heavily skewed data, and it is thus to be recommended except for these cases. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy tailed distributions.
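A scaled-down version of this kind of Monte-Carlo type-I-error check can be run in a few lines. In this sketch the g-and-k family is replaced by a lognormal for simplicity, and the U-test uses its normal approximation; sample sizes and trial counts are arbitrary choices:

```python
# Empirical type I error of a Mann-Whitney U test (normal approximation)
# when both samples come from the SAME skewed (lognormal) distribution.
import math
import random

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value via the normal approximation."""
    nx, ny = len(x), len(y)
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    rank_sum_x = sum(i + 1 for i, (v, g) in enumerate(pooled) if g == "x")
    u = rank_sum_x - nx * (nx + 1) / 2
    mu = nx * ny / 2
    sigma = math.sqrt(nx * ny * (nx + ny + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

random.seed(7)
alpha = 0.05
trials = 500
rejections = 0
for _ in range(trials):
    a = [random.lognormvariate(0, 1) for _ in range(25)]
    b = [random.lognormvariate(0, 1) for _ in range(25)]
    if mann_whitney_p(a, b) < alpha:
        rejections += 1
rate = rejections / trials
print(f"empirical type I error: {rate:.3f} (nominal {alpha})")
```

Because the rank test ignores the shape of the marginal distribution, the empirical rejection rate stays near the nominal level even under strong skew, which is the behavior the study reports for the U-test.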
Exact nonparametric confidence bands for the survivor function.
Matthews, David
2013-10-12
A method to produce exact simultaneous confidence bands for the empirical cumulative distribution function that was first described by Owen, and subsequently corrected by Jager and Wellner, is the starting point for deriving exact nonparametric confidence bands for the survivor function of any positive random variable. We invert a nonparametric likelihood test of uniformity, constructed from the Kaplan-Meier estimator of the survivor function, to obtain simultaneous lower and upper bands for the function of interest with specified global confidence level. The method involves calculating a null distribution and associated critical value for each observed sample configuration. However, Noe recursions and the Van Wijngaarden-Decker-Brent root-finding algorithm provide the necessary tools for efficient computation of these exact bounds. Various aspects of the effect of right censoring on these exact bands are investigated, using as illustrations two observational studies of survival experience among non-Hodgkin's lymphoma patients and a much larger group of subjects with advanced lung cancer enrolled in trials within the North Central Cancer Treatment Group. Monte Carlo simulations confirm the merits of the proposed method of deriving simultaneous interval estimates of the survivor function across the entire range of the observed sample. This research was supported by the Natural Sciences and Engineering Research Council (NSERC) of Canada. It was begun while the author was visiting the Department of Statistics, University of Auckland, and completed during a subsequent sojourn at the Medical Research Council Biostatistics Unit in Cambridge. The support of both institutions, in addition to that of NSERC and the University of Waterloo, is greatly appreciated.
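The Kaplan-Meier estimator around which these exact bands are built can be written out directly. A minimal sketch with invented survival times and censoring flags:

```python
# Kaplan-Meier product-limit estimator of the survivor function.
# Times and censoring indicators below are invented for illustration.

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each observed event time.
    events[i] is 1 for an observed death, 0 for right censoring."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        removed = sum(1 for tt, e in data if tt == t)
        if deaths:
            surv *= 1.0 - deaths / at_risk   # product-limit step
            curve.append((t, surv))
        at_risk -= removed
        i += removed
    return curve

times  = [3, 5, 5, 8, 10, 12]
events = [1, 1, 0, 1,  0,  1]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"S({t}) = {s:.3f}")
```

The confidence-band method in the abstract inverts a likelihood test built on exactly this step-function estimate; censored observations reduce the risk set without forcing a step down.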
A bias-corrected estimator in multiple imputation for missing data.
Tomita, Hiroaki; Fujisawa, Hironori; Henmi, Masayuki
2018-05-29
Multiple imputation (MI) is one of the most popular methods to deal with missing data, and its use has been rapidly increasing in medical studies. Although MI is rather appealing in practice since it is possible to use ordinary statistical methods for a complete data set once the missing values are fully imputed, the method of imputation is still problematic. If the missing values are imputed from some parametric model, the validity of imputation is not necessarily ensured, and the final estimate for a parameter of interest can be biased unless the parametric model is correctly specified. Nonparametric methods have also been proposed for MI, but it is not straightforward to produce imputation values from nonparametrically estimated distributions. In this paper, we propose a new method for MI to obtain a consistent (or asymptotically unbiased) final estimate even if the imputation model is misspecified. The key idea is to use an imputation model from which the imputation values are easily produced and to make a proper correction in the likelihood function after the imputation by using the density ratio between the imputation model and the true conditional density function for the missing variable as a weight. Although the conditional density must be nonparametrically estimated, it is not used for the imputation. The performance of our method is evaluated by both theory and simulation studies. A real data analysis is also conducted to illustrate our method by using the Duke Cardiac Catheterization Coronary Artery Disease Diagnostic Dataset. Copyright © 2018 John Wiley & Sons, Ltd.
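For orientation, here is a deliberately simplified sketch of the MI workflow the paper starts from: missing values filled by hot-deck draws from the observed values, an estimate computed on each completed data set, and the estimates pooled in Rubin's-rules style. The paper's density-ratio correction is not implemented here, and the data are invented:

```python
# Simplified multiple imputation: hot-deck imputation, M completed data
# sets, and pooling of the per-imputation estimates. Not the paper's
# bias-corrected estimator; data are invented.
import random
import statistics

random.seed(1)
data = [4.2, None, 3.8, 5.0, None, 4.6, 4.1, None, 3.9, 4.4]
observed = [v for v in data if v is not None]

M = 20
estimates = []
for _ in range(M):
    completed = [v if v is not None else random.choice(observed) for v in data]
    estimates.append(statistics.fmean(completed))

pooled = statistics.fmean(estimates)      # pooled point estimate
between = statistics.variance(estimates)  # between-imputation variance
print(f"pooled mean = {pooled:.3f}, between-imputation var = {between:.4f}")
```

The paper's point is precisely that when the imputation draws come from a misspecified model, this naive pooled estimate can be biased, which its likelihood-weighting correction is designed to remove.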
Nonparametric Analyses of Log-Periodic Precursors to Financial Crashes
NASA Astrophysics Data System (ADS)
Zhou, Wei-Xing; Sornette, Didier
We apply two nonparametric methods to further test the hypothesis that log-periodicity characterizes the detrended price trajectory of large financial indices prior to financial crashes or strong corrections. The term "parametric" refers here to the use of the log-periodic power law formula to fit the data; in contrast, "nonparametric" refers to the use of general tools such as Fourier transform, and in the present case the Hilbert transform and the so-called (H, q)-analysis. The analysis using the (H, q)-derivative is applied to seven time series ending with the October 1987 crash, the October 1997 correction and the April 2000 crash of the Dow Jones Industrial Average (DJIA), the Standard & Poor 500 and Nasdaq indices. The Hilbert transform is applied to two detrended price time series in terms of the ln(tc-t) variable, where tc is the time of the crash. Taking all results together, we find strong evidence for a universal fundamental log-frequency f=1.02±0.05 corresponding to the scaling ratio λ=2.67±0.12. These values are in very good agreement with those obtained in earlier works with different parametric techniques. This note is extracted from a long unpublished report with 58 figures available at , which extensively describes the evidence we have accumulated on these seven time series, in particular by presenting all relevant details so that the reader can judge for himself or herself the validity and robustness of the results.
It's all relative: ranking the diversity of aquatic bacterial communities.
Shaw, Allison K; Halpern, Aaron L; Beeson, Karen; Tran, Bao; Venter, J Craig; Martiny, Jennifer B H
2008-09-01
The study of microbial diversity patterns is hampered by the enormous diversity of microbial communities and the lack of resources to sample them exhaustively. For many questions about richness and evenness, however, one only needs to know the relative order of diversity among samples rather than total diversity. We used 16S libraries from the Global Ocean Survey to investigate the ability of 10 diversity statistics (including rarefaction, non-parametric, parametric, curve extrapolation and diversity indices) to assess the relative diversity of six aquatic bacterial communities. Overall, we found that the statistics yielded remarkably similar rankings of the samples for a given sequence similarity cut-off. This correspondence, despite the different underlying assumptions of the statistics, suggests that diversity statistics are a useful tool for ranking samples of microbial diversity. In addition, sequence similarity cut-off influenced the diversity ranking of the samples, demonstrating that diversity statistics can also be used to detect differences in phylogenetic structure among microbial communities. Finally, a subsampling analysis suggests that further sequencing from these particular clone libraries would not have substantially changed the richness rankings of the samples.
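Two of the diversity indices mentioned (Shannon and inverse Simpson) are easy to compute directly, and a toy example shows how they are used to rank samples rather than to measure total diversity. The taxon counts below are invented, not Global Ocean Survey data:

```python
# Shannon entropy and inverse Simpson diversity for toy count tables,
# used to produce a relative ranking of samples. Counts are invented.
import math

def shannon(counts):
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c)

def inv_simpson(counts):
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)

samples = {
    "coastal": [50, 30, 10, 5, 3, 2],
    "open":    [20, 20, 15, 15, 15, 15],
    "estuary": [90, 5, 3, 2],
}

rank_shannon = sorted(samples, key=lambda s: shannon(samples[s]), reverse=True)
rank_simpson = sorted(samples, key=lambda s: inv_simpson(samples[s]), reverse=True)
print("Shannon ranking:", rank_shannon)
print("inverse-Simpson ranking:", rank_simpson)
```

Here the two indices agree on the ordering, the kind of cross-statistic concordance the study reports across its ten diversity statistics.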
Anger and depression levels of mothers with premature infants in the neonatal intensive care unit.
Kardaşözdemir, Funda; Akgün Şahin, Zümrüt
2016-02-04
The aim of this study was to examine the anger and depression levels of mothers who had a premature infant in the NICU, and the factors affecting them. This descriptive study was performed in the level I and II units of the NICU at three state hospitals in Turkey. The data were collected with a demographic questionnaire, the "Beck Depression Inventory" and the "Anger Expression Scale". Descriptive statistics, parametric and nonparametric statistical tests and Pearson correlation were used in the data analysis. Mothers whose infants were under care in the NICU had moderate depression. Mothers' educational level, income level and infant gender were statistically significant factors (p < 0.05). A statistically significant positive relationship was found between depression and trait anger scores, and a statistically significant negative relationship between depression and anger-control scores (p < 0.05). Based on these results, it is recommended that nurses assess mothers at risk of depression and anger in the NICU and further develop their counselling roles.
Watanabe, Hiroyuki; Miyazaki, Hiroyasu
2006-01-01
Over- and/or under-correction of QT intervals for changes in heart rate may lead to misleading conclusions and/or masking the potential of a drug to prolong the QT interval. This study examines a nonparametric regression model (Loess Smoother) to adjust the QT interval for differences in heart rate, with an improved fitness over a wide range of heart rates. 240 sets of (QT, RR) observations collected from each of 8 conscious and non-treated beagle dogs were used as the materials for investigation. The fitness of the nonparametric regression model to the QT-RR relationship was compared with four models (individual linear regression, common linear regression, and Bazett's and Fridericia's correction models) with reference to Akaike's Information Criterion (AIC). Residuals were visually assessed. The bias-corrected AIC of the nonparametric regression model was the best of the models examined in this study. Although the parametric models did not fit, the nonparametric regression model improved the fitting at both fast and slow heart rates. The nonparametric regression model is the more flexible method compared with the parametric method. The mathematical fit for linear regression models was unsatisfactory at both fast and slow heart rates, while the nonparametric regression model showed significant improvement at all heart rates in beagle dogs.
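Loess itself is fairly involved; a minimal kernel-smoother stand-in (Nadaraya-Watson with a Gaussian kernel, invented QT/RR pairs) shows the nonparametric idea of letting the data determine the QT-RR shape instead of imposing a fixed correction formula such as Bazett's:

```python
# Nadaraya-Watson kernel regression of QT on RR: a simpler cousin of the
# loess smoother discussed above. The (RR s, QT ms) pairs are invented,
# with the flattening trend typical at slow heart rates.
import math

def nw_smooth(x0, xs, ys, h):
    """Kernel-regression estimate of E[y | x = x0] with bandwidth h."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

rr = [0.30, 0.35, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 1.00, 1.10]
qt = [160, 170, 180, 195, 207, 216, 223, 228, 231, 233]

ests = [nw_smooth(r, rr, qt, h=0.08) for r in (0.35, 0.70, 1.05)]
for r, est in zip((0.35, 0.70, 1.05), ests):
    print(f"RR = {r:.2f} s -> smoothed QT ~ {est:.1f} ms")
```

Because the smoother adapts locally, it can track the curve at both fast (short RR) and slow (long RR) heart rates, which is where the abstract reports the linear and fixed-power corrections breaking down.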
Autism Spectrum and Obsessive–Compulsive Disorders: OC Behaviors, Phenotypes and Genetics
Jacob, Suma; Landeros-Weisenberger, Angeli; Leckman, James F.
2014-01-01
Autism spectrum disorders (ASDs) are a phenotypically and etiologically heterogeneous set of disorders that include obsessive–compulsive behaviors (OCB) that partially overlap with symptoms associated with obsessive–compulsive disorder (OCD). The OCB seen in ASD vary depending on the individual’s mental and chronological age as well as the etiology of their ASD. Although progress has been made in the measurement of the OCB associated with ASD, more work is needed including the potential identification of heritable endophenotypes. Likewise, important progress toward the understanding of genetic influences in ASD has been made by greater refinement of relevant phenotypes using a broad range of study designs, including twin and family-genetic studies, parametric and nonparametric linkage analyses, as well as candidate gene studies and the study of rare genetic variants. These genetic analyses could lead to the refinement of the OCB phenotypes as larger samples are studied and specific associations are replicated. Like ASD, OCB are likely to prove to be multidimensional and polygenic. Some of the vulnerability genes may prove to be generalist genes influencing the phenotypic expression of both ASD and OCD while others will be specific to subcomponents of the ASD phenotype. In order to discover molecular and genetic mechanisms, collaborative approaches need to generate shared samples, resources, novel genomic technologies, as well as more refined phenotypes and innovative statistical approaches. There is a growing need to identify the range of molecular pathways involved in OCB related to ASD in order to develop novel treatment interventions. PMID:20029829
Adaptation of abbreviated mathematics anxiety rating scale for engineering students
NASA Astrophysics Data System (ADS)
Nordin, Sayed Kushairi Sayed; Samat, Khairul Fadzli; Sultan, Al Amin Mohamed; Halim, Bushra Abdul; Ismail, Siti Fatimah; Mafazi, Nurul Wirdah
2015-05-01
Mathematics is an essential and fundamental tool used by engineers to analyse and solve problems in their field. Because of this, most engineering education programs involve a concentration of study in mathematics: engineering students take courses such as numerical methods, differential equations and calculus in the first two years and continue to do so until the completion of the sequence. However, many students struggle with courses that require mathematical ability. Hence, this study presents the factors that caused mathematics anxiety among engineering students, using the Abbreviated Mathematics Anxiety Rating Scale (AMARS) administered to 95 students of Universiti Teknikal Malaysia Melaka (UTeM). From the 25 items in AMARS, principal component analysis (PCA) suggested four mathematics anxiety factors, namely experiences of learning mathematics, cognitive skills, mathematics evaluation anxiety and students' perception of mathematics. Minitab 16 software was used for the nonparametric statistical analyses. The Kruskal-Wallis test indicated a significant difference in the experience of learning mathematics and mathematics evaluation anxiety among races. The chi-square test of independence revealed that the experience of learning mathematics, cognitive skills and mathematics evaluation anxiety depend on students' SPM Additional Mathematics results. Based on this study, it is recommended that anxiety problems among engineering students be addressed at an early stage of university study. Lecturers should play their part by ensuring a positive classroom environment which encourages students to study mathematics without fear.
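The Kruskal-Wallis test used in this study has a short from-scratch form. A sketch with invented anxiety scores for three hypothetical groups, omitting the tie correction:

```python
# Kruskal-Wallis H statistic for k independent samples (no tie
# correction; assumes all values distinct). Scores are invented.

def kruskal_wallis_h(groups):
    """H statistic; compare against chi-square with k-1 df."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    ranks = {}
    for rank, (v, gi) in enumerate(pooled, start=1):
        ranks.setdefault(gi, []).append(rank)
    n = len(pooled)
    s = sum(sum(r) ** 2 / len(r) for r in ranks.values())
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)

groups = [
    [12, 15, 11, 19, 14],   # e.g. one group's anxiety scores
    [22, 25, 17, 26, 21],
    [31, 28, 33, 27, 35],
]
h = kruskal_wallis_h(groups)
print(f"H = {h:.2f}")
```

Because H is built from ranks, the test makes no normality assumption about the underlying anxiety scores, which is why it suits ordinal questionnaire data.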
Statistics 101 for Radiologists.
Anvari, Arash; Halpern, Elkan F; Samir, Anthony E
2015-10-01
Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
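The diagnostic-test measures reviewed here follow directly from a 2x2 table. A worked example with an invented confusion matrix:

```python
# Sensitivity, specificity, accuracy and likelihood ratios from a 2x2
# diagnostic table. The cell counts are invented for illustration.
tp, fp = 90, 30    # test positive: diseased / healthy
fn, tn = 10, 170   # test negative: diseased / healthy

sensitivity = tp / (tp + fn)              # P(test+ | disease)
specificity = tn / (tn + fp)              # P(test- | no disease)
accuracy = (tp + tn) / (tp + fp + fn + tn)
lr_pos = sensitivity / (1 - specificity)  # positive likelihood ratio
lr_neg = (1 - sensitivity) / specificity  # negative likelihood ratio

print(f"sens={sensitivity:.2f} spec={specificity:.2f} "
      f"acc={accuracy:.3f} LR+={lr_pos:.1f} LR-={lr_neg:.3f}")
```

Likelihood ratios are the bridge to Bayesian interpretation: multiplying pre-test odds by LR+ (or LR-) gives post-test odds, which is how these quantities are typically applied at the bedside.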
Stroup, Caleb N.; Welhan, John A.; Davis, Linda C.
2008-01-01
The statistical stationarity of distributions of sedimentary interbed thicknesses within the southwestern part of the Idaho National Laboratory (INL) was evaluated within the stratigraphic framework of Quaternary sediments and basalts at the INL site, eastern Snake River Plain, Idaho. The thicknesses of 122 sedimentary interbeds observed in 11 coreholes were documented from lithologic logs and independently inferred from natural-gamma logs. Lithologic information was grouped into composite time-stratigraphic units based on correlations with existing composite-unit stratigraphy near these holes. The assignment of lithologic units to an existing chronostratigraphy on the basis of nearby composite stratigraphic units may introduce error where correlations with nearby holes are ambiguous or the distance between holes is great, but we consider this the best technique for grouping stratigraphic information in this geologic environment at this time. Nonparametric tests of similarity were used to evaluate temporal and spatial stationarity in the distributions of sediment thickness. The following statistical tests were applied to the data: (1) the Kolmogorov-Smirnov (K-S) two-sample test to compare distribution shape, (2) the Mann-Whitney (M-W) test for similarity of two medians, (3) the Kruskal-Wallis (K-W) test for similarity of multiple medians, and (4) Levene's (L) test for the similarity of two variances. Results of these analyses corroborate previous work that concluded the thickness distributions of Quaternary sedimentary interbeds are locally stationary in space and time. The data set used in this study was relatively small, so the results presented should be considered preliminary, pending incorporation of data from more coreholes. Statistical tests also demonstrated that natural-gamma logs consistently fail to detect interbeds less than about 2-3 ft thick, although these interbeds are observable in lithologic logs. 
This should be taken into consideration when modeling aquifer lithology or hydraulic properties based on lithology.
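The two-sample Kolmogorov-Smirnov statistic used in these comparisons reduces to the maximum gap between two empirical CDFs. A sketch with invented thickness values, not the INL corehole data:

```python
# Two-sample Kolmogorov-Smirnov statistic: the maximum vertical distance
# between two empirical CDFs. Thickness values (ft) are invented.

def ks_two_sample(x, y):
    """Return D = max |F_x(t) - F_y(t)| over all observed values t."""
    points = sorted(set(x) | set(y))
    d = 0.0
    for p in points:
        fx = sum(v <= p for v in x) / len(x)
        fy = sum(v <= p for v in y) / len(y)
        d = max(d, abs(fx - fy))
    return d

unit_a = [1.5, 2.0, 2.5, 3.0, 4.0, 5.5, 6.0, 8.0]
unit_b = [1.0, 2.0, 3.0, 3.5, 4.5, 5.0, 7.0, 9.0]
print(f"K-S D = {ks_two_sample(unit_a, unit_b):.3f}")
```

A small D, as here, is consistent with two units sharing the same thickness distribution, which is the stationarity question the study's K-S comparisons address.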
NASA Astrophysics Data System (ADS)
1981-04-01
The main topics discussed were related to nonparametric statistics, plane and antiplane states in finite elasticity, free-boundary variational inequalities, the numerical solution of free boundary-value problems, discrete and combinatorial optimization, mathematical modelling in fluid mechanics, a survey and comparison regarding thermodynamic theories, invariant and almost invariant subspaces in linear systems with applications to disturbance isolation, nonlinear acoustics, and methods of function theory in the case of partial differential equations, giving particular attention to elliptic problems in the plane.
Bayesian Nonparametric Statistical Inference for Shock Models and Wear Processes.
1979-12-01
Research supported by the Office of Naval Research under Contract N00014-75-C-0781 and the National Science Foundation under Grant MCS78-01422 with the University of California. Background on shock models is given by Barlow and Proschan (1975), among others; the analogy of the shock model in risk and actuarial analysis has been given by Bühlmann (1970, Chapter 2).
Bailey-Wilson, Joan E; Childs, Erica J; Cropp, Cheryl D; Schaid, Daniel J; Xu, Jianfeng; Camp, Nicola J; Cannon-Albright, Lisa A; Farnham, James M; George, Asha; Powell, Isaac; Carpten, John D; Giles, Graham G; Hopper, John L; Severi, Gianluca; English, Dallas R; Foulkes, William D; Mæhle, Lovise; Møller, Pål; Eeles, Rosalind; Easton, Douglas; Guy, Michelle; Edwards, Steve; Badzioch, Michael D; Whittemore, Alice S; Oakley-Girvan, Ingrid; Hsieh, Chih-Lin; Dimitrov, Latchezar; Stanford, Janet L; Karyadi, Danielle M; Deutsch, Kerry; McIntosh, Laura; Ostrander, Elaine A; Wiley, Kathleen E; Isaacs, Sarah D; Walsh, Patrick C; Thibodeau, Stephen N; McDonnell, Shannon K; Hebbring, Scott; Lange, Ethan M; Cooney, Kathleen A; Tammela, Teuvo L J; Schleutker, Johanna; Maier, Christiane; Bochum, Sylvia; Hoegel, Josef; Grönberg, Henrik; Wiklund, Fredrik; Emanuelsson, Monica; Cancel-Tassin, Geraldine; Valeri, Antoine; Cussenot, Olivier; Isaacs, William B
2012-06-19
Genetic variants are likely to contribute to a portion of prostate cancer risk. Full elucidation of the genetic etiology of prostate cancer is difficult because of incomplete penetrance and genetic and phenotypic heterogeneity. Current evidence suggests that genetic linkage to prostate cancer has been found on several chromosomes including the X; however, identification of causative genes has been elusive. Parametric and non-parametric linkage analyses were performed using 26 microsatellite markers in each of 11 groups of multiple-case prostate cancer families from the International Consortium for Prostate Cancer Genetics (ICPCG). Meta-analyses of the resultant family-specific linkage statistics across the entire 1,323 families and in several predefined subsets were then performed. Meta-analyses of linkage statistics resulted in a maximum parametric heterogeneity lod score (HLOD) of 1.28, and an allele-sharing lod score (LOD) of 2.0 in favor of linkage to Xq27-q28 at 138 cM. In subset analyses, families with average age at onset less than 65 years exhibited a maximum HLOD of 1.8 (at 138 cM) versus a maximum regional HLOD of only 0.32 in families with average age at onset of 65 years or older. Surprisingly, the subset of families with only 2-3 affected men and some evidence of male-to-male transmission of prostate cancer gave the strongest evidence of linkage to the region (HLOD = 3.24, 134 cM). For this subset, the HLOD was slightly increased (HLOD = 3.47 at 134 cM) when families used in the original published report of linkage to Xq27-28 were excluded. Although there was not strong support for linkage to the Xq27-28 region in the complete set of families, the subset of families with earlier age at onset exhibited more evidence of linkage than families with later onset of disease. A subset of families with 2-3 affected individuals and with some evidence of male to male disease transmission showed stronger linkage signals. 
Our results suggest that the genetic basis for prostate cancer in our families is much more complex than a single susceptibility locus on the X chromosome, and that future explorations of the Xq27-28 region should focus on the subset of families identified here with the strongest evidence of linkage to this region.
Nonparametric Estimation of the Probability of Ruin.
1985-02-01
Frees, Edward W.
Mathematics Research Center technical summary report (MRC/TSR), University of Wisconsin-Madison, on nonparametric estimation of the probability of ruin.
Marginally specified priors for non-parametric Bayesian estimation
Kessler, David C.; Hoff, Peter D.; Dunson, David B.
2014-01-01
Summary Prior specification for non-parametric Bayesian inference involves the difficult task of quantifying prior knowledge about a parameter of high, often infinite, dimension. A statistician is unlikely to have informed opinions about all aspects of such a parameter but will have real information about functionals of the parameter, such as the population mean or variance. The paper proposes a new framework for non-parametric Bayes inference in which the prior distribution for a possibly infinite dimensional parameter is decomposed into two parts: an informative prior on a finite set of functionals, and a non-parametric conditional prior for the parameter given the functionals. Such priors can be easily constructed from standard non-parametric prior distributions in common use and inherit the large support of the standard priors on which they are based. Additionally, posterior approximations under these informative priors can generally be made via minor adjustments to existing Markov chain approximation algorithms for standard non-parametric prior distributions. We illustrate the use of such priors in the context of multivariate density estimation using Dirichlet process mixture models, and in the modelling of high dimensional sparse contingency tables. PMID:25663813
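To make the "standard non-parametric prior distributions in common use" concrete: a minimal sketch of Dirichlet process mixture density estimation using scikit-learn's truncated variational approximation. The data and truncation level are hypothetical; this shows the standard DP mixture that such marginally specified priors build on, not the paper's proposed prior itself.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Two-component synthetic data; the Dirichlet-process prior should
# concentrate weight on roughly two active components.
x = np.concatenate([rng.normal(-2, 0.5, 300),
                    rng.normal(3, 1.0, 300)]).reshape(-1, 1)

dpmm = BayesianGaussianMixture(
    n_components=10,                                   # truncation level
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.5,                    # assumed concentration
    random_state=0,
).fit(x)

# Components carrying non-negligible posterior weight.
active = int(np.sum(dpmm.weights_ > 0.01))
```

The large truncation level illustrates the "large support" point: the prior does not fix the number of mixture components in advance.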
Krewski, Daniel; Burnett, Richard; Jerrett, Michael; Pope, C Arden; Rainham, Daniel; Calle, Eugenia; Thurston, George; Thun, Michael
This article provides an overview of previous analysis and reanalysis of the American Cancer Society (ACS) cohort, along with an indication of current ongoing analyses of the cohort with additional follow-up information through to 2000. Results of the first analysis, conducted by Pope et al. (1995), showed that higher average sulfate levels were associated with increased mortality, particularly from cardiopulmonary disease. A reanalysis of the ACS cohort, undertaken by Krewski et al. (2000), found the original risk estimates for fine-particle and sulfate air pollution to be highly robust against alternative statistical techniques and spatial modeling approaches. A detailed investigation of covariate effects found a significant modifying effect of education, with the risk of mortality associated with fine particles declining with increasing educational attainment. Pope et al. (2002) reported results of a subsequent study using an additional 10 yr of follow-up of the ACS cohort. This updated analysis included gaseous copollutant and new fine-particle measurements, more comprehensive information on occupational exposures, dietary variables, and the most recent developments in statistical modeling integrating random effects and nonparametric spatial smoothing into the Cox proportional hazards model. Robust associations between ambient fine particulate air pollution and elevated risks of cardiopulmonary and lung cancer mortality were clearly evident, providing the strongest evidence to date that long-term exposure to fine particles is an important health risk. Current ongoing analysis using the extended follow-up information will explore the role of ecologic, economic, and demographic covariates in the particulate air pollution and mortality association.
This analysis will also provide insight into the role of spatial autocorrelation at multiple geographic scales, and whether critical instances in time of exposure to fine particles influence the risk of mortality from cardiopulmonary and lung cancer. Information on the influence of covariates at multiple scales and of critical exposure time windows can assist policymakers in establishing timelines for regulatory interventions that maximize population health benefits.
Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS)
Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M.; Khan, Ajmal
2016-01-01
Objective: This article compares the study designs and statistical methods used in the 2005, 2010 and 2015 volumes of the Pakistan Journal of Medical Sciences (PJMS). Methods: Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. Results: A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods were found in the analysis. The most frequent methods included descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and Student's t-test (n=186, 43.4%). There was a significant increase over the study period in the use of the t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. Conclusion: This study shows that a diverse variety of statistical methods have been used in the research articles of PJMS and that their frequency increased from 2005 to 2015. However, descriptive statistics remained the most frequent method of analysis in the published articles, while the cross-sectional design was the most common study design. PMID:27022365
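For reference, the two most frequently reported tests above have one-line implementations in SciPy; the 2×2 counts below are hypothetical, purely to show the calls.

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 contingency table: outcome (rows) by group (columns).
table = np.array([[12, 5],
                  [8, 15]])

# Pearson chi-square test of independence (SciPy applies Yates' continuity
# correction by default for 2x2 tables).
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# Fisher's exact test, preferred when expected cell counts are small.
odds_ratio, p_fisher = stats.fisher_exact(table)
```

Both tests address the same independence question; Fisher's exact test avoids the chi-square's large-sample approximation.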
Intraocular pressure and pulsatile ocular blood flow after retrobulbar and peribulbar anaesthesia
Watkins, R.; Beigi, B.; Yates, M.; Chang, B.; Linardos, E.
2001-01-01
AIMS—This study investigated the effect of peribulbar and retrobulbar local anaesthesia on intraocular pressure (IOP) and pulsatile ocular blood flow (POBF), as such anaesthetic techniques may adversely affect these parameters. METHODS—20 eyes of 20 patients who were to undergo phacoemulsification cataract surgery were prospectively randomised to receive peribulbar or retrobulbar anaesthesia. The OBF tonometer (OBF Labs, Wiltshire, UK) was used to simultaneously measure IOP and POBF before anaesthesia and 1 minute and 10 minutes after anaesthesia. Between group comparisons of age, baseline IOP, and baseline POBF were performed using the non-parametric Mann-Whitney test. Within group comparisons of IOP and POBF measured preanaesthesia and post-anaesthesia were performed using the non-parametric Wilcoxon signed ranks test for both groups. RESULTS—There was no statistically significant IOP increase post-anaesthesia in either group. In the group receiving peribulbar anaesthesia, there was a significant reduction in POBF initially post-anaesthesia which recovered after 10 minutes. In the group receiving retrobulbar anaesthesia, there was a persistent statistically significant reduction in POBF. CONCLUSIONS—Retrobulbar and peribulbar injections have little effect on IOP. Ocular compression is not needed for IOP reduction when using local anaesthesia for cataract surgery. Conversely, POBF falls, at least for a short time, when anaesthesia for ophthalmic surgery is administered via a retrobulbar route or a peribulbar route. This reduction may be mediated by pharmacologically altered orbital vascular tone. It may be safer to use other anaesthetic techniques in patients with ocular vascular compromise. PMID:11423451
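The two tests used in this study are standard SciPy calls. The sketch below applies them to simulated stand-ins for the POBF measurements; all numbers are hypothetical, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical POBF-style readings for two independent groups (10 eyes each).
peribulbar  = rng.normal(1000, 150, 10)
retrobulbar = rng.normal(850, 150, 10)

# Between-group comparison: Mann-Whitney U test.
u_stat, p_between = stats.mannwhitneyu(peribulbar, retrobulbar,
                                       alternative="two-sided")

# Within-group pre/post comparison: Wilcoxon signed-rank test on paired data.
pre  = rng.normal(1000, 150, 10)
post = pre - rng.normal(120, 40, 10)   # simulate a consistent post-drop
w_stat, p_within = stats.wilcoxon(pre, post)
```

Both tests are rank-based, so they make no normality assumption about the small samples, which is why the study chose them.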
Trends and shifts in streamflow in Hawaii, 1913-2008
Bassiouni, Maoya; Oki, Delwyn S.
2013-01-01
This study addresses a need to document changes in streamflow and base flow (groundwater discharge to streams) in Hawai'i during the past century. Statistically significant long-term (1913-2008) downward trends were detected (using the nonparametric Mann-Kendall test) in low-streamflow and base-flow records. These long-term downward trends are likely related to a statistically significant downward shift around 1943 detected (using the nonparametric Pettitt test) in index records of streamflow and base flow. The downward shift corresponds to a decrease of 22% in median streamflow and a decrease of 23% in median base flow between the periods 1913-1943 and 1943-2008. The shift coincides with other local and regional factors, including a change from a positive to a negative phase in the Pacific Decadal Oscillation, shifts in the direction of the trade winds over Hawai'i, and a reforestation programme. The detected shift and long-term trends reflect region-wide changes in climatic and land-cover factors. A weak pattern of downward trends in base flows during the period 1943-2008 may indicate a continued decrease in base flows after the 1943 shift. Downward trends were detected more commonly in base-flow records than in high-streamflow, peak-flow, and rainfall records. The decrease in base flow is likely related to a decrease in groundwater storage and recharge and therefore is a valuable indicator of decreasing water availability and watershed vulnerability to hydrologic changes. Whether the downward trends will continue is largely uncertain given the uncertainty in climate-change projections and watershed responses to changes.
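The Mann-Kendall trend test has a compact closed form. Below is a minimal sketch (ties ignored, normal approximation with continuity correction) run on a synthetic declining record, not the actual Hawai'i streamflow data.

```python
import numpy as np
from scipy import stats

def mann_kendall(y):
    """Two-sided Mann-Kendall trend test (no tie correction), normal approximation."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # S: count of later-minus-earlier pairs that increase, minus those that decrease.
    s = sum(np.sign(y[j] - y[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
    p = 2 * stats.norm.sf(abs(z))
    return s, z, p

rng = np.random.default_rng(2)
t = np.arange(60)
flow = 100 - 0.5 * t + rng.normal(0, 3, 60)   # synthetic declining base-flow record
s, z, p = mann_kendall(flow)
```

Because the statistic depends only on the signs of pairwise differences, it is insensitive to the skewness typical of low-flow records.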
Karathanasis, Nestoras; Tsamardinos, Ioannis
2016-01-01
Background The advance of omics technologies has made it possible to measure several data modalities on a system of interest. In this work, we illustrate how the Non-Parametric Combination methodology, namely NPC, can be used for simultaneously assessing the association of different molecular quantities with an outcome of interest. We argue that NPC methods have several potential applications in integrating heterogeneous omics technologies, for example identifying genes whose methylation and transcriptional levels are jointly deregulated, or finding proteins whose abundance shows the same trends as the expression of their encoding genes. Results We implemented the NPC methodology within “omicsNPC”, an R function specifically tailored for the characteristics of omics data. We compare omicsNPC against a range of alternative methods on simulated as well as on real data. Comparisons on simulated data point out that omicsNPC produces unbiased/calibrated p-values and performs equally well as or significantly better than the other methods included in the study; furthermore, the analysis of real data shows that omicsNPC (a) exhibits higher statistical power than other methods, (b) is easily applicable in a number of different scenarios, and (c) yields results with improved biological interpretability. Conclusions The omicsNPC function behaves competitively in all comparisons conducted in this study. Taking into account that the method (i) requires minimal assumptions, (ii) can be used on different study designs and (iii) captures the dependencies among heterogeneous data modalities, omicsNPC provides a flexible and statistically powerful solution for the integrative analysis of different omics data. PMID:27812137
NASA Astrophysics Data System (ADS)
Matos, José P.; Schaefli, Bettina; Schleiss, Anton J.
2017-04-01
Uncertainty affects hydrological modelling efforts from the very measurements (or forecasts) that serve as inputs to the more or less inaccurate predictions that are produced. Uncertainty is truly inescapable in hydrology and yet, due to the theoretical and technical hurdles associated with its quantification, it is at times still neglected or estimated only qualitatively. In recent years the scientific community has made a significant effort towards quantifying this hydrologic prediction uncertainty. Despite this, most of the developed methodologies can be computationally demanding, are complex from a theoretical point of view, require substantial expertise to be employed, and are constrained by a number of assumptions about the model error distribution. These assumptions limit the reliability of many methods in case of errors that show particular cases of non-normality, heteroscedasticity, or autocorrelation. The present contribution builds on a non-parametric data-driven approach that was developed for uncertainty quantification in operational (real-time) forecasting settings. The approach is based on the concept of Pareto optimality and can be used as a standalone forecasting tool or as a postprocessor. By virtue of its non-parametric nature and a general operating principle, it can be applied directly and with ease to predictions of streamflow, water stage, or even accumulated runoff. Also, it is a methodology capable of coping with high heteroscedasticity and seasonal hydrological regimes (e.g. snowmelt and rainfall driven events in the same catchment). Finally, the training and operation of the model are very fast, making it a tool particularly adapted to operational use. To illustrate its practical use, the uncertainty quantification method is coupled with a process-based hydrological model to produce statistically reliable forecasts for an Alpine catchment located in Switzerland. 
Results are presented and discussed in terms of their reliability and resolution.
Feng, Dai; Cortese, Giuliana; Baumgartner, Richard
2017-12-01
The receiver operating characteristic (ROC) curve is frequently used as a measure of accuracy of continuous markers in diagnostic tests. The area under the ROC curve (AUC) is arguably the most widely used summary index for the ROC curve. Although the small sample size scenario is common in medical tests, a comprehensive study of small sample size properties of various methods for the construction of the confidence/credible interval (CI) for the AUC has been by and large missing in the literature. In this paper, we describe and compare 29 non-parametric and parametric methods for the construction of the CI for the AUC when the number of available observations is small. The methods considered include not only those that have been widely adopted, but also those that have been less frequently mentioned or, to our knowledge, never applied to the AUC context. To compare different methods, we carried out a simulation study with data generated from binormal models with equal and unequal variances and from exponential models with various parameters and with equal and unequal small sample sizes. We found that the larger the true AUC value and the smaller the sample size, the larger the discrepancy among the results of different approaches. When the model is correctly specified, the parametric approaches tend to outperform the non-parametric ones. Moreover, in the non-parametric domain, we found that a method based on the Mann-Whitney statistic is in general superior to the others. We further elucidate potential issues and provide possible solutions, along with general guidance on CI construction for the AUC when the sample size is small. Finally, we illustrate the utility of different methods through real life examples.
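The Mann-Whitney-based approaches rest on the identity AUC = U/(n₁n₂), where U is the Mann-Whitney statistic for the diseased group. A small sketch with hypothetical marker values (illustrating the identity, not any of the 29 CI methods verbatim):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
diseased = rng.normal(1.0, 1.0, 15)   # hypothetical marker values, small samples
healthy  = rng.normal(0.0, 1.0, 12)
n1, n2 = len(diseased), len(healthy)

# Mann-Whitney U for the diseased group, from the joint ranks.
ranks = stats.rankdata(np.concatenate([diseased, healthy]))
u1 = ranks[:n1].sum() - n1 * (n1 + 1) / 2

auc = u1 / (n1 * n2)                  # empirical AUC = U / (n1 * n2)

# Cross-check against the direct pairwise definition of the AUC.
direct = np.mean([(d > h) + 0.5 * (d == h) for d in diseased for h in healthy])
```

The two quantities agree exactly, which is why CI methods for the Mann-Whitney statistic transfer directly to the AUC.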
Nonparametric test of consistency between cosmological models and multiband CMB measurements
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aghamousa, Amir; Shafieloo, Arman, E-mail: amir@apctp.org, E-mail: shafieloo@kasi.re.kr
2015-06-01
We present a novel nonparametric approach to testing the consistency of cosmological models with multiband CMB data. In our analysis we calibrate the REACT (Risk Estimation and Adaptation after Coordinate Transformation) confidence levels associated with distances in function space (confidence distances) based on Monte Carlo simulations in order to test the consistency of an assumed cosmological model with observation. To show the applicability of our algorithm, we confront Planck 2013 temperature data with the concordance model of cosmology, considering two different Planck spectra combinations. In order to have an accurate quantitative statistical measure to compare the data and the theoretical expectations, we calibrate REACT confidence distances and perform a bias control using many realizations of the data. Our results in this work using Planck 2013 temperature data put the best fit ΛCDM model at 95% (∼ 2σ) confidence distance from the center of the nonparametric confidence set; when the analysis is repeated excluding the Planck 217 × 217 GHz spectrum data, the best fit ΛCDM model shifts to 70% (∼ 1σ) confidence distance. The most prominent features in the data deviating from the best fit ΛCDM model seem to be at low multipoles 18 < ℓ < 26 at greater than 2σ, ℓ ∼ 750 at ∼1 to 2σ, and ℓ ∼ 1800 at greater than 2σ level. Excluding the 217×217 GHz spectrum, the feature at ℓ ∼ 1800 becomes substantially less significant, at ∼1 to 2σ confidence level. Results of our analysis based on the new approach we propose in this work are in agreement with other analyses done using alternative methods.
Recent trends in counts of migrant hawks from northeastern North America
Titus, K.; Fuller, M.R.
1990-01-01
Using simple regression, pooled-sites route-regression, and nonparametric rank-trend analyses, we evaluated trends in counts of hawks migrating past 6 eastern hawk lookouts from 1972 to 1987. The indexing variable was the total count for a season. Bald eagle (Haliaeetus leucocephalus), peregrine falcon (Falco peregrinus), merlin (F. columbarius), osprey (Pandion haliaetus), and Cooper's hawk (Accipiter cooperii) counts increased using route-regression and nonparametric methods (P < 0.10). We found no consistent trends (P > 0.10) in counts of sharp-shinned hawks (A. striatus), northern goshawks (A. gentilis), red-shouldered hawks (Buteo lineatus), red-tailed hawks (B. jamaicensis), rough-legged hawks (B. lagopus), and American kestrels (F. sparverius). Broad-winged hawk (B. platypterus) counts declined (P < 0.05) based on the route-regression method. Empirical comparisons of our results with those for well-studied species such as the peregrine falcon, bald eagle, and osprey indicated agreement with nesting surveys. We suggest that counts of migrant hawks are a useful and economical method for detecting long-term trends in species across regions, particularly for species that otherwise cannot be easily surveyed.
Abecasis, Gonçalo R; Burt, Rachel A; Hall, Diana; Bochum, Sylvia; Doheny, Kimberly F; Lundy, S Laura; Torrington, Marie; Roos, J Louw; Gogos, Joseph A; Karayiorgou, Maria
2004-03-01
We report on our initial genetic linkage studies of schizophrenia in the genetically isolated population of the Afrikaners from South Africa. A 10-cM genomewide scan was performed on 143 small families, 34 of which were informative for linkage. Using both nonparametric and parametric linkage analyses, we obtained evidence for a small number of disease loci on chromosomes 1, 9, and 13. These results suggest that few genes of substantial effect exist for schizophrenia in the Afrikaner population, consistent with our previous genealogical tracing studies. The locus on chromosome 1 reached genomewide significance levels (nonparametric LOD score of 3.30 at marker D1S1612, corresponding to an empirical P value of .012) and represents a novel susceptibility locus for schizophrenia. In addition to providing evidence for linkage to chromosome 1, we also identified a proband with a uniparental disomy (UPD) of the entire chromosome 1. This is the first time a UPD has been described in a patient with schizophrenia, lending further support to involvement of chromosome 1 in schizophrenia susceptibility in the Afrikaners.
NASA Astrophysics Data System (ADS)
Kovalenko, I. D.; Doressoundiram, A.; Lellouch, E.; Vilenius, E.; Müller, T.; Stansberry, J.
2017-11-01
Context. Gravitationally bound multiple systems provide an opportunity to estimate the mean bulk density of the objects, whereas this characteristic is not available for single objects. Being a primitive population of the outer solar system, binary and multiple trans-Neptunian objects (TNOs) provide unique information about bulk density and internal structure, improving our understanding of their formation and evolution. Aims: The goal of this work is to analyse parameters of multiple trans-Neptunian systems, observed with Herschel and Spitzer space telescopes. Particularly, statistical analysis is done for radiometric size and geometric albedo, obtained from photometric observations, and for estimated bulk density. Methods: We use Monte Carlo simulation to estimate the real size distribution of TNOs. For this purpose, we expand the dataset of diameters by adopting the Minor Planet Center database list with available values of the absolute magnitude therein, and the albedo distribution derived from Herschel radiometric measurements. We use the 2-sample Anderson-Darling non-parametric statistical method for testing whether two samples of diameters, for binary and single TNOs, come from the same distribution. Additionally, we use the Spearman's coefficient as a measure of rank correlations between parameters. Uncertainties of estimated parameters together with lack of data are taken into account. Conclusions about correlations between parameters are based on statistical hypothesis testing. Results: We have found that the difference in size distributions of multiple and single TNOs is biased by small objects. The test on correlations between parameters shows that the effective diameter of binary TNOs strongly correlates with heliocentric orbital inclination and with magnitude difference between components of binary system. The correlation between diameter and magnitude difference implies that small and large binaries are formed by different mechanisms. 
Furthermore, the statistical test indicates, although not significantly given the sample size, that a moderately strong correlation exists between diameter and bulk density. Herschel is an ESA space observatory with science instruments provided by European-led Principal Investigator consortia and with important participation from NASA.
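Both statistical tools used in this analysis are available in SciPy. The sketch below runs them on simulated diameters and inclinations; the values are purely illustrative, not the Herschel/Spitzer sample.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical diameter samples (km) for binary and single objects.
binary_diam = rng.lognormal(mean=5.0, sigma=0.6, size=40)
single_diam = rng.lognormal(mean=5.0, sigma=0.6, size=60)

# Two-sample Anderson-Darling test: do the samples share a parent distribution?
ad = stats.anderson_ksamp([binary_diam, single_diam])

# Spearman rank correlation between diameter and a second orbital parameter.
inclination = rng.uniform(0, 30, 40)   # hypothetical inclinations (degrees)
rho, p_rho = stats.spearmanr(binary_diam, inclination)
```

Both are rank-based, so they tolerate the heavy-tailed size distributions typical of TNO samples; the Anderson-Darling test is more sensitive in the distribution tails than the Kolmogorov-Smirnov alternative.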
Hazardous medical waste generation rates of different categories of health-care facilities
DOE Office of Scientific and Technical Information (OSTI.GOV)
Komilis, Dimitrios, E-mail: dkomilis@env.duth.gr; Fouki, Anastassia; Papadopoulos, Dimitrios
Highlights: We calculated hazardous medical waste generation rates (HMWGR) from 132 hospitals. Based on a 22-month study period, HMWGR were highly skewed to the right. The HMWGR varied from 0.00124 to 0.718 kg bed^-1 d^-1. A positive correlation existed between the HMWGR and the number of hospital beds. We used non-parametric statistics to compare rates among hospital categories. Abstract: The goal of this work was to calculate the hazardous medical waste unit generation rates (HMWUGR), in kg bed^-1 d^-1, using data from 132 health-care facilities in Greece. The calculations were based on the weights of the hazardous medical wastes that were regularly transferred to the sole medical waste incinerator in Athens over a 22-month period during years 2009 and 2010. The 132 health-care facilities were grouped into public and private ones, and also into seven sub-categories, namely: birth, cancer treatment, general, military, pediatric, psychiatric and university hospitals. Results showed that there is large variability in the HMWUGR, even among hospitals of the same category. Average total HMWUGR varied from 0.012 kg bed^-1 d^-1, for the public psychiatric hospitals, to 0.72 kg bed^-1 d^-1, for the public university hospitals. Within the private hospitals, average HMWUGR ranged from 0.0012 kg bed^-1 d^-1, for the psychiatric clinics, to 0.49 kg bed^-1 d^-1, for the birth clinics. Based on non-parametric statistics, HMWUGR were statistically similar for the birth and general hospitals, in both the public and private sectors. The private birth and general hospitals generated statistically more wastes compared to the corresponding public hospitals.
The infectious/toxic and toxic medical wastes appear to be 10% and 50% of the total hazardous medical wastes generated by the public cancer treatment and university hospitals, respectively.
Alternating event processes during lifetimes: population dynamics and statistical inference.
Shinohara, Russell T; Sun, Yifei; Wang, Mei-Cheng
2018-01-01
In the literature studying recurrent event data, a large amount of work has focused on univariate recurrent event processes where the occurrence of each event is treated as a single point in time. There are many applications, however, in which univariate recurrent events are insufficient to characterize the process because patients experience nontrivial durations associated with each event. This results in an alternating event process where the disease status of a patient alternates between exacerbations and remissions. In this paper, we consider the dynamics of a chronic disease and its associated exacerbation-remission process over two time scales: calendar time and time-since-onset. In particular, over calendar time, we explore population dynamics and the relationship between incidence, prevalence and duration for such alternating event processes. We provide nonparametric estimation techniques for characteristic quantities of the process. In some settings, exacerbation processes are observed from an onset time until death; to account for the relationship between the survival and alternating event processes, nonparametric approaches are developed for estimating the exacerbation process over the lifetime. By understanding the population dynamics and within-process structure, the paper provides a new and general way to study alternating event processes.
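One calendar-time relationship discussed here, that the steady-state prevalence of an alternating process is governed by the incidence rate and mean episode duration, can be checked with a small simulation. The rates below are hypothetical, and this illustrates the renewal-theory identity rather than the paper's estimators.

```python
import numpy as np

rng = np.random.default_rng(6)
lam, mu = 0.2, 1.5   # assumed onset rate while in remission; mean exacerbation duration
T = 10_000.0         # follow-up window

# Alternating renewal process: exponential remission waits, exponential episodes.
t, sick_time = 0.0, 0.0
while True:
    t += rng.exponential(1 / lam)      # wait in remission until the next exacerbation
    if t >= T:
        break
    d = rng.exponential(mu)            # exacerbation duration
    sick_time += min(t + d, T) - t     # censor the episode at the end of follow-up
    t += d

prevalence = sick_time / T             # fraction of calendar time spent in exacerbation
steady_state = mu / (1 / lam + mu)     # renewal-theory prediction: duration / cycle length
```

The empirical prevalence converges to the steady-state value as the follow-up window grows, which is the population-dynamics link between incidence, prevalence and duration.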
A Machine Learning Framework for Plan Payment Risk Adjustment.
Rose, Sherri
2016-12-01
To introduce cross-validation and a nonparametric machine learning framework for plan payment risk adjustment and then assess whether they have the potential to improve risk adjustment. 2011-2012 Truven MarketScan database. We compare the performance of multiple statistical approaches within a broad machine learning framework for estimation of risk adjustment formulas. Total annual expenditure was predicted using age, sex, geography, inpatient diagnoses, and hierarchical condition category variables. The methods included regression, penalized regression, decision trees, neural networks, and an ensemble super learner, all in concert with screening algorithms that reduce the set of variables considered. The performance of these methods was compared based on cross-validated R^2. Our results indicate that a simplified risk adjustment formula selected via this nonparametric framework maintains much of the efficiency of a traditional larger formula. The ensemble approach also outperformed classical regression and all other algorithms studied. The implementation of cross-validated machine learning techniques provides novel insight into risk adjustment estimation, possibly allowing for a simplified formula, thereby reducing incentives for increased coding intensity as well as the ability of insurers to "game" the system with aggressive diagnostic upcoding. © Health Research and Educational Trust.
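The cross-validated model comparison at the heart of this framework can be sketched with scikit-learn. The data are synthetic, and the two models merely stand in for the paper's larger library of learners and screening algorithms.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for an expenditure-prediction task.
X, y = make_regression(n_samples=400, n_features=20, n_informative=8,
                       noise=10.0, random_state=0)

models = {
    "ols": LinearRegression(),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}

# Compare candidate formulas on cross-validated R^2, as in the paper's framework.
cv_r2 = {name: cross_val_score(m, X, y, cv=5, scoring="r2").mean()
         for name, m in models.items()}
```

Selecting the formula by out-of-sample R^2 rather than in-sample fit is what allows a simplified formula to be chosen without overstating its accuracy.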
Xu, Maoqi; Chen, Liang
2018-01-01
Individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about the read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance transcriptomic studies of cancers and other complex diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
NASA Astrophysics Data System (ADS)
Li, Zhengxiang; Gonzalez, J. E.; Yu, Hongwei; Zhu, Zong-Hong; Alcaniz, J. S.
2016-02-01
We apply two methods, i.e., the Gaussian processes and the nonparametric smoothing procedure, to reconstruct the Hubble parameter H (z ) as a function of redshift from 15 measurements of the expansion rate obtained from age estimates of passively evolving galaxies. These reconstructions enable us to derive the luminosity distance to a certain redshift z , calibrate the light-curve fitting parameters accounting for the (unknown) intrinsic magnitude of type Ia supernova (SNe Ia), and construct cosmological model-independent Hubble diagrams of SNe Ia. In order to test the compatibility between the reconstructed functions of H (z ), we perform a statistical analysis considering the latest SNe Ia sample, the so-called joint light-curve compilation. We find that, for the Gaussian processes, the reconstructed functions of Hubble parameter versus redshift, and thus the following analysis on SNe Ia calibrations and cosmological implications, are sensitive to prior mean functions. However, for the nonparametric smoothing method, the reconstructed functions are not dependent on initial guess models, and consistently require high values of H0, which are in excellent agreement with recent measurements of this quantity from Cepheids and other local distance indicators.
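A minimal sketch of a Gaussian-process reconstruction of H(z) with scikit-learn, using an assumed squared-exponential kernel and illustrative data points (not the actual 15 cosmic-chronometer measurements); the abstract's caveat about prior mean functions applies here too, since this sketch assumes a zero prior mean.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Illustrative H(z) measurements (km/s/Mpc) with 1-sigma errors.
z = np.array([0.1, 0.2, 0.4, 0.6, 0.9, 1.3, 1.75])
H = np.array([69., 73., 83., 88., 105., 168., 202.])
sigma = np.array([12., 8., 9., 11., 12., 17., 40.])

# Squared-exponential (RBF) kernel; hyperparameters optimized by marginal likelihood.
kernel = ConstantKernel(100.0**2) * RBF(length_scale=1.0)
gp = GaussianProcessRegressor(kernel=kernel,
                              alpha=sigma**2,     # per-point measurement variance
                              random_state=0)
gp.fit(z.reshape(-1, 1), H)

# Reconstruct H(z) and its uncertainty on a redshift grid.
z_grid = np.linspace(0.0, 2.0, 50).reshape(-1, 1)
H_rec, H_err = gp.predict(z_grid, return_std=True)
```

The reconstructed curve and its error band are what feed the distance integrals and SNe Ia calibration described above.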
NASA Astrophysics Data System (ADS)
Mulyani, Sri; Andriyana, Yudhie; Sudartianto
2017-03-01
Mean regression is a statistical method that explains the relationship between a response variable and a predictor variable through the central tendency (mean) of the response variable. Parameter estimation in mean regression (with Ordinary Least Squares, OLS) becomes problematic when applied to data that are asymmetric, fat-tailed, or contain outliers. Hence, an alternative method is needed for such data, for example quantile regression. Quantile regression is robust to outliers. This model can explain the relationship between the response variable and the predictor variable not only at the central tendency of the data (the median) but also at various quantiles, yielding more complete information about that relationship. In this study, quantile regression is developed with a nonparametric approach, namely smoothing splines. A nonparametric approach is used when the model is difficult to prespecify, i.e. the relation between the two variables follows an unknown function. We apply the proposed method to poverty data, with the Percentage of Poor People as the response variable and the Human Development Index (HDI) as the predictor variable.
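A linear (not spline-based) sketch of quantile regression via the check loss, on synthetic HDI-style data with heteroscedastic noise; this illustrates the estimator the study extends, not its smoothing-spline version, and all values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def pinball(beta, x, y, tau):
    """Check (pinball) loss for a linear model y ~ beta0 + beta1 * x at quantile tau."""
    r = y - (beta[0] + beta[1] * x)
    return np.sum(np.maximum(tau * r, (tau - 1.0) * r))

rng = np.random.default_rng(5)
x = rng.uniform(0.4, 0.8, 200)                 # hypothetical HDI values
# Poverty falls with HDI; noise spread grows with x (heteroscedasticity).
y = 80 - 90 * x + rng.normal(0, 1, 200) * (1 + 10 * (x - 0.4))

b1, b0 = np.polyfit(x, y, 1)                   # OLS fit as a starting point
fits = {tau: minimize(pinball, x0=[b0, b1], args=(x, y, tau),
                      method="Nelder-Mead").x
        for tau in (0.25, 0.50, 0.75)}
```

Minimizing the check loss at tau = 0.5 recovers median regression; the 0.25 and 0.75 fits trace how the conditional spread changes with the predictor, which OLS cannot show.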
Nonparametric Bayesian clustering to detect bipolar methylated genomic loci.
Wu, Xiaowei; Sun, Ming-An; Zhu, Hongxiao; Xie, Hehuang
2015-01-16
With recent development in sequencing technology, a large number of genome-wide DNA methylation studies have generated massive amounts of bisulfite sequencing data. The analysis of DNA methylation patterns helps researchers understand epigenetic regulatory mechanisms. Highly variable methylation patterns reflect stochastic fluctuations in DNA methylation, whereas well-structured methylation patterns imply deterministic methylation events. Among these methylation patterns, bipolar patterns are important as they may originate from allele-specific methylation (ASM) or cell-specific methylation (CSM). Utilizing nonparametric Bayesian clustering followed by hypothesis testing, we have developed a novel statistical approach to identify bipolar methylated genomic regions in bisulfite sequencing data. Simulation studies demonstrate that the proposed method achieves good performance in terms of specificity and sensitivity. We used the method to analyze data from mouse brain and human blood methylomes. The bipolar methylated segments detected are found highly consistent with the differentially methylated regions identified by using purified cell subsets. Bipolar DNA methylation often indicates epigenetic heterogeneity caused by ASM or CSM. With allele-specific events filtered out or appropriately taken into account, our proposed approach sheds light on the identification of cell-specific genes/pathways under strong epigenetic control in a heterogeneous cell population.
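A crude numerical stand-in for the idea of bipolarity (not the paper's nonparametric Bayesian clustering): reads at a bipolar locus split into a near-fully-methylated and a near-fully-unmethylated cluster, as expected under allele- or cell-specific methylation. The threshold values and toy read-level methylation fractions below are illustrative assumptions.

```python
import numpy as np

def bipolar_score(read_meth, lo_cut=0.2, hi_cut=0.8):
    """Score a locus by how evenly its reads split into a near-0 and a
    near-1 methylation cluster; 0 means no bipolarity, 0.5 means an even
    split. A simple proxy for the clustering-based detection in the paper."""
    read_meth = np.asarray(read_meth)
    low = np.mean(read_meth <= lo_cut)    # fraction of unmethylated reads
    high = np.mean(read_meth >= hi_cut)   # fraction of methylated reads
    return min(low, high)                 # both clusters must be populated

bipolar_reads = [0.0, 0.05, 0.1, 0.9, 0.95, 1.0, 1.0, 0.0]  # ASM/CSM-like
mid_reads = [0.4, 0.5, 0.55, 0.45, 0.6, 0.5, 0.35, 0.5]     # stochastic noise
print(bipolar_score(bipolar_reads), bipolar_score(mid_reads))  # 0.5 0.0
```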
Integrative genetic risk prediction using non-parametric empirical Bayes classification.
Zhao, Sihai Dave
2017-06-01
Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.
Yap, John Stephen; Fan, Jianqing; Wu, Rongling
2009-12-01
Estimation of the covariance structure of longitudinal processes is a fundamental prerequisite for the practical deployment of functional mapping designed to study the genetic regulation and network of quantitative variation in dynamic complex traits. We present a nonparametric approach for estimating the covariance structure of a quantitative trait measured repeatedly at a series of time points. Specifically, we adopt Huang et al.'s (2006, Biometrika 93, 85-98) approach of invoking the modified Cholesky decomposition and converting the problem into modeling a sequence of regressions of responses. A regularized covariance estimator is obtained using a normal penalized likelihood with an L2 penalty. This approach, embedded within a mixture likelihood framework, leads to enhanced accuracy, precision, and flexibility of functional mapping while preserving its biological relevance. Simulation studies are performed to reveal the statistical properties and advantages of the proposed method. A real example from a mouse genome project is analyzed to illustrate the utilization of the methodology. The new method will provide a useful tool for genome-wide scanning for the existence and distribution of quantitative trait loci underlying a dynamic trait important to agriculture, biology, and health sciences.
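The modified Cholesky decomposition that converts covariance estimation into a sequence of regressions can be shown directly. This is a bare, unregularized sketch (no L2 penalty, no mixture likelihood): row t of T holds the negated coefficients from regressing measurement t on its predecessors, and D holds the innovation variances, so that T Σ Tᵀ = D exactly.

```python
import numpy as np

def modified_cholesky(Sigma):
    """Decompose Sigma via sequential regressions: returns unit lower-
    triangular T and diagonal D with T @ Sigma @ T.T == D."""
    p = Sigma.shape[0]
    T = np.eye(p)
    D = np.zeros(p)
    D[0] = Sigma[0, 0]
    for t in range(1, p):
        S11 = Sigma[:t, :t]
        s12 = Sigma[:t, t]
        phi = np.linalg.solve(S11, s12)   # regression coefficients on the past
        T[t, :t] = -phi
        D[t] = Sigma[t, t] - s12 @ phi    # innovation (residual) variance
    return T, np.diag(D)

# Check on an AR(1)-like longitudinal covariance with correlation 0.6
idx = np.arange(4)
Sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])
T, D = modified_cholesky(Sigma)
print(np.allclose(T @ Sigma @ T.T, D))  # True
```

Regularization in the paper enters by penalizing the regression coefficients (the sub-diagonal entries of T) rather than the covariance matrix itself, which guarantees positive definiteness of the reassembled estimator.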
The roles of autophagy and hypoxia in human inflammatory periapical lesions.
Huang, H Y; Wang, W C; Lin, P Y; Huang, C P; Chen, C Y; Chen, Y K
2018-02-01
To determine the expressions of hypoxia-related [hypoxia-inducible transcription factors (HIF)-1α, BCL2/adenovirus E1B 19 kDa protein-interacting protein 3 (BNIP3) and phospho-adenosine monophosphate activated protein kinase (pAMPK)] and autophagy-related [microtubule-associated protein 1 light chain 3 (LC3), beclin-1 (BECN-1), autophagy-related gene (Atg)5-12, and p62] proteins in human inflammatory periapical lesions. Fifteen samples of radicular cysts (RCs) and 21 periapical granulomas (PGs), combined with 17 healthy dental pulp tissues, were examined. Enzyme-linked immunosorbent assay (ELISA) was used to detect interleukin (IL)-1β cytokine; immunohistochemical (IHC) and Western blot (WB) analyses were employed to examine autophagy-related and hypoxia-related proteins. Transmission electron microscopy (TEM) was used to explore the ultrastructural morphology of autophagy in periapical lesions. Nonparametric Kruskal-Wallis tests and Mann-Whitney U-tests were used for statistical analyses. ELISA revealed a significantly higher (P < 0.001) IL-1β expression in periapical lesions than in normal pulp tissue. Immunoscores of IHC expressions of pAMPK, HIF-1α, BNIP3, BECN-1 and Atg5-12 proteins in periapical lesions were significantly higher (P < 0.001) (except BECN-1) than those in normal pulp tissue. The results of IHC studies were largely compatible with those of WB analyses, where significantly higher (P < 0.05) expressions of hypoxia-related and autophagy-related proteins (except BECN-1, p62 and LC3II in WB analyses) in periapical lesions were noted as compared to normal pulp tissue. Upon TEM, ultrastructural double-membrane autophagosomes and autolysosomes were observed in PGs and RCs. Autophagy associated with hypoxia may play a potential causative role in the development and maintenance of inflamed periapical lesions. © 2017 International Endodontic Journal. Published by John Wiley & Sons Ltd.
Verification of forecast ensembles in complex terrain including observation uncertainty
NASA Astrophysics Data System (ADS)
Dorninger, Manfred; Kloiber, Simon
2017-04-01
Traditionally, verification means to verify a forecast (ensemble) against the truth as represented by observations. The observation errors are quite often neglected, on the argument that they are small compared to the forecast error. In this study, part of the MesoVICT (Mesoscale Verification Inter-comparison over Complex Terrain) project, it will be shown that observation errors have to be taken into account for verification purposes. The observation uncertainty is estimated from the VERA (Vienna Enhanced Resolution Analysis) and represented via two analysis ensembles, which are compared to the forecast ensemble. Throughout the study, results from COSMO-LEPS provided by Arpae-SIMC Emilia-Romagna are used as the forecast ensemble. The time period covers the MesoVICT core case from 20-22 June 2007. In a first step, all ensembles are investigated with respect to their distribution. Several tests were executed (Kolmogorov-Smirnov test, Finkelstein-Schafer test, chi-square test, etc.), none of which identified an exact mathematical distribution. The main focus is therefore on nonparametric statistics (e.g. kernel density estimation, boxplots) and on the deviation between "forced" normally distributed data and the kernel density estimates. In a next step, the observational deviations due to the analysis ensembles are analysed. In a first approach, scores are calculated multiple times, with each ensemble member from the analysis ensemble in turn regarded as the "true" observation. The results are presented as boxplots for the different scores and parameters. Additionally, the bootstrapping method is applied to the ensembles. These possible approaches to incorporating observational uncertainty into the computation of statistics will be discussed in the talk.
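The distribution checks and resampling steps described above can be sketched with scipy. The gamma sample stands in for a skewed ensemble of forecast values; note that with parameters estimated from the sample, the Kolmogorov-Smirnov p-value is only approximate (the Lilliefors situation).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Skewed stand-in for an ensemble of forecast values at one station
ensemble = rng.gamma(shape=2.0, scale=1.5, size=500)

# Kolmogorov-Smirnov test against a normal fitted to the sample
ks_stat, ks_p = stats.kstest(ensemble, "norm",
                             args=(ensemble.mean(), ensemble.std(ddof=1)))

# Kernel density estimate: the nonparametric alternative to a fitted normal
kde = stats.gaussian_kde(ensemble)

# Bootstrap a 90% interval for the ensemble mean
boot = np.array([rng.choice(ensemble, ensemble.size, replace=True).mean()
                 for _ in range(2000)])
lo, hi = np.percentile(boot, [5, 95])
```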
Van Emon, Jeanette M.; Chuang, Jane C.; Lordo, Robert A.; Schrock, Mary E.; Nichkova, Mikaela; Gee, Shirley J.; Hammock, Bruce D.
2010-01-01
A 96-microwell enzyme-linked immunosorbent assay (ELISA) method was evaluated to determine PCDDs/PCDFs in sediment and soil samples from an EPA Superfund site. Samples were prepared and analyzed by both the ELISA and a gas chromatography/high resolution mass spectrometry (GC/HRMS) method. Comparable method precision, accuracy, and detection level (8 ng kg⁻¹) were achieved by the ELISA method with respect to GC/HRMS. However, the extraction and cleanup method developed for the ELISA requires refinement for the soil type that yielded a waxy residue after sample processing. Four types of statistical analyses (Pearson correlation coefficient, paired t-test, nonparametric tests, and McNemar's test of association) were performed to determine whether the two methods produced statistically different results. The log-transformed ELISA-derived 2,3,7,8-tetrachlorodibenzo-p-dioxin values and log-transformed GC/HRMS-derived TEQ values were significantly correlated (r = 0.79) at the 0.05 level. The median difference in values between ELISA and GC/HRMS was not significant at the 0.05 level. Low false negative and false positive rates (<10%) were observed for the ELISA when compared to the GC/HRMS at 1000 ng TEQ kg⁻¹. The findings suggest that immunochemical technology could be a complementary monitoring tool for determining concentrations at the 1000 ng TEQ kg⁻¹ action level for contaminated sediment and soil. The ELISA could also be used in an analytical triage approach to screen and rank samples prior to instrumental analysis. PMID:18313102
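The paired-method comparison described above can be sketched on simulated data. The lognormal values below are invented stand-ins for GC/HRMS TEQ measurements and a noisy ELISA counterpart; the code shows log-scale correlation, a nonparametric paired test, and classification agreement at the action level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical paired results: GC/HRMS TEQ (ng TEQ/kg) and noisy ELISA values
gc_teq = rng.lognormal(mean=6.0, sigma=1.2, size=60)
elisa = gc_teq * rng.lognormal(mean=0.0, sigma=0.4, size=60)

# Correlation and a paired location test on the log scale
r, _ = stats.pearsonr(np.log(elisa), np.log(gc_teq))
w_stat, w_p = stats.wilcoxon(np.log(elisa) - np.log(gc_teq))

# Agreement as screening classifiers at the 1000 ng TEQ/kg action level
action = 1000.0
false_pos = np.mean((elisa >= action) & (gc_teq < action))
false_neg = np.mean((elisa < action) & (gc_teq >= action))
```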
Beecroft, E V; Durham, J; Thomson, P
2013-03-01
To gain a deeper understanding of the clinical journey taken by orofacial pain patients from initial presentation in primary care to treatment by oral and maxillofacial surgery. Retrospective audit. Data were collected from 101 consecutive patients suffering from chronic orofacial pain, attending oral and maxillofacial surgery clinics between 2009 and 2010. Once the patients were identified, information was drawn from their hospital records and referral letters, and a predesigned proforma was completed by a single examiner (EVB). Basic descriptive statistics and non-parametric inferential statistical techniques (Kruskal-Wallis) were used to analyse the data. DATA AND DISCUSSION: Six definitive orofacial pain conditions were represented in the data set, 75% of which were temporomandibular disorders (TMD). Individuals within our study were treated in nine different hospital settings and were referred to 15 distinct specialties. The mean number of consultations received by the patients in our study across all care settings was seven (SD 5). The mean number of specialities by which the subjects were assessed was three (SD 1). The sample set had a total of 341 treatment attempts to manage their chronic orofacial pain conditions, of which only 83 (24%) yielded a successful outcome. Improved education and remuneration for primary care practitioners as well as clear care pathways for patients with chronic orofacial pain should be established to reduce multiple re-referrals and improve efficiency of care. The creation of specialist regional centres for chronic orofacial pain may be considered to manage severe cases and drive evidence-based practice.
Behnke, Anke; Bunge, John; Barger, Kathryn; Breiner, Hans-Werner; Alla, Victoria; Stoeck, Thorsten
2006-01-01
To resolve the fine-scale architecture of anoxic protistan communities, we conducted a cultivation-independent 18S rRNA survey in the superanoxic Framvaren Fjord in Norway. We generated three clone libraries along the steep O2/H2S gradient, using the multiple-primer approach. Of 1,100 clones analyzed, 753 proved to be high-quality protistan target sequences. These sequences were grouped into 92 phylotypes, which displayed high protistan diversity in the fjord (17 major eukaryotic phyla). Only a few were closely related to known taxa. Several sequences were dissimilar to all previously described sequences and occupied a basal position in the inferred phylogenies, suggesting that the sequences recovered were derived from novel, deeply divergent eukaryotes. We detected sequence clades with evolutionary importance (for example, clades in the euglenozoa) and clades that seem to be specifically adapted to anoxic environments, challenging the hypothesis that the global dispersal of protists is uniform. Moreover, with the detection of clones affiliated with jakobid flagellates, we present evidence that primitive descendants of early eukaryotes are present in this anoxic environment. To estimate sample coverage and phylotype richness, we used parametric and nonparametric statistical methods. The results show that although our data set is one of the largest published inventories, our sample missed a substantial proportion of the protistan diversity. Nevertheless, statistical and phylogenetic analyses of the three libraries revealed the fine-scale architecture of anoxic protistan communities, which may exhibit adaptation to different environmental conditions along the O2/H2S gradient. PMID:16672511
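The nonparametric richness estimation mentioned at the end of the abstract is commonly done with the Chao1 estimator, which uses singleton and doubleton counts to bound the number of unseen phylotypes. A minimal sketch with invented clone-library abundances (not the Framvaren data):

```python
import numpy as np

def chao1(counts):
    """Chao1 nonparametric lower bound on total species/phylotype richness.
    counts: observed abundance of each phylotype in the clone library."""
    counts = np.asarray(counts)
    s_obs = np.count_nonzero(counts)      # observed phylotypes
    f1 = int(np.sum(counts == 1))         # singletons
    f2 = int(np.sum(counts == 2))         # doubletons
    if f2 == 0:                           # bias-corrected form
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)

# Toy abundances: many rare phylotypes imply substantial unseen diversity
abund = [40, 21, 10, 5, 2, 2, 1, 1, 1, 1, 1]
print(chao1(abund))  # 17.25: estimated richness exceeds the 11 observed
```

A large gap between Chao1 and the observed count is exactly the "sample missed a substantial proportion of the diversity" conclusion reported above.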
Browne, Richard W; Whitcomb, Brian W
2010-07-01
Problems in the analysis of laboratory data commonly arise in epidemiologic studies in which biomarkers subject to lower detection thresholds are used. Various thresholds exist including limit of detection (LOD), limit of quantification (LOQ), and limit of blank (LOB). Choosing appropriate strategies for dealing with data affected by such limits relies on proper understanding of the nature of the detection limit and its determination. In this paper, we demonstrate experimental and statistical procedures generally used for estimating different detection limits according to standard procedures in the context of analysis of fat-soluble vitamins and micronutrients in human serum. Fat-soluble vitamins and micronutrients were analyzed by high-performance liquid chromatography with diode array detection. A simulated serum matrix blank was repeatedly analyzed for determination of LOB parametrically by using the observed blank distribution as well as nonparametrically by using ranks. The LOD was determined by combining information regarding the LOB with data from repeated analysis of standard reference materials (SRMs), diluted to low levels; from LOB to 2-3 times LOB. The LOQ was determined experimentally by plotting the observed relative standard deviation (RSD) of SRM replicates compared with the concentration, where the LOQ is the concentration at an RSD of 20%. Experimental approaches and example statistical procedures are given for determination of LOB, LOD, and LOQ. These quantities are reported for each measured analyte. For many analyses, there is considerable information available below the LOQ. Epidemiologic studies must understand the nature of these detection limits and how they have been estimated for appropriate treatment of affected data.
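The standard LOB/LOD computations described above can be sketched numerically. The blank and low-level SRM responses below are simulated stand-ins, and the 1.645 multiplier is the usual one-sided 95% normal quantile from the CLSI-style scheme the paper follows.

```python
import numpy as np

rng = np.random.default_rng(7)
blanks = rng.normal(0.5, 0.2, size=100)     # hypothetical blank-matrix responses
low_srm = rng.normal(1.2, 0.25, size=100)   # hypothetical low-level SRM replicates

# LOB: 95th percentile of the blank distribution,
# estimated parametrically and via ranks (nonparametrically)
lob_param = blanks.mean() + 1.645 * blanks.std(ddof=1)
lob_rank = np.percentile(blanks, 95)

# LOD: lowest level whose measurements exceed the LOB 95% of the time
lod = lob_param + 1.645 * low_srm.std(ddof=1)

# LOQ in the paper's scheme: the concentration at which the relative
# standard deviation (RSD) of SRM replicates falls to 20%
rsd = low_srm.std(ddof=1) / low_srm.mean()
```

In practice the RSD is computed at several diluted SRM levels and the LOQ read off where the RSD-versus-concentration curve crosses 20%.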
Buyuk, C; Gunduz, K; Avsever, H
2018-01-01
The aim of this investigation was to evaluate the length, thickness, sagittal and transverse angulations and the morphological variations of the stylohyoid complex (SHC), to assess their probable associations with age and gender, and to investigate the prevalence of it in a wide range of a Turkish sub-population by using cone beam computed tomography (CBCT). The CBCT images of the 1000 patients were evaluated retrospectively. The length, thickness, sagittal and transverse angulations, morphological variations and ossification degrees of SHC were evaluated on multiplanar reconstructions (MPR) adnd three-dimensional (3D) volume rendering (3DVR) images. The data were analysed statistically by using nonparametric tests, Pearson's correlation coefficient, Student's t test, c2 test and one-way ANOVA. Statistical significance was considered at p < 0.05. It was determined that 684 (34.2%) of all 2000 SHCs were elongated (> 35 mm). The mean sagittal angle value was measured to be 72.24° and the mean transverse angle value was 70.81°. Scalariform shape, elongated type and nodular calcification pattern have the highest mean age values between the morphological groups, respectively. Calcified outline was the most prevalent calcification pattern in males. There was no correlation between length and the calcification pattern groups while scalariform shape and pseudoarticular type were the longest variations. We observed that as the anterior sagittal angle gets wider, SHC tends to get longer. The most observed morphological variations were linear shape, elongated type and calcified outline pattern. Detailed studies on the classification will contribute to the literature. (Folia Morphol 2018; 77, 1: 79-89).
The influence of toothbrushing and coffee staining on different composite surface coatings.
Zimmerli, Brigitte; Koch, Tamara; Flury, Simon; Lussi, Adrian
2012-04-01
The aim of our study is to evaluate the performance of surface sealants and conventional polishing after ageing procedures. Eighty circular composite restorations were performed on extracted human molars. After standardised roughening, the restorations were either sealed with one of three surface sealants (Lasting Touch (LT), BisCover LV (BC), G-Coat Plus (GP) or a dentin adhesive Heliobond (HB)) or were manually polished with silicon polishers (MP) (n = 16). The average roughness (Ra) and colourimetric parameters (CP) (L*a*b*) were evaluated. The specimens underwent an artificial ageing process by thermocycling, staining (coffee) and abrasive (toothbrushing) procedures. After each ageing step, Ra and CP measurements were repeated. A qualitative surface analysis was performed with SEM. The differences between the test groups regarding Ra and CP values were analysed with nonparametric ANOVA analysis (α = 0.05). The lowest Ra values were achieved with HB. BC and GP resulted in Ra values below 0.2 μm (clinically relevant threshold), whereas LT and MP sometimes led to higher Ra values. LT showed a significantly higher discolouration after the first coffee staining, but this was normalised to the other groups after toothbrushing. The differences between the measurements and test groups for Ra and CP were statistically significant. However, the final colour difference showed no statistical difference among the five groups. SEM evaluation showed clear alterations after ageing in all coating groups. Surface sealants and dentin adhesives have the potential to reduce surface roughness but tend to debond over time. Surface sealants can only be recommended for polishing provisional restorations.
Geoscience in the Big Data Era: Are models obsolete?
NASA Astrophysics Data System (ADS)
Yuen, D. A.; Zheng, L.; Stark, P. B.; Morra, G.; Knepley, M.; Wang, X.
2016-12-01
In the last few decades, the velocity, volume, and variety of geophysical data have increased, while the development of the Internet and distributed computing has led to the emergence of "data science." Fitting and running numerical models, especially those based on PDEs, is the main consumer of flops in geoscience. Can large amounts of diverse data supplant modeling? Without the ability to conduct randomized, controlled experiments, causal inference requires understanding the physics. It is sometimes possible to predict well without understanding the system—if (1) the system is predictable, (2) data on "important" variables are available, and (3) the system changes slowly enough. And sometimes even a crude model can help the data "speak for themselves" much more clearly. For example, Shearer (1991) used a 1-dimensional velocity model to stack long-period seismograms, revealing upper mantle discontinuities. This was a "big data" approach: the main use of computing was in the data processing, rather than in modeling, yet the "signal" became clear. In contrast, modelers tend to use all available computing power to fit even more complex models, resulting in a cycle where uncertainty quantification (UQ) is never possible: even if realistic UQ required only 1,000 model evaluations, it is never in reach. To make more reliable inferences requires better data analysis and statistics, not more complex models. Geoscientists need to learn new skills and tools: sound software engineering practices; open programming languages suitable for big data; parallel and distributed computing; data visualization; and basic nonparametric, computationally based statistical inference, such as permutation tests. They should work reproducibly, scripting all analyses and avoiding point-and-click tools.
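The permutation tests recommended in the closing sentences can be written in a few lines. The data below are simulated, and the add-one convention in the p-value is one common choice for keeping it valid and strictly positive.

```python
import numpy as np

def permutation_test(x, y, n_perm=10_000, seed=0):
    """Two-sample permutation test for a difference in means: no
    distributional assumptions, exact under the null of exchangeability."""
    rng = np.random.default_rng(seed)
    observed = x.mean() - y.mean()
    pooled = np.concatenate([x, y])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # random relabeling of the groups
        diff = pooled[:x.size].mean() - pooled[x.size:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return (count + 1) / (n_perm + 1)       # add-one for a valid p-value

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 30)
b = rng.normal(1.0, 1.0, 30)                # shifted by one standard deviation
print(permutation_test(a, b))               # small p: the shift is detected
```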
Cross-validation and Peeling Strategies for Survival Bump Hunting using Recursive Peeling Methods
Dazard, Jean-Eudes; Choe, Michael; LeBlanc, Michael; Rao, J. Sunil
2015-01-01
We introduce a framework to build a survival/risk bump hunting model with a censored time-to-event response. Our Survival Bump Hunting (SBH) method is based on a recursive peeling procedure that uses a specific survival peeling criterion derived from non/semi-parametric statistics such as the hazard ratio, the log-rank test or the Nelson-Aalen estimator. To optimize the tuning parameter of the model and validate it, we introduce an objective function based on survival or prediction-error statistics, such as the log-rank test and the concordance error rate. We also describe two alternative cross-validation techniques adapted to the joint task of decision-rule making by recursive peeling and survival estimation. Numerical analyses show the importance of replicated cross-validation and the differences between criteria and techniques in both low and high-dimensional settings. Although several non-parametric survival models exist, none addresses the problem of directly identifying local extrema. We show how SBH efficiently estimates extreme survival/risk subgroups unlike other models. This provides an insight into the behavior of commonly used models and suggests alternatives to be adopted in practice. Finally, our SBH framework was applied to a clinical dataset. In it, we identified subsets of patients characterized by clinical and demographic covariates with a distinct extreme survival outcome, for which tailored medical interventions could be made. An R package PRIMsrc (Patient Rule Induction Method in Survival, Regression and Classification settings) is available on CRAN (Comprehensive R Archive Network) and GitHub. PMID:27034730
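The log-rank statistic used as a peeling criterion can be computed directly. This is a bare two-group version on simulated, uncensored survival times, not the SBH/PRIMsrc implementation: at each event time it compares observed versus expected events in group 1 under the null of equal hazards.

```python
import numpy as np
from scipy import stats

def logrank_test(time, event, group):
    """Two-group log-rank test (no distributional assumption): chi-square
    statistic and p-value from observed-minus-expected events in group 1."""
    o_minus_e, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()
        n1 = (at_risk & (group == 1)).sum()
        d = ((time == t) & (event == 1)).sum()          # events at time t
        d1 = ((time == t) & (event == 1) & (group == 1)).sum()
        o_minus_e += d1 - d * n1 / n
        if n > 1:                                        # hypergeometric variance
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = o_minus_e ** 2 / var
    return chi2, stats.chi2.sf(chi2, df=1)

rng = np.random.default_rng(4)
time = np.concatenate([rng.exponential(10, 100),   # group 0: shorter survival
                       rng.exponential(20, 100)])  # group 1: longer survival
event = np.ones(200, dtype=int)                    # no censoring in this toy case
group = np.repeat([0, 1], 100)
chi2, p = logrank_test(time, event, group)
print(p < 0.01)  # the survival difference is detected
```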
Temporal changes in water quality at a childhood leukemia cluster.
Seiler, Ralph L
2004-01-01
Since 1997, 15 cases of acute lymphocytic leukemia and one case of acute myelocytic leukemia have been diagnosed in children and teenagers who live, or have lived, in an area centered on the town of Fallon, Nevada. The expected rate for the population is about one case every five years. In 2001, 99 domestic and municipal wells and one industrial well were sampled in the Fallon area. Twenty-nine of these wells had been sampled previously in 1989. Statistical comparison of concentrations of major ions and trace elements in those 29 wells between 1989 and 2001 using the nonparametric Wilcoxon signed-rank test indicate water quality did not substantially change over that period; however, short-term changes may have occurred that were not detected. Volatile organic compounds were seldom detected in ground water samples and those that are regulated were consistently found at concentrations less than the maximum contaminant level (MCL). The MCL for gross-alpha radioactivity and arsenic, radon, and uranium concentrations were commonly exceeded, and sometimes were greatly exceeded. Statistical comparisons using the nonparametric Wilcoxon rank-sum test indicate gross-alpha and -beta radioactivity, arsenic, uranium, and radon concentrations in wells used by families having a child with leukemia did not statistically differ from the remainder of the domestic wells sampled during this investigation. Isotopic measurements indicate the uranium was natural and not the result of a 1963 underground nuclear bomb test near Fallon. In arid and semiarid areas where trace-element concentrations can greatly exceed the MCL, household reverse-osmosis units may not reduce their concentrations to safe levels. 
In parts of the world where radon concentrations are high, water consumed first thing in the morning may be appreciably more radioactive than water consumed a few minutes later after the pressure tank has been emptied because secular equilibrium between radon and its immediate daughter progeny is attained in pressure tanks overnight.
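The two rank tests used in the Fallon analysis can be sketched with scipy on simulated well data. The lognormal concentrations below are invented stand-ins, not the study's measurements: the signed-rank test handles the paired 1989/2001 resampling of the same 29 wells, and the rank-sum (Mann-Whitney) test handles the independent comparison of case-family wells versus the rest.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical arsenic concentrations (µg/L) in the same 29 wells, two visits
wells_1989 = rng.lognormal(mean=3.0, sigma=1.0, size=29)
wells_2001 = wells_1989 * rng.lognormal(mean=0.0, sigma=0.1, size=29)

# Paired comparison across years: Wilcoxon signed-rank test
stat_sr, p_sr = stats.wilcoxon(wells_1989, wells_2001)

# Two independent groups (e.g. case-family wells vs the rest): rank-sum test
group_a, group_b = wells_2001[:10], wells_2001[10:]
stat_rs, p_rs = stats.mannwhitneyu(group_a, group_b)
```

Rank tests are the natural choice here because trace-element concentrations are heavily right-skewed, as the abstract's MCL exceedances suggest.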
Introduction to multivariate discrimination
NASA Astrophysics Data System (ADS)
Kégl, Balázs
2013-07-01
Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to the needs of a practicing experimental physicist. After a short introduction to the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). 
We conclude the chapter with three essentially open research problems that are either relevant to or even motivated by certain unorthodox applications of multivariate discrimination in experimental physics.
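A toy version of the model-selection step discussed above, on a non-parametric regression example in the spirit of Section 2.1. Squared-error loss, polynomial fits as the model family, and 5-fold cross-validation are illustrative choices, not the chapter's specific setup.

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0.0, 1.0, 120)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, x.size)  # unknown true function

def cv_error(degree, k=5):
    """5-fold cross-validated squared error of a polynomial fit: the
    complexity-regularization step that separates high-complexity
    non-parametric fitting from fixed low-dimensional parametric fitting."""
    idx = np.arange(x.size)
    folds = np.array_split(idx, k)
    errs = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        coef = np.polyfit(x[train], y[train], degree)
        errs.append(np.mean((np.polyval(coef, x[fold]) - y[fold]) ** 2))
    return float(np.mean(errs))

errors = {d: cv_error(d) for d in (1, 3, 5, 9)}
best_degree = min(errors, key=errors.get)
```

Degree 1 underfits the sinusoid and very high degrees overfit the noise; cross-validation selects a complexity in between, which is exactly the hyperparameter-optimization problem the text emphasizes.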
Generalized Hurst exponent estimates differentiate EEG signals of healthy and epileptic patients
NASA Astrophysics Data System (ADS)
Lahmiri, Salim
2018-01-01
The aim of our current study is to check whether multifractal patterns of the electroencephalographic (EEG) signals of normal and epileptic patients are statistically similar or different. In this regard, the generalized Hurst exponent (GHE) method is used for robust estimation of the multifractals in each type of EEG signal, and three powerful statistical tests are performed to check for differences between the GHEs estimated from healthy control subjects and from epileptic patients. The obtained results show that multifractals exist in both types of EEG signals. In particular, the degree of fractality is more pronounced in short variations of normal EEG signals than in short variations of EEG signals with seizure-free intervals. Conversely, it is more pronounced in long variations of EEG signals with seizure-free intervals than in normal EEG signals. Importantly, both parametric and nonparametric statistical tests show strong evidence that the estimated GHEs of normal EEG signals are significantly different from those with seizure-free intervals. Therefore, GHEs can be efficiently used to distinguish between healthy subjects and patients suffering from epilepsy.
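The GHE can be estimated from the scaling of the q-th order structure function, K_q(tau) ~ tau^(qH(q)). A minimal numpy sketch, sanity-checked on simulated Brownian motion (true H = 0.5) rather than EEG data; the lag range is an illustrative choice.

```python
import numpy as np

def ghe(x, q=2, taus=range(1, 20)):
    """Generalized Hurst exponent H(q): slope of log K_q(tau) vs log tau,
    divided by q, where K_q(tau) = mean |x(t+tau) - x(t)|^q."""
    x = np.asarray(x, dtype=float)
    log_k, log_t = [], []
    for tau in taus:
        diffs = np.abs(x[tau:] - x[:-tau])
        log_k.append(np.log(np.mean(diffs ** q)))
        log_t.append(np.log(tau))
    slope = np.polyfit(log_t, log_k, 1)[0]
    return slope / q

rng = np.random.default_rng(0)
bm = np.cumsum(rng.standard_normal(20_000))   # Brownian motion, H = 0.5
white = rng.standard_normal(20_000)           # white noise, H = 0
print(round(ghe(bm, q=2), 2))                 # close to 0.5
```

Varying q probes short versus long variations: a q-dependent H(q) is the multifractality that the study compares between the two EEG classes.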
ERIC Educational Resources Information Center
Lee, Young-Sun; Wollack, James A.; Douglas, Jeffrey
2009-01-01
The purpose of this study was to assess the model fit of a 2PL through comparison with the nonparametric item characteristic curve (ICC) estimation procedures. Results indicate that three nonparametric procedures implemented produced ICCs that are similar to that of the 2PL for items simulated to fit the 2PL. However for misfitting items,…
Fundamentals of Research Data and Variables: The Devil Is in the Details.
Vetter, Thomas R
2017-10-01
Designing, conducting, analyzing, reporting, and interpreting the findings of a research study require an understanding of the types and characteristics of data and variables. Descriptive statistics are typically used simply to calculate, describe, and summarize the collected research data in a logical, meaningful, and efficient way. Inferential statistics allow researchers to make a valid estimate of the association between an intervention and the treatment effect in a specific population, based upon their randomly collected, representative sample data. Categorical data can be either dichotomous or polytomous. Dichotomous data have only 2 categories, and thus are considered binary. Polytomous data have more than 2 categories. Unlike dichotomous and polytomous data, ordinal data are rank ordered, typically based on a numerical scale that is comprised of a small set of discrete classes or integers. Continuous data are measured on a continuum and can have any numeric value over this continuous range. Continuous data can be meaningfully divided into smaller and smaller or finer and finer increments, depending upon the precision of the measurement instrument. Interval data are a form of continuous data in which equal intervals represent equal differences in the property being measured. Ratio data are another form of continuous data, which have the same properties as interval data, plus a true definition of an absolute zero point, and the ratios of the values on the measurement scale make sense. The normal (Gaussian) distribution ("bell-shaped curve") is one of the most common statistical distributions. Many applied inferential statistical tests are predicated on the assumption that the analyzed data follow a normal distribution. The histogram and the Q-Q plot are 2 graphical methods to assess if a set of data have a normal distribution (display "normality"). 
The Shapiro-Wilk test and the Kolmogorov-Smirnov test are 2 well-known and historically widely applied quantitative methods to assess for data normality. Parametric statistical tests make certain assumptions about the characteristics and/or parameters of the underlying population distribution upon which the test is based, whereas nonparametric tests make fewer or less rigorous assumptions. If the normality test concludes that the study data deviate significantly from a Gaussian distribution, rather than applying a less robust nonparametric test, the problem can potentially be remedied by judiciously and openly: (1) performing a data transformation of all the data values; or (2) eliminating any obvious data outlier(s).
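The Shapiro-Wilk and Kolmogorov-Smirnov tests mentioned above both compare a sample against a reference distribution. As a rough illustration (not the full test, which also requires critical values or p-values), the one-sample Kolmogorov-Smirnov distance against a fitted normal can be computed with nothing but the Python standard library; the simulated samples below are of course hypothetical:

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic_normal(data):
    """One-sample Kolmogorov-Smirnov distance between the empirical CDF
    and a normal distribution fitted by sample mean and SD."""
    n = len(data)
    mu = sum(data) / n
    sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / (n - 1))
    xs = sorted(data)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; take the worse gap.
        d = max(d, abs((i + 1) / n - f), abs(i / n - f))
    return d

random.seed(42)
gaussian = [random.gauss(0.0, 1.0) for _ in range(500)]
skewed = [random.expovariate(1.0) for _ in range(500)]
d_gauss = ks_statistic_normal(gaussian)
d_skew = ks_statistic_normal(skewed)
```

A right-skewed (exponential) sample yields a visibly larger distance than a truly Gaussian one, which is exactly the signal the formal tests quantify.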
A PDF-based classification of gait cadence patterns in patients with amyotrophic lateral sclerosis.
Wu, Yunfeng; Ng, Sin Chun
2010-01-01
Amyotrophic lateral sclerosis (ALS) is a neurological disease caused by the degeneration of motor neurons. Over the course of this progressive disease, ALS patients find it increasingly difficult to regulate normal locomotion, and gait stability becomes perturbed. This paper presents a pilot statistical study of gait cadence (stride interval) in ALS. The probability density functions (PDFs) of stride interval were first estimated with the nonparametric Parzen-window method. We computed the mean of the left-foot stride interval and the modified Kullback-Leibler divergence (MKLD) from the estimated PDFs. The analysis results suggested that both of these statistical parameters were significantly altered in ALS, and that a least-squares support vector machine (LS-SVM) can effectively distinguish the stride patterns of ALS patients from those of healthy controls, with an accuracy of 82.8% and an area of 0.87 under the receiver operating characteristic curve.
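The Parzen-window step in the abstract above is a standard Gaussian kernel density estimate. A minimal stdlib-Python sketch, with made-up stride-interval values standing in for real gait recordings (the bandwidth choice is an illustrative assumption):

```python
import math

def parzen_pdf(samples, h):
    """Return a Gaussian Parzen-window density estimate with bandwidth h."""
    n = len(samples)
    c = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def pdf(x):
        # Average of Gaussian kernels centered at each sample point.
        return c * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)
    return pdf

# Hypothetical stride-interval samples (seconds); real data would come
# from gait recordings as in the study.
strides = [1.05, 1.10, 1.08, 1.21, 1.12, 1.30, 1.07, 1.15]
pdf = parzen_pdf(strides, h=0.05)

# The estimate should integrate to ~1 over a wide enough grid.
grid = [0.5 + 0.001 * i for i in range(1500)]
area = sum(pdf(x) * 0.001 for x in grid)
```

Divergence measures such as the MKLD used in the paper can then be computed numerically between two such estimated PDFs.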
Statistical Computations Underlying the Dynamics of Memory Updating
Gershman, Samuel J.; Radulescu, Angela; Norman, Kenneth A.; Niv, Yael
2014-01-01
Psychophysical and neurophysiological studies have suggested that memory is not simply a carbon copy of our experience: Memories are modified or new memories are formed depending on the dynamic structure of our experience, and specifically, on how gradually or abruptly the world changes. We present a statistical theory of memory formation in a dynamic environment, based on a nonparametric generalization of the switching Kalman filter. We show that this theory can qualitatively account for several psychophysical and neural phenomena, and present results of a new visual memory experiment aimed at testing the theory directly. Our experimental findings suggest that humans can use temporal discontinuities in the structure of the environment to determine when to form new memory traces. The statistical perspective we offer provides a coherent account of the conditions under which new experience is integrated into an old memory versus forming a new memory, and shows that memory formation depends on inferences about the underlying structure of our experience. PMID:25375816
Changing world extreme temperature statistics
NASA Astrophysics Data System (ADS)
Finkel, J. M.; Katz, J. I.
2018-04-01
We use the Global Historical Climatology Network--daily database to calculate a nonparametric statistic that describes the rate at which all-time daily high and low temperature records have been set in nine geographic regions (continents or major portions of continents) during periods mostly from the mid-20th Century to the present. This statistic was defined in our earlier work on temperature records in the 48 contiguous United States. In contrast to this earlier work, we find that in every region except North America all-time high records were set at a rate significantly (at least $3\\sigma$) higher than in the null hypothesis of a stationary climate. Except in Antarctica, all-time low records were set at a rate significantly lower than in the null hypothesis. In Europe, North Africa and North Asia the rate of setting new all-time highs increased suddenly in the 1990's, suggesting a change in regional climate regime; in most other regions there was a steadier increase.
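The null hypothesis of a stationary climate has a classical consequence that gives intuition for this kind of record statistic: in an i.i.d. sequence of length n, the expected number of all-time records is the harmonic number H_n = 1 + 1/2 + … + 1/n. A small simulation (illustrative only, not the authors' actual statistic) confirms this:

```python
import random

def count_records(series):
    """Number of times a new all-time high is set in a sequence."""
    best = float("-inf")
    records = 0
    for x in series:
        if x > best:
            best = x
            records += 1
    return records

random.seed(7)
n, trials = 100, 2000
mean_records = sum(count_records([random.random() for _ in range(n)])
                   for _ in range(trials)) / trials
harmonic = sum(1.0 / k for k in range(1, n + 1))  # ~5.19 for n = 100
```

Observed record rates significantly above this stationary baseline are what flag a warming regional climate in the study.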
Ye, Xin; Pendyala, Ram M.; Zou, Yajie
2017-01-01
A semi-nonparametric generalized multinomial logit model, formulated using orthonormal Legendre polynomials to extend the standard Gumbel distribution, is presented in this paper. The resulting semi-nonparametric function can represent a probability density function for a large family of multimodal distributions. The model has a closed-form log-likelihood function that facilitates model estimation. The proposed method is applied to model commute mode choice among four alternatives (auto, transit, bicycle and walk) using travel behavior data from Aargau, Switzerland. Comparisons between the multinomial logit model and the proposed semi-nonparametric model show that violations of the standard Gumbel distribution assumption lead to considerable inconsistency in parameter estimates and model inferences. PMID:29073152
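The orthonormal Legendre polynomials used to extend the Gumbel distribution are the ordinary Legendre polynomials rescaled to unit norm on [-1, 1]. A quick numerical check of that orthonormality (a sketch unrelated to the authors' estimation code):

```python
import math

def legendre(k, x):
    """Legendre polynomial P_k(x) via the Bonnet recurrence:
    j*P_j = (2j-1)*x*P_{j-1} - (j-1)*P_{j-2}."""
    p0, p1 = 1.0, x
    if k == 0:
        return p0
    for j in range(2, k + 1):
        p0, p1 = p1, ((2 * j - 1) * x * p1 - (j - 1) * p0) / j
    return p1

def orthonormal_legendre(k, x):
    """P_k scaled so the family is orthonormal on [-1, 1]."""
    return math.sqrt((2 * k + 1) / 2.0) * legendre(k, x)

def inner(f, g, steps=20000):
    """Midpoint-rule approximation of the L2 inner product on [-1, 1]."""
    h = 2.0 / steps
    return sum(f(-1.0 + (i + 0.5) * h) * g(-1.0 + (i + 0.5) * h) * h
               for i in range(steps))

i02 = inner(lambda x: orthonormal_legendre(0, x),
            lambda x: orthonormal_legendre(2, x))
i22 = inner(lambda x: orthonormal_legendre(2, x),
            lambda x: orthonormal_legendre(2, x))
```

Distinct-order products integrate to zero and same-order products to one, which is what makes the polynomial expansion of the error density well behaved.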
Mathematical models for nonparametric inferences from line transect data
Burnham, K.P.; Anderson, D.R.
1976-01-01
A general mathematical theory of line transects is developed which supplies a framework for nonparametric density estimation based on either right-angle or sighting distances. The probability of observing a point given its right-angle distance (y) from the line is generalized to an arbitrary function g(y). Given only that g(0) = 1, it is shown that there are nonparametric approaches to density estimation using the observed right-angle distances. The model is then generalized to include sighting distances (r). Let f(y|r) be the conditional distribution of right-angle distance given sighting distance. It is shown that nonparametric estimation based only on sighting distances requires that we know the transformation of r given by f(0|r).
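In this framework the standard perpendicular-distance estimator takes the form D = n·f(0)/(2L), where f(0) is the right-angle distance density at zero and L is the transect length. A hedged stdlib-Python sketch with invented distances; the Gaussian kernel and its bandwidth are illustrative assumptions, not the paper's estimator:

```python
import math

def f0_estimate(distances, h):
    """Kernel estimate of the right-angle distance density at zero.
    The kernel is reflected about zero since distances are nonnegative,
    which doubles each kernel's contribution at the boundary."""
    n = len(distances)
    return (2.0 / (n * h * math.sqrt(2.0 * math.pi))) * sum(
        math.exp(-0.5 * (y / h) ** 2) for y in distances)

def transect_density(distances, line_length, h):
    """Nonparametric line-transect density estimate D = n * f(0) / (2L)."""
    n = len(distances)
    return n * f0_estimate(distances, h) / (2.0 * line_length)

# Hypothetical right-angle distances (metres) along a 1 km transect line.
y = [2.0, 5.5, 1.0, 8.0, 3.2, 0.5, 6.1, 4.4, 2.7, 9.3]
d_hat = transect_density(y, line_length=1000.0, h=4.0)  # animals per m^2
```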
Evaluating the statistical methodology of randomized trials on dentin hypersensitivity management.
Matranga, Domenica; Matera, Federico; Pizzo, Giuseppe
2017-12-27
The present study aimed to evaluate the characteristics and quality of the statistical methodology used in clinical studies on dentin hypersensitivity management. An electronic search was performed for data published from 2009 to 2014 by using the PubMed, Ovid/MEDLINE, and Cochrane Library databases. The primary search terms were used in combination. Eligibility criteria included randomized clinical trials that evaluated the efficacy of desensitizing agents in terms of reducing dentin hypersensitivity. A total of 40 studies were considered eligible for assessment of the quality of their statistical methodology. The four main concerns identified were: i) use of nonparametric tests in the presence of large samples, coupled with a lack of information about normality and equality of variances of the response; ii) lack of P-value adjustment for multiple comparisons; iii) failure to account for interactions between treatment and follow-up time; and iv) no information about the number of teeth examined per patient, and the consequent lack of a cluster-specific approach in data analysis. Owing to these concerns, the statistical methodology was judged as inappropriate in 77.1% of the 35 studies that used parametric methods. Additional studies with appropriate statistical analysis are required to obtain an appropriate assessment of the efficacy of desensitizing agents.
Mapping Quantitative Traits in Unselected Families: Algorithms and Examples
Dupuis, Josée; Shi, Jianxin; Manning, Alisa K.; Benjamin, Emelia J.; Meigs, James B.; Cupples, L. Adrienne; Siegmund, David
2009-01-01
Linkage analysis has been widely used to identify from family data genetic variants influencing quantitative traits. Common approaches have both strengths and limitations. Likelihood ratio tests typically computed in variance component analysis can accommodate large families but are highly sensitive to departure from normality assumptions. Regression-based approaches are more robust but their use has primarily been restricted to nuclear families. In this paper, we develop methods for mapping quantitative traits in moderately large pedigrees. Our methods are based on the score statistic which in contrast to the likelihood ratio statistic, can use nonparametric estimators of variability to achieve robustness of the false positive rate against departures from the hypothesized phenotypic model. Because the score statistic is easier to calculate than the likelihood ratio statistic, our basic mapping methods utilize relatively simple computer code that performs statistical analysis on output from any program that computes estimates of identity-by-descent. This simplicity also permits development and evaluation of methods to deal with multivariate and ordinal phenotypes, and with gene-gene and gene-environment interaction. We demonstrate our methods on simulated data and on fasting insulin, a quantitative trait measured in the Framingham Heart Study. PMID:19278016
The analysis of professional competencies of a lecturer in adult education.
Žeravíková, Iveta; Tirpáková, Anna; Markechová, Dagmar
2015-01-01
In this article, we present an andragogical research project and the evaluation of its results using nonparametric statistical methods and the semantic differential method. The research was carried out in 2012-2013 as part of the dissertation of I. Žeravíková, Analysis of professional competencies of lecturer and creating his competence profile (Žeravíková 2013). Its purpose was, based on an analysis of the work activities of a lecturer, to identify the lecturer's most important professional competencies and to propose a competence profile of a lecturer in adult education.
Surface Estimation, Variable Selection, and the Nonparametric Oracle Property.
Storlie, Curtis B; Bondell, Howard D; Reich, Brian J; Zhang, Hao Helen
2011-04-01
Variable selection for multivariate nonparametric regression is an important, yet challenging, problem due, in part, to the infinite dimensionality of the function space. An ideal selection procedure should be automatic, stable, easy to use, and have desirable asymptotic properties. In particular, we define a selection procedure to be nonparametric oracle (np-oracle) if it consistently selects the correct subset of predictors and at the same time estimates the smooth surface at the optimal nonparametric rate, as the sample size goes to infinity. In this paper, we propose a model selection procedure for nonparametric models, and explore the conditions under which the new method enjoys the aforementioned properties. Developed in the framework of smoothing spline ANOVA, our estimator is obtained via solving a regularization problem with a novel adaptive penalty on the sum of functional component norms. Theoretical properties of the new estimator are established. Additionally, numerous simulated and real examples further demonstrate that the new approach substantially outperforms other existing methods in the finite sample setting.
Privacy-preserving Kruskal-Wallis test.
Guo, Suxin; Zhong, Sheng; Zhang, Aidong
2013-10-01
Statistical tests are powerful tools for data analysis. The Kruskal-Wallis test is a non-parametric statistical test that evaluates whether two or more samples are drawn from the same distribution. It is commonly used in various areas. But sometimes the use of the method is impeded by privacy issues raised in fields such as biomedical research and clinical data analysis, because of the confidential information contained in the data. In this work, we give a privacy-preserving solution for the Kruskal-Wallis test which enables two or more parties to jointly perform the test on the union of their data without compromising their data privacy. To the best of our knowledge, this is the first work that solves the privacy issues in the use of the Kruskal-Wallis test on distributed data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
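For reference, the Kruskal-Wallis H statistic itself (without the paper's privacy-preserving protocol, and ignoring the tie correction for brevity) is straightforward to compute:

```python
def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (no tie correction, for brevity):
    H = 12/(N(N+1)) * sum(R_i^2 / n_i) - 3(N+1)."""
    pooled = sorted((x, gi) for gi, g in enumerate(groups) for x in g)
    n_total = len(pooled)
    # Rank the pooled sample (1-based); ties are ignored in this sketch.
    rank_sums = [0.0] * len(groups)
    for rank, (_, gi) in enumerate(pooled, start=1):
        rank_sums[gi] += rank
    h = (12.0 / (n_total * (n_total + 1))) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups))
    return h - 3.0 * (n_total + 1)

# Three clearly shifted samples (invented): with df = 2, H > 5.99 rejects
# equality of distributions at the 5% level under the chi-square approximation.
a = [1.1, 1.3, 1.2, 1.4, 1.0]
b = [2.1, 2.4, 2.2, 2.0, 2.3]
c = [3.3, 3.1, 3.4, 3.0, 3.2]
h = kruskal_wallis_h(a, b, c)
```

With the fully separated groups above, the rank sums are 15, 40 and 65, so H works out to exactly 12.5, comfortably past the 5% cutoff.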
Characterizing chaotic melodies in automatic music composition
NASA Astrophysics Data System (ADS)
Coca, Andrés E.; Tost, Gerard O.; Zhao, Liang
2010-09-01
In this paper, we initially present an algorithm for automatic composition of melodies using chaotic dynamical systems. Afterward, we characterize chaotic music in a comprehensive way as comprising three perspectives: musical discrimination, dynamical influence on musical features, and musical perception. With respect to the first perspective, the coherence between generated chaotic melodies (continuous as well as discrete chaotic melodies) and a set of classical reference melodies is characterized by statistical descriptors and melodic measures. The significant differences among the three types of melodies are determined by discriminant analysis. Regarding the second perspective, the influence of dynamical features of chaotic attractors, e.g., Lyapunov exponent, Hurst coefficient, and correlation dimension, on melodic features is determined by canonical correlation analysis. The last perspective is related to perception of originality, complexity, and degree of melodiousness (Euler's gradus suavitatis) of chaotic and classical melodies by nonparametric statistical tests.
On an Additive Semigraphoid Model for Statistical Networks With Application to Pathway Analysis.
Li, Bing; Chun, Hyonho; Zhao, Hongyu
2014-09-01
We introduce a nonparametric method for estimating non-gaussian graphical models based on a new statistical relation called additive conditional independence, which is a three-way relation among random vectors that resembles the logical structure of conditional independence. Additive conditional independence allows us to use one-dimensional kernel regardless of the dimension of the graph, which not only avoids the curse of dimensionality but also simplifies computation. It also gives rise to a parallel structure to the gaussian graphical model that replaces the precision matrix by an additive precision operator. The estimators derived from additive conditional independence cover the recently introduced nonparanormal graphical model as a special case, but outperform it when the gaussian copula assumption is violated. We compare the new method with existing ones by simulations and in genetic pathway analysis.
Clinical competence of Guatemalan and Mexican physicians for family dysfunction management.
Cabrera-Pivaral, Carlos Enrique; Orozco-Valerio, María de Jesús; Celis-de la Rosa, Alfredo; Covarrubias-Bermúdez, María de Los Ángeles; Zavala-González, Marco Antonio
2017-01-01
To evaluate the clinical competence of Mexican and Guatemalan physicians in managing family dysfunction. Comparative study in four first-level care units in Guadalajara, Mexico, and four in Guatemala City, Guatemala, based on purposeful sampling and involving 117 and 100 physicians, respectively. Clinical competence was evaluated with a validated instrument comprising 187 items. Nonparametric descriptive and inferential statistical analysis was performed. Among the Mexican physicians, 13.7% showed high clinical competence, 53% medium, 24.8% low, and 8.5% were defined by random. Among the Guatemalan physicians, 14% were high, 63% medium, and 23% defined by random. There were no statistically significant differences between the countries' healthcare units, but there was a difference between the medians of the Mexicans (0.55) and the Guatemalans (0.55) (p = 0.02). The proportion of Mexican physicians with high clinical competence was similar to that of the Guatemalans.
Nonparametric analysis of Minnesota spruce and aspen tree data and LANDSAT data
NASA Technical Reports Server (NTRS)
Scott, D. W.; Jee, R.
1984-01-01
The application of nonparametric methods in data-intensive problems faced by NASA is described. The theoretical development of efficient multivariate density estimators and the novel use of color graphics workstations are reviewed. The use of nonparametric density estimates for data representation and for Bayesian classification are described and illustrated. Progress in building a data analysis system in a workstation environment is reviewed and preliminary runs presented.
ERIC Educational Resources Information Center
Sueiro, Manuel J.; Abad, Francisco J.
2011-01-01
The distance between nonparametric and parametric item characteristic curves has been proposed as an index of goodness of fit in item response theory in the form of a root integrated squared error index. This article proposes to use the posterior distribution of the latent trait as the nonparametric model and compares the performance of an index…
Narayanan, Roshni; Nugent, Rebecca; Nugent, Kenneth
2015-10-01
Accreditation Council for Graduate Medical Education guidelines require internal medicine residents to develop skills in the interpretation of medical literature and to understand the principles of research. A necessary component is the ability to understand the statistical methods used and their results, material that is not an in-depth focus of most medical school curricula and residency programs. Given the breadth and depth of the current medical literature and an increasing emphasis on complex, sophisticated statistical analyses, the statistical foundation and education necessary for residents are uncertain. We reviewed the statistical methods and terms used in 49 articles discussed at the journal club in the Department of Internal Medicine residency program at Texas Tech University between January 1, 2013 and June 30, 2013. We collected information on the study type and on the statistical methods used for summarizing and comparing samples, determining the relations between independent variables and dependent variables, and estimating models. We then identified the typical statistics education level at which each term or method is learned. A total of 14 articles came from the Journal of the American Medical Association Internal Medicine, 11 from the New England Journal of Medicine, 6 from the Annals of Internal Medicine, 5 from the Journal of the American Medical Association, and 13 from other journals. Twenty reported randomized controlled trials. Summary statistics included mean values (39 articles), category counts (38), and medians (28). Group comparisons were based on t tests (14 articles), χ2 tests (21), and nonparametric ranking tests (10). The relations between dependent and independent variables were analyzed with simple regression (6 articles), multivariate regression (11), and logistic regression (8). Nine studies reported odds ratios with 95% confidence intervals, and seven analyzed test performance using sensitivity and specificity calculations. 
These papers used 128 statistical terms and context-defined concepts, including some from data analysis (56), epidemiology-biostatistics (31), modeling (24), data collection (12), and meta-analysis (5). Ten different software programs were used in these articles. Based on usual undergraduate and graduate statistics curricula, 64.3% of the concepts and methods used in these papers required at least a master's degree-level statistics education. The interpretation of the current medical literature can require an extensive background in statistical methods at an education level exceeding the material and resources provided to most medical students and residents. Given the complexity and time pressure of medical education, these deficiencies will be hard to correct, but this project can serve as a basis for developing a curriculum in study design and statistical methods needed by physicians-in-training.
Pfeifer, Roman; Schick, Sylvia; Holzmann, Christopher; Graw, Matthias; Teuben, Michel; Pape, Hans-Christoph
2017-12-01
Despite improvements in prevention and rescue, mortality rates after severe blunt trauma continue to be a problem. The present study analyses mortality patterns in a representative blunt trauma population, specifically the influence of demographics, injury pattern, and the location and timing of death. Patients who died between 1 January 2004 and 31 December 2005 were subjected to a standardised autopsy. Inclusion criteria: death from blunt trauma due to road traffic injuries (Injury Severity Score ≥ 16), patients from a defined geographical area, and death on scene or in hospital. Exclusion criteria: suicide, homicide, penetrating trauma and monotrauma, including isolated head injury. Statistical analyses included Student's t test (parametric), the Mann-Whitney U test (nonparametric) or the Chi-square test. A total of 277 consecutive injured patients were included in this study (mean age 46.1 ± 23 years; 67.5% males), 40.5% of whom had an ISS of 75. A unimodal distribution of mortality was observed in blunt trauma patients. The most frequently and most severely injured body regions were the head (38.6%), chest (26.7%), or both head and chest (11.0%). The cumulative analysis of mortality showed that several factors, such as injury pattern and the regional location of collisions, also affected the pattern of mortality. The majority of patients died on scene from severe head and thoracic injuries. A homogeneous distribution of death was observed after an initial peak of death on scene. Moreover, several factors such as injury pattern and regional location of collisions may also affect the pattern of mortality.
Snyder, Marcía N; Henderson, W Matthew; Glinski, Donna A; Purucker, S Thomas
2017-01-01
The objective of the current study was to use a biomarker-based approach to investigate the influence of atrazine exposure on American toad (Anaxyrus americanus) and grey tree frog (Hyla versicolor) tadpoles. Atrazine is one of the most frequently detected herbicides in environmental matrices throughout the United States. In surface waters, it has been found at concentrations from 0.04-2859μg/L and thus presents a likely exposure scenario for non-target species such as amphibians. Studies have examined the effect of atrazine on the metamorphic parameters of amphibians, however, the data are often contradictory. Gosner stage 22-24 tadpoles were exposed to 0 (control), 10, 50, 250 or 1250μg/L of atrazine for 48h. Endogenous polar metabolites were extracted and analyzed using gas chromatography coupled with mass spectrometry. Statistical analyses of the acquired spectra with machine learning classification models demonstrated identifiable changes in the metabolomic profiles between exposed and control tadpoles. Support vector machine models with recursive feature elimination created a more efficient, non-parametric data analysis and increased interpretability of metabolomic profiles. Biochemical fluxes observed in the exposed groups of both A. americanus and H. versicolor displayed perturbations in a number of classes of biological macromolecules including fatty acids, amino acids, purine nucleosides, pyrimidines, and mono- and di-saccharides. Metabolomic pathway analyses are consistent with findings of other studies demonstrating disruption of amino acid and energy metabolism from atrazine exposure to non-target species. Copyright © 2016. Published by Elsevier B.V.
Parametric modelling of cost data in medical studies.
Nixon, R M; Thompson, S G
2004-04-30
The cost of medical resources used is often recorded for each patient in clinical studies in order to inform decision-making. Although cost data are generally skewed to the right, interest is in making inferences about the population mean cost. Common methods for non-normal data, such as data transformation, assuming asymptotic normality of the sample mean or non-parametric bootstrapping, are not ideal. This paper describes possible parametric models for analysing cost data. Four example data sets are considered, which have different sample sizes and degrees of skewness. Normal, gamma, log-normal, and log-logistic distributions are fitted, together with three-parameter versions of the latter three distributions. Maximum likelihood estimates of the population mean are found; confidence intervals are derived by a parametric BC(a) bootstrap and checked by MCMC methods. Differences between model fits and inferences are explored. Skewed parametric distributions fit cost data better than the normal distribution, and should in principle be preferred for estimating the population mean cost. However for some data sets, we find that models that fit badly can give similar inferences to those that fit well. Conversely, particularly when sample sizes are not large, different parametric models that fit the data equally well can lead to substantially different inferences. We conclude that inferences are sensitive to choice of statistical model, which itself can remain uncertain unless there is enough data to model the tail of the distribution accurately. Investigating the sensitivity of conclusions to choice of model should thus be an essential component of analysing cost data in practice. Copyright 2004 John Wiley & Sons, Ltd.
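As an illustration of the parametric approach described above, the following sketch fits a log-normal model by maximum likelihood and attaches a bootstrap interval to the estimated mean cost. The cost values are invented, and a simple percentile bootstrap stands in for the paper's BCa interval:

```python
import math
import random

def lognormal_mean_mle(costs):
    """MLE of the population mean under a log-normal model:
    exp(mu + sigma^2 / 2), with mu and sigma^2 estimated from log-costs."""
    logs = [math.log(c) for c in costs]
    n = len(logs)
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return math.exp(mu + var / 2.0)

def bootstrap_ci(costs, estimator, reps=2000, alpha=0.05):
    """Percentile bootstrap CI (simpler than the BCa interval in the paper)."""
    stats = sorted(estimator(random.choices(costs, k=len(costs)))
                   for _ in range(reps))
    lo = stats[int(reps * alpha / 2)]
    hi = stats[int(reps * (1 - alpha / 2)) - 1]
    return lo, hi

random.seed(1)
# Hypothetical right-skewed per-patient costs.
costs = [120, 340, 95, 880, 210, 150, 2600, 430, 310, 175,
         260, 1400, 90, 520, 385, 240, 700, 130, 1100, 460]
mean_hat = lognormal_mean_mle(costs)
ci_lo, ci_hi = bootstrap_ci(costs, lognormal_mean_mle)
```

Refitting the same data under a gamma or log-logistic model and comparing the resulting means is exactly the model-sensitivity check the paper recommends.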
McCurtin, Arlene; Healy, Chiara
2017-02-01
Speech-language pathologists (SLPs) are assumed to use evidence-based practice to inform treatment decisions. However, the reasoning underpinning treatment selections is not well known. Understanding why SLPs choose the treatments they do may be clarified by exploring the reasoning tied to specific treatments such as dysphagia interventions. An electronic survey methodology was utilised. Participants were accessed via the gatekeepers of two national dysphagia special interest groups representing adult and paediatric populations. Information was elicited on the dysphagia therapies and techniques used and on the reasoning for using/not using therapies. Data were analysed using descriptive and non-parametric statistics. The survey had a 74.8% response rate (n = 116). Consensus in both treatment selections and in the reasoning supporting treatment decisions was evident. Three favoured interventions (texture modification, thickening liquids, positioning changes) were identified. The reasoning supporting treatment choices centred primarily on client suitability and clinician knowledge. Knowledge reflected both absent knowledge (e.g. training) and accumulated knowledge (clinical experience). Dysphagia practice appears highly defined, being characterised by group consensus regarding both preferred treatments and the reasoning underpinning treatment selections. Treatment selections are based on two core criteria: client suitability and the SLP's experience/knowledge. Explicit scientific reasoning is less influential than practice-centric influences.
Ajzenberg, Henry; Newman, Paula; Harris, Gail-Anne; Cranston, Marnie; Boyd, J Gordon
2018-02-01
To reduce medication turnaround times during neurological emergencies, a multidisciplinary team developed a neurological emergency crash trolley in our intensive care unit. This trolley includes phenytoin, hypertonic saline and mannitol, as well as other equipment. The aim of this study was to assess whether the trolley reduced turnaround times for these medications. In this retrospective cohort study, medication delivery times for two-year epochs before and after its implementation were compared. Eligible patients were identified from our intensive care unit screening log. Adults who required emergent use of phenytoin, hypertonic saline or mannitol while in the intensive care unit were included. Groups were compared with nonparametric analyses. Setting: a 33-bed general medical-surgical intensive care unit in an academic teaching hospital. Outcome measure: time to medication administration. In the pre-intervention group, there were 43 patients with 66 events. In the post-intervention group, there were 45 patients with 80 events. The median medication turnaround time was significantly reduced after implementation of the neurological emergency trolley (25 vs. 10 minutes, p=0.003). There was no statistically significant difference in intensive care or 30-day survival between the two cohorts. The implementation of a novel neurological emergency crash trolley in our intensive care unit reduced medication turnaround times. Copyright © 2017 Elsevier Ltd. All rights reserved.
Santana-Sagredo, Francisca; Uribe, Mauricio; Herrera, María José; Retamal, Rodrigo; Flores, Sergio
2015-12-01
The goal of this research is to understand the relevance of dietary diversity during the transition to agriculture in ancient populations from northern Chile, especially considering the significance of marine resources and, to a lesser degree, crops. A total of 14 human individuals were sampled from the Tarapacá 40 cemetery. Both bone and tooth samples were collected. Bone and dentine collagen were studied for carbon and nitrogen isotopic analysis, and bone and enamel apatite for carbon isotope analysis. Inferential statistical analyses were performed to compare Tarapacá 40 stable carbon and nitrogen isotope values with those of other Formative and Late Intermediate Period groups. A nonparametric Kruskal-Wallis test was used. The results show that the individuals from Tarapacá 40 are intermediate to the values observed for terrestrial and marine fauna as well as C3 and C4 plants. A gradual transition to crop consumption, especially maize, is suggested. This complemented the earlier hunter-gatherer tradition of marine resource and wild fruit consumption. Contrary to the predictions made by some archaeologists, the results obtained for northern Chile contrast with the classical perspective of a "Neolithic Revolution" in which the transition to agriculture occurred more abruptly and linearly. © 2015 Wiley Periodicals, Inc.
Potential impacts of climate change on water quality in a shallow reservoir in China.
Zhang, Chen; Lai, Shiyu; Gao, Xueping; Xu, Liping
2015-10-01
To study the potential effects of climate change on water quality in a shallow reservoir in China, field data collected over a given monitoring period were analysed. Nine water quality parameters (water temperature, ammonia nitrogen, nitrate nitrogen, nitrite nitrogen, total nitrogen, total phosphorus, chemical oxygen demand, biochemical oxygen demand and dissolved oxygen) and three climate indicators over 20 years (1992-2011) are considered. Certain water quality and climate parameters exhibit significant annual trends. Five parameters exhibit significant seasonal differences in monthly means between the two decades (1992-2001 and 2002-2011) of the monitoring period. Non-parametric regression is performed to explore potential key climate drivers of water quality in the reservoir. The results indicate that seasonal changes in temperature and rainfall may have positive impacts on water quality. However, an extremely cold spring and high wind speeds are likely to affect the self-stabilising equilibrium states of the reservoir, which requires attention in the future. The results also suggest that land use changes have an important impact on the nitrogen load. This study provides useful information regarding the potential effects of climate change on water quality in developing countries.
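Non-parametric regression, as used here to relate climate drivers to water quality, fits a smooth curve without assuming a functional form. One classic estimator is Nadaraya-Watson kernel regression, a locally weighted mean. A self-contained sketch; the temperature/dissolved-oxygen relationship below is a simplified illustrative assumption, not the study's data:

```python
import numpy as np

def kernel_regression(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimator with a Gaussian kernel: each prediction
    is a weighted mean of y_train, weighting nearby x_train points more."""
    w = np.exp(-0.5 * ((x_eval[:, None] - x_train[None, :]) / bandwidth) ** 2)
    return (w * y_train).sum(axis=1) / w.sum(axis=1)

rng = np.random.default_rng(2)
# Hypothetical 20 years of monthly observations: dissolved oxygen (mg/L)
# tends to fall as water temperature (deg C) rises
temp = rng.uniform(0.0, 30.0, 240)
do = 12.0 - 0.2 * temp + rng.normal(0.0, 0.5, 240)

grid = np.linspace(5.0, 25.0, 5)
fit = kernel_regression(temp, do, grid, bandwidth=2.0)
print(dict(zip(grid.round(1), fit.round(2))))
```

The bandwidth controls the bias-variance trade-off: small values track the data closely but are noisy, large values oversmooth; in practice it is chosen by cross-validation.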
Martin, Roy C; Okonkwo, Ozioma C; Hill, Joni; Griffith, H Randall; Triebel, Kristen; Bartolucci, Alfred; Nicholas, Anthony P; Watts, Ray L; Stover, Natividad; Harrell, Lindy E; Clark, David; Marson, Daniel C
2008-10-15
Little is currently known about the higher-order functional skills of patients with Parkinson's disease (PD) and cognitive impairment. Medical decision-making capacity (MDC) was assessed in patients with PD with cognitive impairment and dementia. Participants were 16 patients with PD and cognitive impairment without dementia (PD-CIND), 16 patients with PD dementia (PDD), and 22 healthy older adults. All participants were administered the Capacity to Consent to Treatment Instrument (CCTI), a standardized capacity instrument assessing MDC under five different consent standards. Parametric and nonparametric statistical analyses were used to examine capacity performance on the consent standards. In addition, capacity outcomes (capable, marginally capable, or incapable) on the standards were identified for the two patient groups. Relative to controls, PD-CIND patients demonstrated significant impairment on the understanding treatment consent standard, clinically the most stringent CCTI standard. Relative to controls and PD-CIND patients, PDD patients were impaired on the three clinical standards of understanding, reasoning, and appreciation. The findings suggest that impairment in decisional capacity is already present in cognitively impaired patients with PD without dementia and increases as these patients develop dementia. Clinicians and researchers should carefully assess decisional capacity in all patients with PD with cognitive impairment. (c) 2008 Movement Disorder Society.
Monezi, Lucas Antônio; Magalhães, Thiago Pinguelli; Morato, Márcio Pereira; Mercadante, Luciano Allegretti; Furtado, Otávio Luis Piva da Cunha; Misuta, Milton Shoiti
2018-03-26
In this study, we aimed to analyse goalball players' time-motion variables (distance covered, time spent, maximum and average velocities) in official goalball match attacks, taking into account the attack phases (preparation and throwing), player position (centres and wings) and throwing techniques (frontal, spin and between the legs). A total of 365 attacks were assessed using a video-based (2D) method with manual tracking in the Dvideo system. Inferential non-parametric statistics were applied to compare the preparation vs. throwing phases, wings vs. centres, and the frontal, spin and between-the-legs throwing techniques. Significant differences were found between the attack preparation and throwing phases for all player time-motion variables: distance covered, time spent, maximum player velocity and average player velocity. Wing players performed most of the throws (85%) and covered longer distances than centres (1.65 vs. 0.31 m). The between-the-legs and spin throwing techniques presented greater values for most of the time-motion variables (distance covered, time spent and maximum player velocity) than the frontal technique in both attack phases. These findings provide important information regarding players' movement patterns during goalball matches that can be used to plan more effective training.
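Since the preparation and throwing phases are measured on the same attacks, the natural nonparametric comparison is a paired one, such as the Wilcoxon signed-rank test. A sketch with SciPy; the per-attack distances below are hypothetical values chosen for illustration, not the study's measurements:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical distance covered (m) per attack in each phase; the throwing
# phase is simulated to cover about 0.5 m more on average (assumption)
n_attacks = 50
prep = rng.normal(1.2, 0.4, n_attacks).clip(min=0.1)
throw = prep + rng.normal(0.5, 0.3, n_attacks)

# Wilcoxon signed-rank test: ranks the paired differences, so each attack
# serves as its own control and between-attack variability cancels out
w_stat, p_value = stats.wilcoxon(prep, throw)
print(f"W={w_stat:.0f}, p={p_value:.2g}")
```

Pairing is what distinguishes this from the Mann-Whitney U test, which would wrongly treat the two phases as independent samples.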
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mizell, Steve A.; Shadel, Craig A.
Airborne particulates are collected at U.S. Department of Energy sites that exhibit radiological contamination on the soil surface to help assess the potential for wind to transport radionuclides from the contamination sites. Collecting these samples was originally accomplished by drawing air through a cellulose-fiber filter. These filters were replaced with glass-fiber filters in March 2011. Airborne particulates were collected side by side on the two filter materials between May 2013 and May 2014. Comparisons of the sample mass and the radioactivity determinations for the side-by-side samples were undertaken to determine whether the change in filter medium produced significantly different results. The differences in the results obtained using the two filter types were assessed visually, by evaluating time series and correlation plots, and statistically, by conducting a nonparametric matched-pair sign test. Generally, the glass-fiber filters collect larger samples of particulates and produce higher radioactivity values for the gross alpha, gross beta, and gamma spectroscopy analyses. However, the correlation between the radioanalytical results for the glass-fiber filters and the cellulose-fiber filters was not strong enough to generate a linear regression function to estimate the glass-fiber filter sample results from the cellulose-fiber filter sample results.
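The matched-pair sign test used here reduces each side-by-side pair to the sign of its difference and tests whether positive and negative signs are equally likely, via a binomial test. A sketch with SciPy on synthetic paired readings; the lognormal parameters are illustrative assumptions, not the monitoring data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
# Hypothetical paired weekly gross-beta results (arbitrary units); the
# glass-fiber reading is simulated to run ~15% higher (assumption)
cellulose = rng.lognormal(mean=0.0, sigma=0.3, size=52)
glass = cellulose * rng.lognormal(mean=0.15, sigma=0.1, size=52)

# Sign test: count pairs where glass > cellulose and test the count
# against a fair-coin null (equal medians) with an exact binomial test
diffs = glass - cellulose
n_pos = int((diffs > 0).sum())
n = int((diffs != 0).sum())       # ties carry no sign and are dropped
result = stats.binomtest(n_pos, n, p=0.5, alternative="two-sided")
print(f"{n_pos}/{n} pairs positive, p={result.pvalue:.2g}")
```

The sign test uses only direction, not magnitude, which makes it robust to the heavy-tailed values typical of radioactivity measurements, at the cost of some statistical power relative to the Wilcoxon signed-rank test.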
Intensive care nurses' knowledge of enteral nutrition: A descriptive questionnaire.
Morphet, Julia; Clarke, Angelique B; Bloomer, Melissa J
2016-12-01
Nurses have an important role in the delivery and management of enteral nutrition in critically ill patients, to prevent iatrogenic malnutrition, yet it is not clear how nurses source enteral nutrition information. This study aimed to explore Australian nurses' enteral nutrition knowledge and sources of information. Data were collected from members of the Australian College of Critical Care Nurses in May 2014 using an online questionnaire. A combination of descriptive statistics and non-parametric analyses was used to evaluate the quantitative data, and content analysis was used to evaluate the qualitative data. In total, 359 responses were included in the data analysis. All respondents were Registered Nurses with experience working in an Australian intensive care unit or high dependency unit. Most respondents reported their enteral nutrition knowledge was good (n=205, 60.1%) or excellent (n=35, 10.3%), but many lacked knowledge regarding the effect of malnutrition on patient outcomes. Dietitians and hospital protocols were the most valued sources of enteral nutrition information, but were not consistently utilised. Significant knowledge deficits in relation to enteral nutrition were identified. Dietitians were the preferred source of nurses' enteral nutrition information; however, their limited availability reduced their efficacy as an information resource. Educational opportunities for nurses need to be improved to enable appropriate nutritional care in critically ill patients. Copyright © 2016 Elsevier Ltd. All rights reserved.
Simulation-based sensitivity analysis for non-ignorably missing data.
Yin, Peng; Shi, Jian Q
2017-01-01
Sensitivity analysis is popular for dealing with missing data problems, particularly for non-ignorable missingness, where the full-likelihood method cannot be adopted. It analyses how sensitively the conclusions (output) depend on assumptions or parameters (input) about the missing data mechanism; we refer to models subject to this uncertainty as sensitivity models. To make conventional sensitivity analysis more useful in practice, we need simple, interpretable statistical quantities with which to assess sensitivity models and support evidence-based analysis. In this paper we propose a novel approach that investigates the plausibility of each missing data mechanism assumption by comparing datasets simulated from various MNAR models with the observed data non-parametrically, using K-nearest-neighbour distances. Some asymptotic theory is also provided. A key step of this method is to plug in a plausibility evaluation system for each sensitivity parameter, selecting plausible values and rejecting unlikely ones, instead of considering all proposed values of the sensitivity parameters as in the conventional sensitivity analysis method. The method is generic and has been applied successfully to several specific models in this paper, including a meta-analysis model with publication bias, analysis of incomplete longitudinal data, and mean estimation with non-ignorable missing data.
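The core idea of comparing simulated and observed datasets via K-nearest-neighbour distances can be sketched simply: datasets generated under a plausible sensitivity-parameter value should lie close to the observed data, and implausible values should not. The 1-D shift parameter and sample sizes below are hypothetical illustrations of this plug-in scoring step, not the paper's actual models:

```python
import numpy as np

def mean_knn_distance(simulated, observed, k=3):
    """Mean distance from each observed point to its k-th nearest
    simulated point; small values mean the simulated dataset
    resembles the observed one."""
    d = np.abs(observed[:, None] - simulated[None, :])  # pairwise 1-D distances
    kth = np.sort(d, axis=1)[:, k - 1]                  # k-th smallest per row
    return kth.mean()

rng = np.random.default_rng(5)
observed = rng.normal(0.0, 1.0, 200)

# Two candidate values of a hypothetical MNAR shift parameter delta,
# each generating its own simulated dataset
sim_plausible = rng.normal(0.0, 1.0, 200)   # delta = 0: matches the data
sim_unlikely  = rng.normal(3.0, 1.0, 200)   # delta = 3: clearly shifted

score_ok = mean_knn_distance(sim_plausible, observed)
score_bad = mean_knn_distance(sim_unlikely, observed)
print(f"delta=0 score={score_ok:.3f}, delta=3 score={score_bad:.3f}")
```

Ranking candidate sensitivity-parameter values by such a distance lets the analyst retain plausible values and discard unlikely ones, rather than sweeping the whole parameter range as in conventional sensitivity analysis.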
Adhesion of Candida albicans to Vanillin Incorporated Self-Curing Orthodontic PMMA Resin.
NASA Astrophysics Data System (ADS)
Zam, K.; Sawaengkit, P.; Thaweboon, S.; Thaweboon, B.
2018-02-01
It has been observed that there is an increase in Candida carriage during treatment with removable orthodontic appliances. Vanillin is a flavouring agent known to have antioxidant and antimicrobial properties. The aim of this study was to evaluate the effect of vanillin-incorporated PMMA on the adhesion of Candida albicans. A total of 36 orthodontic self-curing PMMA resin samples were fabricated. The samples were divided into 3 groups depending on the percentage of vanillin incorporated (0.1%, 0.5%, and PMMA without vanillin as control). PMMA samples were coated with saliva, and the adhesion assay was performed with C. albicans (ATCC 10231). The adherent yeast cells were stained with crystal violet and counted under a microscope in 3 randomly selected fields at 10X magnification. Statistical analyses were performed using the Kruskal-Wallis and Mann-Whitney non-parametric tests. The PMMA resin samples with vanillin incorporation significantly reduced the adhesion of C. albicans compared with the control group. This study indicates that vanillin-incorporated resin can reduce the adhesion of C. albicans by about 45-56%. With further testing and development, vanillin could be employed as an antifungal agent to prevent adhesion of C. albicans to orthodontic self-curing PMMA resin.
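The two-stage analysis described here (an omnibus Kruskal-Wallis test followed by pairwise Mann-Whitney comparisons) needs a multiple-comparison correction at the second stage. A sketch with SciPy using a Bonferroni adjustment; the Poisson cell counts below are illustrative assumptions scaled to the reported 45-56% reduction, not the study's counts:

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
# Hypothetical adherent-cell counts per microscope field (illustrative)
groups = {
    "control":       rng.poisson(100, 12),
    "vanillin_0.1%": rng.poisson(55, 12),   # ~45% reduction (assumed)
    "vanillin_0.5%": rng.poisson(44, 12),   # ~56% reduction (assumed)
}

# Stage 1: omnibus Kruskal-Wallis across the three groups
h, p_kw = stats.kruskal(*groups.values())

# Stage 2: pairwise Mann-Whitney follow-ups, Bonferroni-corrected by
# multiplying each p-value by the number of comparisons (capped at 1)
pairs = list(combinations(groups, 2))
pairwise = {
    (a, b): min(1.0, stats.mannwhitneyu(groups[a], groups[b]).pvalue * len(pairs))
    for a, b in pairs
}
print(f"Kruskal-Wallis p={p_kw:.2g}; pairwise (corrected): {pairwise}")
```

Running the pairwise tests only after a significant omnibus result, and correcting them, keeps the familywise error rate near its nominal level across the three comparisons.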