EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.
Tong, Xiaoxiao; Bentler, Peter M
2013-01-01
Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ2 test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
Chou, C P; Bentler, P M; Satorra, A
1991-11-01
Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.
Estimating the proportion of true null hypotheses when the statistics are discrete.
Dialsingh, Isaac; Austin, Stefanie R; Altman, Naomi S
2015-07-15
In high-dimensional testing problems, π0, the proportion of null hypotheses that are true, is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support, and the null distribution may depend on an ancillary statistic, such as a table margin, that varies among the test statistics. Methods for estimating π0 developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems. This article introduces several π0 estimators, the regression and 'T' methods, that perform well with discrete test statistics, and also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data. The methods are implemented in R.
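For context, the standard continuous-test approach that the abstract says can break down for discrete statistics is Storey's λ-based estimator, which assumes null P values are Uniform(0, 1) so that the fraction of P values above a threshold λ estimates π0(1 − λ). A minimal sketch (function name and data are illustrative, not from the article):

```python
import numpy as np

def storey_pi0(pvalues, lam=0.5):
    """Storey's lambda-based estimator of pi0, the proportion of true nulls.

    Assumes null p-values are Uniform(0, 1), so the count of p-values
    above lam estimates pi0 * (1 - lam) * m. This uniformity assumption
    fails for discrete test statistics, whose p-values have finite support,
    which is the problem the article's estimators address.
    """
    pvalues = np.asarray(pvalues)
    m = len(pvalues)
    return min(1.0, np.sum(pvalues > lam) / ((1.0 - lam) * m))

# Hypothetical mixture: 800 null p-values (uniform) and 200 non-null (skewed small)
rng = np.random.default_rng(0)
p = np.concatenate([rng.uniform(size=800), rng.beta(0.5, 10.0, size=200)])
print(round(storey_pi0(p), 2))  # close to the true pi0 of 0.8
```

With discrete tests the null P values pile up on a few support points, so the fraction above λ no longer estimates π0(1 − λ), which motivates the regression and 'T' methods the article introduces.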
A Model of Statistics Performance Based on Achievement Goal Theory.
ERIC Educational Resources Information Center
Bandalos, Deborah L.; Finney, Sara J.; Geske, Jenenne A.
2003-01-01
Tests a model of statistics performance based on achievement goal theory. Both learning and performance goals affected achievement indirectly through study strategies, self-efficacy, and test anxiety. Implications of these findings for teaching and learning statistics are discussed. (Contains 47 references, 3 tables, 3 figures, and 1 appendix.)…
Kiekkas, Panagiotis; Panagiotarou, Aliki; Malja, Alvaro; Tahirai, Daniela; Zykai, Rountina; Bakalis, Nick; Stefanopoulos, Nikolaos
2015-12-01
Although statistical knowledge and skills are necessary for promoting evidence-based practice, health sciences students have expressed anxiety about statistics courses, which may hinder their learning of statistical concepts. To evaluate the effects of a biostatistics course on nursing students' attitudes toward statistics and to explore the association between these attitudes and their performance in the course examination. One-group quasi-experimental pre-test/post-test design. Undergraduate nursing students of the fifth or higher semester of studies, who attended a biostatistics course. Participants were asked to complete the pre-test and post-test forms of The Survey of Attitudes Toward Statistics (SATS)-36 scale at the beginning and end of the course respectively. Pre-test and post-test scale scores were compared, while correlations between post-test scores and participants' examination performance were estimated. Among 156 participants, post-test scores of the overall SATS-36 scale and of the Affect, Cognitive Competence, Interest and Effort components were significantly higher than pre-test ones, indicating that the course was followed by more positive attitudes toward statistics. Among 104 students who participated in the examination, higher post-test scores of the overall SATS-36 scale and of the Affect, Difficulty, Interest and Effort components were significantly but weakly correlated with higher examination performance. Students' attitudes toward statistics can be improved through appropriate biostatistics courses, while positive attitudes contribute to higher course achievements and possibly to improved statistical skills in later professional life.
Tsui, Joanne M.; Mazzocco, Michèle M. M.
2009-01-01
This study was designed to examine the effects of math anxiety and perfectionism on math performance, under timed testing conditions, among mathematically gifted sixth graders. We found that participants had worse math performance during timed versus untimed testing, but this difference was statistically significant only when the timed condition preceded the untimed condition. We also found that children with higher levels of either math anxiety or perfectionism had a smaller performance discrepancy during timed versus untimed testing, relative to children with lower levels of math anxiety or perfectionism. There were no statistically significant gender differences in overall test performance, nor in levels of math anxiety or perfectionism; however, the difference between performance on timed and untimed math testing was statistically significant for girls, but not for boys. Implications for educators are discussed. PMID:20084180
Dymova, Natalya; Hanumara, R. Choudary; Gagnon, Ronald N.
2009-01-01
Performance measurement is increasingly viewed as an essential component of environmental and public health protection programs. In characterizing program performance over time, investigators often observe multiple changes resulting from a single intervention across a range of categories. Although a variety of statistical tools allow evaluation of data one variable at a time, the global test statistic is uniquely suited for analyses of categories or groups of interrelated variables. Here we demonstrate how the global test statistic can be applied to environmental and occupational health data for the purpose of making overall statements on the success of targeted intervention strategies. PMID:19696393
Modified Distribution-Free Goodness-of-Fit Test Statistic.
Chun, So Yeon; Browne, Michael W; Shapiro, Alexander
2018-03-01
Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.
2009 GED Testing Program Statistical Report
ERIC Educational Resources Information Center
GED Testing Service, 2010
2010-01-01
The "2009 GED[R] Testing Program Statistical Report" is the 52nd annual report in the program's 68-year history of providing a second opportunity for adults without a high school credential to earn their jurisdiction's GED credential. The report provides candidate demographic and GED Test performance statistics as well as historical…
The Statistical Loop Analyzer (SLA)
NASA Technical Reports Server (NTRS)
Lindsey, W. C.
1985-01-01
The statistical loop analyzer (SLA) is designed to automatically measure the acquisition, tracking and frequency stability performance characteristics of symbol synchronizers, code synchronizers, carrier tracking loops, and coherent transponders. Automated phase lock and system level tests can also be made using the SLA. Standard baseband, carrier and spread spectrum modulation techniques can be accommodated. Through the SLA's phase error jitter and cycle slip measurements, the acquisition and tracking thresholds of the unit under test are determined; any false phase and frequency lock events are statistically analyzed and reported in the SLA output in probabilistic terms. Automated signal dropout tests can be performed in order to troubleshoot algorithms and evaluate the reacquisition statistics of the unit under test. Cycle slip rates and cycle slip probabilities can be measured using the SLA. These measurements, combined with bit error probability measurements, are all that are needed to fully characterize the acquisition and tracking performance of a digital communication system.
Efficient statistical tests to compare Youden index: accounting for contingency correlation.
Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan
2015-04-30
The Youden index is widely utilized in studies evaluating the accuracy of diagnostic tests and the performance of predictive, prognostic, or risk models. However, both one-sample and two-independent-sample tests on the Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings. In addition, a paired-sample test on the Youden index has been unavailable. This article develops efficient statistical inference procedures for one-sample, independent-sample, and paired-sample tests on the Youden index by accounting for contingency correlation, namely the associations between sensitivity and specificity and between paired samples, typically represented in contingency tables. For the one-sample and independent-sample tests, the variances are estimated by the delta method and the inference is based on central limit theory; the results are then verified by bootstrap estimates. For the paired-sample test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of the kappa statistic, so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than the original Youden approach; the simple explicit large-sample solution therefore performs very well. Because the asymptotic and exact bootstrap computations can be readily implemented with common software such as R, the method is broadly applicable to the evaluation of diagnostic tests and model performance.
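As a rough sketch of the quantities involved (not the authors' corrected procedure): the Youden index is J = sensitivity + specificity − 1, and the naive large-sample variance that ignores the sensitivity-specificity association treats the two as independent binomial proportions. Function names and the example table are hypothetical.

```python
import math

def youden_index(tp, fn, tn, fp):
    """Youden index J = sensitivity + specificity - 1 from a 2x2 table."""
    se = tp / (tp + fn)  # sensitivity among diseased
    sp = tn / (tn + fp)  # specificity among healthy
    return se + sp - 1.0

def naive_var_youden(tp, fn, tn, fp):
    """Large-sample variance of J treating Se and Sp as independent
    binomial proportions -- the simplification the article improves on
    by accounting for their contingency correlation."""
    n1, n0 = tp + fn, tn + fp
    se, sp = tp / n1, tn / n0
    return se * (1 - se) / n1 + sp * (1 - sp) / n0

# Hypothetical study: 90/100 diseased test positive, 80/100 healthy test negative
j = youden_index(tp=90, fn=10, tn=80, fp=20)
z = j / math.sqrt(naive_var_youden(90, 10, 80, 20))  # Wald test of J = 0
print(round(j, 2), round(z, 2))  # -> 0.7 14.0
```

The article's contribution is replacing the independence simplification above with variance terms that account for the correlation structure of the contingency table.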
[The research protocol VI: How to choose the appropriate statistical test. Inferential statistics].
Flores-Ruiz, Eric; Miranda-Novales, María Guadalupe; Villasís-Keever, Miguel Ángel
2017-01-01
The statistical analysis can be divided into two main components: descriptive analysis and inferential analysis. Inference consists of drawing conclusions, from tests performed on data obtained from a sample, about the population from which the sample was drawn. Statistical tests are used in order to establish the probability that a conclusion obtained from a sample is applicable to the population from which it was obtained. However, choosing the appropriate statistical test generally poses a challenge for novice researchers. To choose the statistical test it is necessary to take into account three aspects: the research design, the number of measurements, and the scale of measurement of the variables. Statistical tests are divided into two sets, parametric and nonparametric. Parametric tests can only be used if the data show a normal distribution. Choosing the right statistical test will make it easier for readers to understand and apply the results.
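The three aspects mentioned (design, number of measurements, measurement scale) can be encoded as a simple lookup. A toy sketch for the two-group case, using a simplified textbook mapping rather than anything taken from this article:

```python
def choose_test(scale, paired, normal=True):
    """Toy two-group test chooser based on measurement scale,
    study design (paired vs. independent samples), and normality.
    Simplified textbook mapping, for illustration only."""
    if scale == "nominal":
        return "McNemar test" if paired else "Chi-square test"
    if scale == "ordinal" or not normal:
        # nonparametric tests: no normality assumption required
        return "Wilcoxon signed-rank test" if paired else "Mann-Whitney U test"
    # interval/ratio data with a normal distribution: parametric tests apply
    return "Paired t-test" if paired else "Independent-samples t-test"

print(choose_test("ratio", paired=False, normal=True))  # Independent-samples t-test
print(choose_test("ordinal", paired=True))              # Wilcoxon signed-rank test
```

A real decision also depends on the number of groups and repeated measurements, which is why the article treats test selection as a three-way decision rather than a one-line rule.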
1992-10-01
[Extraction fragment from a report on footwear impact testing: summary statistics (N=8 and N=4) and results of statistical analyses for impact tests performed on the forefoot of unworn and worn footwear (Tables A-2 and B-2). Early test batteries assessed heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978); the number of tests was later expanded.]
Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.
Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W
2018-05-18
Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.
A Third Moment Adjusted Test Statistic for Small Sample Factor Analysis.
Lin, Johnny; Bentler, Peter M
2012-01-01
Goodness-of-fit testing in factor analysis is based on the assumption that the test statistic is asymptotically chi-square, but this property may not hold in small samples even when the factors and errors are normally distributed in the population. Robust methods such as Browne's asymptotically distribution-free method and the Satorra-Bentler mean-scaled statistic were developed under the presumption of non-normality in the factors and errors. This paper finds a new application for them in the case where factors and errors are normally distributed in the population but the skewness of the obtained test statistic is still high due to sampling error in the observed indicators. An extension of the Satorra-Bentler statistic is proposed that not only scales the mean but also adjusts the degrees of freedom based on the skewness of the obtained test statistic, in order to improve its robustness in small samples. A simple simulation study shows that this third moment adjusted statistic asymptotically performs on par with previously proposed methods, and at a very small sample size offers superior Type I error rates under a properly specified model. Data from Mardia, Kent, and Bibby's study of students tested for their ability in five content areas, with either open- or closed-book tests, are used to illustrate the real-world performance of this statistic.
Bruner, L H; Carr, G J; Harbell, J W; Curren, R D
2002-06-01
An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then to identify true positive (TP), true negative (TN), false positive (FP), and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), namely sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), are calculated. Although these statistics are widely used, and are often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS in this worst-case scenario is useful because it provides a lower baseline against which the performance of an NTM can be judged in future validation studies. It also reveals relationships among the CPS that help identify random or near-random associations in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between the results of an NTM and an RTM is random, the sum of each complementary pair of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and the PPV are equal.
Given that combinations of high sensitivity with low specificity, or low sensitivity with high specificity (i.e., a sum of sensitivity and specificity approximately equal to 1), indicate a lack of predictive capacity, an NTM with these performance characteristics should be considered no better at predicting toxicity than chance alone.
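The complementary-pair relationship described above can be checked numerically: when the new test's result is random and independent of the reference result, sensitivity + specificity comes out near 1 and the PPV tracks the prevalence. A minimal simulation sketch (all names and parameters hypothetical):

```python
import random

def cps(pairs):
    """Contingent probability statistics from (reference, new-test) label pairs."""
    tp = sum(1 for r, n in pairs if r and n)
    tn = sum(1 for r, n in pairs if not r and not n)
    fp = sum(1 for r, n in pairs if not r and n)
    fn = sum(1 for r, n in pairs if r and not n)
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return sens, spec, ppv, npv

# New-test result drawn at random, independent of the reference result
rng = random.Random(42)
prevalence = 0.3
pairs = [(rng.random() < prevalence, rng.random() < 0.5) for _ in range(100_000)]
sens, spec, ppv, npv = cps(pairs)
print(round(sens + spec, 2))  # approximately 1 under random association
print(round(ppv, 2))          # approximately the prevalence, 0.3
```

This is why the study warns against reading any single statistic in isolation: a random test can still post a high sensitivity (or PPV) if the cut-off or prevalence is favorable, while the pairwise sums reveal the lack of predictive capacity.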
40 CFR 1065.12 - Approval of alternate procedures.
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine meets all applicable emission standards according to specified procedures. (iii) Use statistical.... (e) We may give you specific directions regarding methods for statistical analysis, or we may approve... statistical tests. Perform the tests as follows: (1) Repeat measurements for all applicable duty cycles at...
An Analysis of Effects of Variable Factors on Weapon Performance
1993-03-01
ALTERNATIVE ANALYSIS A. CATEGORICAL DATA ANALYSIS Statistical methodology for categorical data analysis traces its roots to the work of Francis Galton in the...choice of statistical tests. This thesis examines an analysis performed by the Surface Warfare Development Group (SWDG). The SWDG analysis is shown to be...incorrect due to the misapplication of testing methods. A corrected analysis is presented and recommendations suggested for changes to the testing
2013-01-01
Background Cognitive complaints are reported frequently after breast cancer treatments. Their association with neuropsychological (NP) test performance is not well-established. Methods Early-stage, posttreatment breast cancer patients were enrolled in a prospective, longitudinal, cohort study prior to starting endocrine therapy. Evaluation included an NP test battery and self-report questionnaires assessing symptoms, including cognitive complaints. Multivariable regression models assessed associations among cognitive complaints, mood, treatment exposures, and NP test performance. Results One hundred eighty-nine breast cancer patients, aged 21–65 years, completed the evaluation; 23.3% endorsed higher memory complaints and 19.0% reported higher executive function complaints (>1 SD above the mean for healthy control sample). Regression modeling demonstrated a statistically significant association of higher memory complaints with combined chemotherapy and radiation treatments (P = .01), poorer NP verbal memory performance (P = .02), and higher depressive symptoms (P < .001), controlling for age and IQ. For executive functioning complaints, multivariable modeling controlling for age, IQ, and other confounds demonstrated statistically significant associations with better NP visual memory performance (P = .03) and higher depressive symptoms (P < .001), whereas combined chemotherapy and radiation treatment (P = .05) approached statistical significance. Conclusions About one in five post–adjuvant treatment breast cancer patients had elevated memory and/or executive function complaints that were statistically significantly associated with domain-specific NP test performances and depressive symptoms; combined chemotherapy and radiation treatment was also statistically significantly associated with memory complaints. 
These results and other emerging studies suggest that subjective cognitive complaints in part reflect objective NP performance, although their etiology and biology appear to be multifactorial, motivating further transdisciplinary research. PMID:23606729
DECIDE: a software for computer-assisted evaluation of diagnostic test performance.
Chiecchio, A; Bo, A; Manzone, P; Giglioli, F
1993-05-01
The evaluation of the performance of clinical tests is a complex problem involving different steps and many statistical tools, not always structured into an organic and rational system. This paper presents software that provides an organic system of statistical tools to help evaluate clinical test performance. The program allows (a) the building and organization of a working database, (b) the selection of the minimal set of tests with the maximum information content, (c) a search for the model best fitting the distribution of the test values, (d) the selection of the optimal diagnostic cut-off value of the test for every positive/negative situation, and (e) the evaluation of the performance of combinations of correlated and uncorrelated tests. The uncertainty associated with all the variables involved is evaluated. The program runs in an MS-DOS environment with an EGA or better graphics card.
ERIC Educational Resources Information Center
Tabor, Josh
2010-01-01
On the 2009 AP[c] Statistics Exam, students were asked to create a statistic to measure skewness in a distribution. This paper explores several of the most popular student responses and evaluates which statistic performs best when sampling from various skewed populations. (Contains 8 figures, 3 tables, and 4 footnotes.)
Test Vehicle Forebody Wake Effects on CPAS Parachutes
NASA Technical Reports Server (NTRS)
Ray, Eric S.
2017-01-01
Parachute drag performance has been reconstructed for a large number of Capsule Parachute Assembly System (CPAS) flight tests. This allows for determining forebody wake effects indirectly through statistical means. When data are available in a "clean" wake, such as behind a slender test vehicle, the relative degradation in performance for other test vehicles can be computed as a Pressure Recovery Fraction (PRF). All four CPAS parachute types were evaluated: Forward Bay Cover Parachutes (FBCPs), Drogues, Pilots, and Mains. Many tests used the missile-shaped Parachute Compartment Drop Test Vehicle (PCDTV) to obtain data at high airspeeds. Other tests used the Orion "boilerplate" Parachute Test Vehicle (PTV) to evaluate parachute performance in a representative heatshield wake. Drag data from both vehicles are normalized to a "capsule" forebody equivalent for Orion simulations. A separate database of PCDTV-specific performance is maintained to accurately predict flight tests. Data are shared among analogous parachutes whenever possible to maximize statistical significance.
Gaus, Wilhelm
2014-09-02
The US National Toxicology Program (NTP) is assessed by a statistician. In the NTP program, groups of rodents are fed for a certain period of time with different doses of the substance under investigation. The animals are then sacrificed and all organs are examined pathologically. Such an investigation facilitates many statistical tests. Technical Report TR 578 on Ginkgo biloba is used as an example. More than 4800 statistical tests are possible with the investigations performed. A thought experiment leads us to expect more than 240 falsely significant tests; in actuality, 209 significant pathological findings were reported. The readers of Toxicology Letters should carefully distinguish between confirmative and explorative statistics. A confirmative interpretation of a significant test rejects the null hypothesis and delivers "statistical proof". It is only allowed if (i) a precise hypothesis was established independently of the data used for the test and (ii) the computed p-values are adjusted for multiple testing if more than one test was performed. Otherwise, an explorative interpretation generates a hypothesis. We conclude that NTP reports, including TR 578 on Ginkgo biloba, deliver explorative statistics, i.e., they generate hypotheses but do not prove them.
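The arithmetic behind the thought experiment is worth making explicit: under the global null, a family of m tests at level α yields about mα falsely significant results, and a Bonferroni correction divides α by m to control the family-wise error rate. A short sketch using the abstract's figures:

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Expected number of falsely significant results when all null
    hypotheses are true and each test is performed at level alpha."""
    return n_tests * alpha

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that controls the family-wise
    error rate at alpha across n_tests tests."""
    return alpha / n_tests

# More than 4800 tests at alpha = 0.05 implies more than 240 expected
# false positives, which exceeds the 209 significant findings reported.
print(expected_false_positives(4800))  # -> 240.0
print(bonferroni_alpha(0.05, 4800))
```

This is the sense in which the 209 reported findings are compatible with chance alone, supporting the abstract's explorative-statistics conclusion.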
Jeoung, Bogja
2017-01-01
The purpose of this study was to evaluate the relationship between sitting volleyball performance and the field fitness of sitting volleyball players. Forty-five elite sitting volleyball players participated in 10 field fitness tests. Additionally, the players' head coach and coach assessed their volleyball performance (receive and defense, block, attack, and serve). Data were analyzed with SPSS software version 21 using correlation and regression analyses, and the significance level was set at P < 0.05. The results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, sprint, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on the players' abilities to attack, serve, and block. Grip strength, t-test, speed, and agility showed a statistically significant relationship with the players' skill at defense and receive. Our results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on volleyball performance. PMID:29326896
Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas
2011-01-01
Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists that captures all the information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison shows that our boosting implementation performs well relative to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that integrated haplotype homozygosity is very informative for recent sweeps, whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556
Variability-aware compact modeling and statistical circuit validation on SRAM test array
NASA Astrophysics Data System (ADS)
Qiao, Ying; Spanos, Costas J.
2016-03-01
Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose a variability-aware compact model characterization methodology based on stepwise parameter selection. Transistor I-V measurements are obtained from a bit-transistor-accessible SRAM test array fabricated using a collaborating foundry's 28nm FDSOI technology. Our in-house customized Monte Carlo simulation bench can incorporate these statistical compact models, and simulation results on SRAM writability performance are very close to measurements in distribution estimation. Our proposed statistical compact model parameter extraction methodology also has the potential of predicting non-Gaussian behavior in statistical circuit performances through mixtures of Gaussian distributions.
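The mixtures-of-Gaussians idea mentioned at the end of the abstract can be sketched with a small one-dimensional EM fit. The "parameter shift" data below are synthetic, and this is a generic two-component fit, not the authors' extraction methodology:

```python
import math
import random

def em_gmm_1d(data, iters=200):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    mu = [min(data), max(data)]   # crude but effective initialization
    var = [1.0, 1.0]
    pi = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [pi[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k])) for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: update weights, means, variances from responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, data)) / nk, 1e-6)
            pi[k] = nk / len(data)
    return pi, mu, var

random.seed(0)
# Hypothetical bimodal device-parameter shifts: a mixture of two normals.
data = ([random.gauss(0.0, 0.05) for _ in range(300)]
        + [random.gauss(0.3, 0.05) for _ in range(100)])
pi, mu, var = em_gmm_1d(data)
```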
Analysis of Multiple Contingency Tables by Exact Conditional Tests for Zero Partial Association.
ERIC Educational Resources Information Center
Kreiner, Svend
The tests for zero partial association in a multiple contingency table have gained new importance with the introduction of graphical models. It is shown how these may be performed as exact conditional tests, using as test criteria either the ordinary likelihood ratio, the standard χ² statistic, or any other appropriate statistics. A…
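An exact conditional test of the kind described can be approximated by Monte Carlo: permute one margin while holding the other fixed and recompute the chosen criterion. Below is a minimal sketch for a single 2×2 table using the Pearson X² criterion; the table values are hypothetical and the full exact enumeration (and the multi-way partial-association case) is not attempted:

```python
import random
from collections import Counter

def chi2_stat(table):
    """Pearson X^2 for a 2x2 table given as [[a, b], [c, d]]."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum((table[i][j] - rows[i] * cols[j] / n) ** 2
               / (rows[i] * cols[j] / n)
               for i in range(2) for j in range(2))

def perm_test(table, reps=2000, seed=1):
    """Monte Carlo conditional test: shuffle one margin, keep the other fixed."""
    random.seed(seed)
    xs = [0] * (table[0][0] + table[0][1]) + [1] * (table[1][0] + table[1][1])
    ys = ([0] * table[0][0] + [1] * table[0][1]
          + [0] * table[1][0] + [1] * table[1][1])
    obs = chi2_stat(table)
    hits = 0
    for _ in range(reps):
        random.shuffle(ys)  # both margins stay fixed under this shuffle
        c = Counter(zip(xs, ys))
        t = [[c[(0, 0)], c[(0, 1)]], [c[(1, 0)], c[(1, 1)]]]
        if chi2_stat(t) >= obs:
            hits += 1
    return hits / reps

p = perm_test([[20, 5], [5, 20]])  # strongly associated hypothetical table
```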
How to Compare Parametric and Nonparametric Person-Fit Statistics Using Real Data
ERIC Educational Resources Information Center
Sinharay, Sandip
2017-01-01
Person-fit assessment (PFA) is concerned with uncovering atypical test performance as reflected in the pattern of scores on individual items on a test. Existing person-fit statistics (PFSs) include both parametric and nonparametric statistics. Comparison of PFSs has been a popular research topic in PFA, but almost all comparisons have employed…
Physique and Performance of Young Wheelchair Basketball Players in Relation with Classification
Zancanaro, Carlo
2015-01-01
The relationships among physical characteristics, performance, and functional ability classification of younger wheelchair basketball players have been barely investigated to date. The purpose of this work was to assess anthropometry, body composition, and performance in sport-specific field tests in a national sample of Italian younger wheelchair basketball players as well as to evaluate the association of these variables with the players’ functional ability classification and game-related statistics. Several anthropometric measurements were obtained for 52 out of 91 eligible players nationwide. Performance was assessed in seven sport-specific field tests (5m sprint, 20m sprint with ball, suicide, maximal pass, pass for accuracy, spot shot and lay-ups) and game-related statistics (free-throw points scored per match, two- and three-point field-goals scored per match, and their sum). Association between variables and predictive ability were assessed by correlation and regression analysis, respectively. Players were grouped into four Classes of increasing functional ability (A-D). One-way ANOVA with Bonferroni’s correction for multiple comparisons was used to assess differences between Classes. Sitting height and functional ability Class especially correlated with performance outcomes, but wheelchair basketball experience and skinfolds did not. Game-related statistics and sport-specific field-test scores all showed significant correlation with each other. Upper arm circumference and/or maximal pass and lay-ups test scores were able to explain 42 to 59% of variance in game-related statistics (P < 0.001). A clear difference in performance was only found between functional ability Classes A and D. Conclusion: In younger wheelchair basketball players, sitting height positively contributes to performance. The maximal pass and lay-ups test should be carefully considered in younger wheelchair basketball training plans.
Functional ability Class reflects to a limited extent the actual differences in performance. PMID:26606681
Zhang, Fanghong; Miyaoka, Etsuo; Huang, Fuping; Tanaka, Yutaka
2015-01-01
The problem for establishing noninferiority is discussed between a new treatment and a standard (control) treatment with ordinal categorical data. A measure of treatment effect is used and a method of specifying noninferiority margin for the measure is provided. Two Z-type test statistics are proposed where the estimation of variance is constructed under the shifted null hypothesis using U-statistics. Furthermore, the confidence interval and the sample size formula are given based on the proposed test statistics. The proposed procedure is applied to a dataset from a clinical trial. A simulation study is conducted to compare the performance of the proposed test statistics with that of the existing ones, and the results show that the proposed test statistics are better in terms of the deviation from nominal level and the power.
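The abstract's Z-type noninferiority tests are built for ordinal data with U-statistic variance estimates under a shifted null. As a much simpler illustration of the same shifted-null idea, here is a Wald-type Z test for noninferiority of a binary success proportion with margin δ; all numbers are hypothetical and this is not the paper's statistic:

```python
import math

def noninferiority_z(succ_new, n_new, succ_std, n_std, margin):
    """One-sided Wald-type Z test of H0: p_new - p_std <= -margin
    against H1: p_new - p_std > -margin (noninferiority of the new arm)."""
    p1 = succ_new / n_new
    p2 = succ_std / n_std
    se = math.sqrt(p1 * (1 - p1) / n_new + p2 * (1 - p2) / n_std)
    z = (p1 - p2 + margin) / se        # shifted null: add the margin
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal p-value
    return z, p_value

# Hypothetical trial: 80/100 responders on each arm, margin 0.10.
z, p = noninferiority_z(80, 100, 80, 100, 0.10)
```

Rejecting H0 (small p) supports noninferiority; with identical observed response rates, the evidence comes entirely from the margin relative to the standard error.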
Performance statistics of the FORTRAN 4 /H/ library for the IBM system/360
NASA Technical Reports Server (NTRS)
Clark, N. A.; Cody, W. J., Jr.; Hillstrom, K. E.; Thieleker, E. A.
1969-01-01
Test procedures and results for accuracy and timing tests of the basic IBM 360/50 FORTRAN 4 /H/ subroutine library are reported. The testing was undertaken to verify performance capability and as a prelude to providing some replacement routines of improved performance.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests.
Kosinski, Andrzej S
2013-03-15
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.
A weighted generalized score statistic for comparison of predictive values of diagnostic tests
Kosinski, Andrzej S.
2013-01-01
Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343
Sinharay, Sandip
2017-09-01
Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.
A new test of multivariate nonlinear causality
Bai, Zhidong; Jiang, Dandan; Lv, Zhihui; Wong, Wing-Keung; Zheng, Shurong
2018-01-01
The multivariate nonlinear Granger causality developed by Bai et al. (2010) (Mathematics and Computers in Simulation. 2010; 81: 5-17) plays an important role in detecting the dynamic interrelationships between two groups of variables. Following the idea of the Hiemstra-Jones (HJ) test proposed by Hiemstra and Jones (1994) (Journal of Finance. 1994; 49(5): 1639-1664), they attempt to establish a central limit theorem (CLT) for their test statistic by applying the asymptotic properties of multivariate U-statistics. However, Bai et al. (2016) (2016; arXiv: 1701.03992) revisit the HJ test and find that the test statistic given by HJ is NOT a function of U-statistics, which implies that neither the CLT proposed by Hiemstra and Jones (1994) nor the one extended by Bai et al. (2010) is valid for statistical inference. In this paper, we re-estimate the probabilities and re-establish the CLT of the new test statistic. Numerical simulation shows that our new estimates are consistent and our new test exhibits decent size and power. PMID:29304085
A new test of multivariate nonlinear causality.
Bai, Zhidong; Hui, Yongchang; Jiang, Dandan; Lv, Zhihui; Wong, Wing-Keung; Zheng, Shurong
2018-01-01
The multivariate nonlinear Granger causality developed by Bai et al. (2010) (Mathematics and Computers in Simulation. 2010; 81: 5-17) plays an important role in detecting the dynamic interrelationships between two groups of variables. Following the idea of the Hiemstra-Jones (HJ) test proposed by Hiemstra and Jones (1994) (Journal of Finance. 1994; 49(5): 1639-1664), they attempt to establish a central limit theorem (CLT) for their test statistic by applying the asymptotic properties of multivariate U-statistics. However, Bai et al. (2016) (2016; arXiv: 1701.03992) revisit the HJ test and find that the test statistic given by HJ is NOT a function of U-statistics, which implies that neither the CLT proposed by Hiemstra and Jones (1994) nor the one extended by Bai et al. (2010) is valid for statistical inference. In this paper, we re-estimate the probabilities and re-establish the CLT of the new test statistic. Numerical simulation shows that our new estimates are consistent and our new test exhibits decent size and power.
An Investigation of Dental Luting Cement Solubility as a Function of the Marginal Gap.
1988-05-01
way ANOVA for the Phase 1 Diffusion Study revealed that there were statistically significant differences between the test groups. A Duncan’s Multiple…cement. The 25, 50, and 75 micron groups demonstrated no statistically significant differences in the amount of remaining luting cement (p < 0.05). A…one-way ANOVA was also performed on the Phase 2 Dynamic Study. This test revealed that there were statistically significant differences among the test
A Statistical Analysis Plan to Support the Joint Forward Area Air Defense Test.
1984-08-02
by establishing a specific significance level prior to performing the statistical test (traditionally α levels are set at .01 or .05). What is often…undesirable increase in β. For constant α levels, the power (1 − β) of a statistical test can be increased by increasing the sample size of the test. (Ref… [garbled flowchart fragment: perform k-sample comparison test (ANOVA) on MOP "A" levels]
Statistical EMC: A new dimension electromagnetic compatibility of digital electronic systems
NASA Astrophysics Data System (ADS)
Tsaliovich, Anatoly
Electromagnetic compatibility compliance test results are used as a database for addressing three classes of electromagnetic-compatibility (EMC) related problems: statistical EMC profiles of digital electronic systems, the effect of equipment-under-test (EUT) parameters on the electromagnetic emission characteristics, and EMC measurement specifics. Open area test site (OATS) and absorber line shielded room (AR) results are compared for equipment-under-test highest radiated emissions. The suggested statistical evaluation methodology can be utilized to correlate the results of different EMC test techniques, characterize the EMC performance of electronic systems and components, and develop recommendations for electronic product optimal EMC design.
Obuchowski, Nancy A; Buckler, Andrew; Kinahan, Paul; Chen-Mayer, Heather; Petrick, Nicholas; Barboriak, Daniel P; Bullen, Jennifer; Barnhart, Huiman; Sullivan, Daniel C
2016-04-01
A major initiative of the Quantitative Imaging Biomarker Alliance is to develop standards-based documents called "Profiles," which describe one or more technical performance claims for a given imaging modality. The term "actor" denotes any entity (device, software, or person) whose performance must meet certain specifications for the claim to be met. The objective of this paper is to present the statistical issues in testing actors' conformance with the specifications. In particular, we present the general rationale and interpretation of the claims, the minimum requirements for testing whether an actor achieves the performance requirements, the study designs used for testing conformity, and the statistical analysis plan. We use three examples to illustrate the process: apparent diffusion coefficient in solid tumors measured by MRI, change in Perc 15 as a biomarker for the progression of emphysema, and percent change in solid tumor volume by computed tomography as a biomarker for lung cancer progression. Copyright © 2016 The Association of University Radiologists. All rights reserved.
De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric
2010-01-11
Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically-relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly-available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.
ERIC Educational Resources Information Center
Watson, Jane
2007-01-01
Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…
Autoregressive statistical pattern recognition algorithms for damage detection in civil structures
NASA Astrophysics Data System (ADS)
Yao, Ruigen; Pakzad, Shamim N.
2012-08-01
Statistical pattern recognition has recently emerged as a promising set of complementary methods to system identification for automatic structural damage assessment. Its essence is to use well-known concepts in statistics for boundary definition of different pattern classes, such as those for damaged and undamaged structures. In this paper, several statistical pattern recognition algorithms using autoregressive models, including statistical control charts and hypothesis testing, are reviewed as potentially competitive damage detection techniques. To enhance the performance of statistical methods, new feature extraction techniques using model spectra and residual autocorrelation, together with resampling-based threshold construction methods, are proposed. Subsequently, simulated acceleration data from a multi degree-of-freedom system is generated to test and compare the efficiency of the existing and proposed algorithms. Data from laboratory experiments conducted on a truss and a large-scale bridge slab model are then used to further validate the damage detection methods and demonstrate the superior performance of proposed algorithms.
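The AR-residual control-chart idea in this abstract can be sketched as follows: fit an AR(1) model to the response series, then flag samples whose one-step prediction residual leaves a 3-sigma band. The simulated series and the injected outlier below are illustrative only, not data from the truss or bridge-slab experiments:

```python
import random
import statistics

def fit_ar1(x):
    """Least-squares AR(1) coefficient for x[t] ≈ phi * x[t-1]."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den

def residual_chart(x, phi, k=3.0):
    """Indices of samples whose one-step AR(1) residual exceeds k sigma."""
    res = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
    mu, sd = statistics.mean(res), statistics.stdev(res)
    # res[t] corresponds to series index t + 1
    return [t + 1 for t, r in enumerate(res) if abs(r - mu) > k * sd]

random.seed(2)
# Baseline ("undamaged") AR(1) acceleration-like response...
x = [0.0]
for _ in range(300):
    x.append(0.6 * x[-1] + random.gauss(0, 0.1))
x[200] += 1.0  # ...with one simulated damage-induced disturbance
phi = fit_ar1(x)
alarms = residual_chart(x, phi)
```

In practice the control limits would be set from a clean training segment rather than from the monitored data itself; the sketch folds the two together for brevity.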
ERIC Educational Resources Information Center
White, Desley
2015-01-01
Two practical activities are described, which aim to support critical thinking about statistics as they concern multiple outcomes testing. Formulae are presented in Microsoft Excel spreadsheets, which are used to calculate the inflation of error associated with the quantity of tests performed. This is followed by a decision-making exercise, where…
Effect of Kinesiotaping and Knee Brace on Functional Performance in Recreational Athletes
Ulusoy, Burak; İldiz, Bülent; Tunay, Volga Bayrakçı
2014-01-01
Objectives: Kinesiotaping is a popular taping method that is used for both therapeutic and performance enhancement purposes. Knee braces are widely used for prevention of sport injuries, but their effectiveness for performance is still controversial. The aim of this study was to determine whether kinesiotape or brace was more effective on functional performance. Methods: A total of twenty male recreational football players (Mean ± Standard Deviation (SD) age: 22.5±0.68 years, height: 175.15±3.37 cm, body weight: 74.52±12.41 kg) voluntarily participated in this study. Participants were tested with kinesiotape, with brace, and without kinesiotape and brace. Tests were applied one day after patellar kinesiotaping (correction technique). Balance was measured with the Modified Y Balance Test (dynamic test), agility was measured by the T test, and muscle strength and anaerobic power were assessed by vertical jump and triple hop tests. The Wilcoxon signed rank test was employed to determine the statistical significance of tests with kinesiotape, with brace, and without kinesiotape and brace. Results: Statistically significant differences were found in the triple hop test with kinesiotaping versus without kinesiotaping and brace, in the T test with bracing versus kinesiotaping, and in the vertical jump with kinesiotaping versus without kinesiotaping and brace (p < 0.001), in favour of kinesiotaping in all tests. No statistically significant difference was found in the Modified Y Balance Test in any group (p > 0.05). Conclusion: Consequently, kinesiotaping had positive effects on agility and muscle strength but had no effect on balance in football players. On the other hand, brace had no effect on functional performance tests.
Sequi, Marco; Campi, Rita; Clavenna, Antonio; Bonati, Maurizio
2013-03-01
To evaluate the quality of data reporting and statistical methods performed in drug utilization studies in the pediatric population. Drug utilization studies evaluating all drug prescriptions to children and adolescents published between January 1994 and December 2011 were retrieved and analyzed. For each study, information on measures of exposure/consumption, the covariates considered, descriptive and inferential analyses, statistical tests, and methods of data reporting was extracted. An overall quality score was created for each study using a 12-item checklist that took into account the presence of outcome measures, covariates of measures, descriptive measures, statistical tests, and graphical representation. A total of 22 studies were reviewed and analyzed. Of these, 20 studies reported at least one descriptive measure. The mean was the most commonly used measure (18 studies), but only five of these also reported the standard deviation. Statistical analyses were performed in 12 studies, with the chi-square test being the most commonly performed test. Graphs were presented in 14 papers. Sixteen papers reported the number of drug prescriptions and/or packages, and ten reported the prevalence of the drug prescription. The mean quality score was 8 (median 9). Only seven of the 22 studies received a score of ≥10, while four studies received a score of <6. Our findings document that only a few of the studies reviewed applied statistical methods and reported data in a satisfactory manner. We therefore conclude that the methodology of drug utilization studies needs to be improved.
Aero-Optics Measurement System for the AEDC Aero-Optics Test Facility
1991-02-01
Pulse Energy Statistics, 150 Pulses… AEDC-TR-90-20, Appendixes: A. Optical Performance of Heated Windows…hypersonic wind tunnel, where the requisite extensive statistical database can be developed in a cost- and time-effective manner. Ground testing…At the present time at AEDC, measured AO parameter statistics are derived from sets of image-spot recordings, with a set containing as many as 150
Is Cognitive Test-Taking Anxiety Associated With Academic Performance Among Nursing Students?
Duty, Susan M; Christian, Ladonna; Loftus, Jocelyn; Zappi, Victoria
2016-01-01
The cognitive component of test anxiety was correlated with academic performance among nursing students. Modest but statistically significant lower examination grade T scores were observed for students with high compared with low levels of cognitive test anxiety (CTA). High levels of CTA were associated with reduced academic performance.
Nonparametric estimation and testing of fixed effects panel data models
Henderson, Daniel J.; Carroll, Raymond J.; Li, Qi
2009-01-01
In this paper we consider the problem of estimating nonparametric panel data models with fixed effects. We introduce an iterative nonparametric kernel estimator. We also extend the estimation method to the case of a semiparametric partially linear fixed effects model. To determine whether a parametric, semiparametric or nonparametric model is appropriate, we propose test statistics to test between the three alternatives in practice. We further propose a test statistic for testing the null hypothesis of random effects against fixed effects in a nonparametric panel data regression model. Simulations are used to examine the finite sample performance of the proposed estimators and the test statistics. PMID:19444335
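As a simple illustration of the nonparametric kernel estimation underlying the panel estimators above (though not the paper's iterative fixed-effects estimator itself), here is a Nadaraya-Watson regression on synthetic data:

```python
import math
import random

def nw_kernel_regression(x0, xs, ys, h):
    """Nadaraya-Watson estimator with a Gaussian kernel and bandwidth h."""
    w = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    return sum(wi * yi for wi, yi in zip(w, ys)) / sum(w)

random.seed(3)
# Synthetic regression data: y = sin(2*pi*x) + noise on [0, 1].
xs = [random.uniform(0, 1) for _ in range(400)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.1) for x in xs]
yhat = nw_kernel_regression(0.25, xs, ys, h=0.05)  # true value is sin(pi/2) = 1
```

Bandwidth choice drives the bias-variance trade-off here, which is exactly the kind of finite-sample behavior the paper's simulations examine for its estimators and test statistics.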
Pei, Yanbo; Tian, Guo-Liang; Tang, Man-Lai
2014-11-10
Stratified data analysis is an important research topic in many biomedical studies and clinical trials. In this article, we develop five test statistics for testing the homogeneity of proportion ratios for stratified correlated bilateral binary data based on an equal correlation model assumption. Bootstrap procedures based on these test statistics are also considered. To evaluate the performance of these statistics and procedures, we conduct Monte Carlo simulations to study their empirical sizes and powers under various scenarios. Our results suggest that the procedure based on score statistic performs well generally and is highly recommended. When the sample size is large, procedures based on the commonly used weighted least square estimate and logarithmic transformation with Mantel-Haenszel estimate are recommended as they do not involve any computation of maximum likelihood estimates requiring iterative algorithms. We also derive approximate sample size formulas based on the recommended test procedures. Finally, we apply the proposed methods to analyze a multi-center randomized clinical trial for scleroderma patients. Copyright © 2014 John Wiley & Sons, Ltd.
Sequential CFAR detectors using a dead-zone limiter
NASA Astrophysics Data System (ADS)
Tantaratana, Sawasd
1990-09-01
The performances of some proposed sequential constant-false-alarm-rate (CFAR) detectors are evaluated. The observations are passed through a dead-zone limiter, whose output is -1, 0, or +1 depending on whether the input is less than -c, between -c and c, or greater than c, where c is a constant. The test statistic is the sum of the limiter outputs; equivalently, the test is performed on the reduced set of observations whose absolute value exceeds c, with the test statistic being the sum of their signs. Both constant and linear boundaries are considered. Numerical results show a significant reduction of the average number of observations needed to achieve the same false alarm and detection probabilities as a fixed-sample-size CFAR detector using the same kind of test statistic.
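The detector described above is easy to sketch: pass each observation through the dead-zone limiter, accumulate the outputs, and stop as soon as the running sum crosses a boundary. The constant-boundary variant is shown; the thresholds and input values are arbitrary illustrations, not the paper's design points:

```python
def dead_zone(x, c):
    """Dead-zone limiter: -1 if x < -c, 0 if -c <= x <= c, +1 if x > c."""
    return 1 if x > c else (-1 if x < -c else 0)

def sequential_sign_test(samples, c, upper, lower):
    """Accumulate limiter outputs; stop when the sum crosses a constant
    boundary. Returns the decision and the number of observations used."""
    s = 0
    for n, x in enumerate(samples, start=1):
        s += dead_zone(x, c)
        if s >= upper:
            return "signal present", n
        if s <= lower:
            return "noise only", n
    return "undecided", len(samples)

decision, n_used = sequential_sign_test([1.0] * 20, c=0.5, upper=5, lower=-5)
```

The early stopping is the point of the sequential design: strong inputs drive the sum to a boundary quickly, so far fewer observations are needed on average than a fixed-sample-size test.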
The influence of test mode and visuospatial ability on mathematics assessment performance
NASA Astrophysics Data System (ADS)
Logan, Tracy
2015-12-01
Mathematics assessment and testing are increasingly situated within digital environments with international tests moving to computer-based testing in the near future. This paper reports on a secondary data analysis which explored the influence the mode of assessment—computer-based (CBT) and pencil-and-paper based (PPT)—and visuospatial ability had on students' mathematics test performance. Data from 804 grade 6 Singaporean students were analysed using the knowledge discovery in data design. The results revealed statistically significant differences between performance on CBT and PPT test modes across the content areas of whole number, algebraic patterns, and data and chance. However, there were no performance differences for the content areas of spatial arrangements, geometric measurement, or other number. There were also statistically significant differences in performance between those students who possess higher levels of visuospatial ability compared to those with lower levels across all six content areas. Implications include careful consideration of the comparability of CBT and PPT testing and the need for increased attention to the role of visuospatial reasoning in students' mathematics reasoning.
Ng'andu, N H
1997-03-30
In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.
Assessment of the beryllium lymphocyte proliferation test using statistical process control.
Cher, Daniel J; Deubner, David C; Kelsh, Michael A; Chapman, Pamela S; Ray, Rose M
2006-10-01
Despite more than 20 years of surveillance and epidemiologic studies using the beryllium blood lymphocyte proliferation test (BeBLPT) as a measure of beryllium sensitization (BeS) and as an aid for diagnosing subclinical chronic beryllium disease (CBD), improvements in specific understanding of the inhalation toxicology of CBD have been limited. Although epidemiologic data suggest that BeS and CBD risks vary by process/work activity, it has proven difficult to reach specific conclusions regarding the dose-response relationship between workplace beryllium exposure and BeS or subclinical CBD. One possible reason for this uncertainty could be misclassification of BeS resulting from variation in BeBLPT testing performance. The reliability of the BeBLPT, a biological assay that measures beryllium sensitization, is unknown. To assess the performance of four laboratories that conducted this test, we used data from a medical surveillance program that offered testing for beryllium sensitization with the BeBLPT. The study population was workers exposed to beryllium at various facilities over a 10-year period (1992-2001). Workers with abnormal results were offered diagnostic workups for CBD. Our analyses used a standard statistical technique, statistical process control (SPC), to evaluate test reliability. The study design involved a repeated measures analysis of BeBLPT results generated from the company-wide, longitudinal testing. Analytical methods included use of (1) statistical process control charts that examined temporal patterns of variation for the stimulation index, a measure of cell reactivity to beryllium; (2) correlation analysis that compared prior perceptions of BeBLPT instability to the statistical measures of test variation; and (3) assessment of the variation in the proportion of missing test results and how time periods with more missing data influenced SPC findings. 
During the period of this study, all laboratories displayed variation in test results that were beyond what would be expected due to chance alone. Patterns of test results suggested that variations were systematic. We conclude that laboratories performing the BeBLPT or other similar biological assays of immunological response could benefit from a statistical approach such as SPC to improve quality management.
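A Shewhart individuals chart, one standard SPC tool of the kind the study applies, can be sketched as follows. The "stimulation index" values are invented, and the moving-range sigma estimate (MR-bar / 1.128) is the textbook convention, not necessarily the study's exact procedure:

```python
import statistics

def individuals_chart_limits(values):
    """Shewhart individuals chart: center line and 3-sigma control limits,
    with sigma estimated from the average moving range (MRbar / 1.128)."""
    center = statistics.mean(values)
    mrbar = statistics.mean(abs(b - a) for a, b in zip(values, values[1:]))
    sigma = mrbar / 1.128
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(values):
    """Indices of points outside the control limits."""
    lcl, _, ucl = individuals_chart_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]

# Hypothetical stimulation-index readings with one anomalous run.
si = [2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 10.0, 2.1, 1.9, 2.0]
flagged = out_of_control(si)
```

Systematic (non-random) variation of the kind the study reports would show up not only as points beyond the limits but also as runs and trends, which fuller SPC rule sets (e.g., Western Electric rules) are designed to catch.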
Performing Inferential Statistics Prior to Data Collection
ERIC Educational Resources Information Center
Trafimow, David; MacDonald, Justin A.
2017-01-01
Typically, in education and psychology research, the investigator collects data and subsequently performs descriptive and inferential statistics. For example, a researcher might compute group means and use the null hypothesis significance testing procedure to draw conclusions about the populations from which the groups were drawn. We propose an…
Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo
2018-01-01
This article presents and investigates performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic) with both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555
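A permutation test for multivariate location shift, in the spirit described above, can be sketched as follows. This is a simplified stand-in, not the paper's method: the statistic is the Euclidean norm of the componentwise difference of sample medians (a robust location contrast), and the data are invented bivariate samples.

```python
# Minimal permutation test for a multivariate location shift between two
# samples, using a robust (median-based) location contrast as the statistic.
import random
import statistics

def median_shift_stat(x, y):
    """Euclidean norm of the componentwise difference of medians."""
    d = len(x[0])
    return sum(
        (statistics.median(r[j] for r in x) - statistics.median(r[j] for r in y)) ** 2
        for j in range(d)
    ) ** 0.5

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """P-value by re-randomizing group labels over the pooled sample."""
    rng = random.Random(seed)
    observed = median_shift_stat(x, y)
    pooled = x + y
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if median_shift_stat(pooled[: len(x)], pooled[len(x):]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Two bivariate samples with a clear shift in the first coordinate.
x = [(i * 0.1, 0.0) for i in range(10)]
y = [(i * 0.1 + 3.0, 0.0) for i in range(10)]
print(permutation_pvalue(x, y))
```

Replacing the median with the Hodges-Lehmann estimator, as the paper does, changes only the location estimator inside the statistic.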
2011-01-01
Background Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires both the capability of obtaining semantically relevant experimental data and that of performing relevant statistical testing on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions. Methods An application-specific ontology was developed for managing TMA and DNA microarray databases by semantic web technologies. Data were represented as Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQLs to reflect the semantic structures of the hypotheses, and (3) performing statistical tests with the result sets returned by the SPARQLs. Results When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF. Conclusions We demonstrated the utility of high throughput biological hypothesis testing. We believe that such preliminary investigation can be beneficial before performing highly controlled experiments. PMID:21342584
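The query-then-test pipeline can be shown in miniature with plain Python in place of RDF/SPARQL machinery. Everything here is illustrative: the triples, predicate names, marker labels, and scores are invented, and the test is an ordinary Welch t statistic rather than Xperanto-RDF's implementation.

```python
# Miniature version of the pipeline: data as subject-predicate-object
# triples, a query selecting the two comparison groups of a hypothesis,
# and a statistical test on the retrieved values.
import statistics

triples = [
    ("case1", "stains_for", "markerA"), ("case1", "score", 8.1),
    ("case2", "stains_for", "markerA"), ("case2", "score", 7.6),
    ("case3", "stains_for", "markerB"), ("case3", "score", 3.2),
    ("case4", "stains_for", "markerB"), ("case4", "score", 2.9),
    ("case5", "stains_for", "markerA"), ("case5", "score", 7.9),
    ("case6", "stains_for", "markerB"), ("case6", "score", 3.5),
]

def scores_for(marker):
    """Retrieve the scores of all cases staining for a given marker."""
    cases = {s for s, p, o in triples if p == "stains_for" and o == marker}
    return [o for s, p, o in triples if p == "score" and s in cases]

def welch_t(x, y):
    """Welch's two-sample t statistic (unequal variances)."""
    vx, vy = statistics.variance(x) / len(x), statistics.variance(y) / len(y)
    return (statistics.fmean(x) - statistics.fmean(y)) / (vx + vy) ** 0.5

t = welch_t(scores_for("markerA"), scores_for("markerB"))
print(round(t, 2))   # a large |t| supports the hypothesized difference
```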
Robust Detection of Examinees with Aberrant Answer Changes
ERIC Educational Resources Information Center
Belov, Dmitry I.
2015-01-01
The statistical analysis of answer changes (ACs) has uncovered multiple testing irregularities on large-scale assessments and is now routinely performed at testing organizations. However, AC data has an uncertainty caused by technological or human factors. Therefore, existing statistics (e.g., number of wrong-to-right ACs) used to detect examinees…
Can Percentiles Replace Raw Scores in the Statistical Analysis of Test Data?
ERIC Educational Resources Information Center
Zimmerman, Donald W.; Zumbo, Bruno D.
2005-01-01
Educational and psychological testing textbooks typically warn of the inappropriateness of performing arithmetic operations and statistical analysis on percentiles instead of raw scores. This seems inconsistent with the well-established finding that transforming scores to ranks and using nonparametric methods often improves the validity and power…
Significant lexical relationships
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pedersen, T.; Kayaalp, M.; Bruce, R.
Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
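An exact conditional test on a 2x2 co-occurrence table can be illustrated with Fisher's exact test, computed directly from the hypergeometric distribution. The word-pair counts below are invented for illustration, not drawn from the paper's corpora.

```python
# Fisher's exact test for a 2x2 table of word co-occurrence counts,
# computed from the hypergeometric distribution with fixed margins.
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins whose
    probability does not exceed that of the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x):  # probability of the table with cell (1,1) = x
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = table_prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    return sum(p for x in range(lo, hi + 1)
               if (p := table_prob(x)) <= p_obs + 1e-12)

# Hypothetical counts: "strong tea" occurs 8 times, "strong" without
# "tea" 2 times, "tea" without "strong" once, neither 9 times.
print(round(fisher_exact_2x2(8, 2, 1, 9), 4))
```

Because the null distribution is exact rather than asymptotic, the p-value remains valid for the sparse counts that are typical of lexical data.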
ERIC Educational Resources Information Center
Delaval, Marine; Michinov, Nicolas; Le Bohec, Olivier; Le Hénaff, Benjamin
2017-01-01
The aim of this study was to examine how social or temporal-self comparison feedback, delivered in real-time in a web-based training environment, could influence the academic performance of students in a statistics examination. First-year psychology students were given the opportunity to train for a statistics examination during a semester by…
ERIC Educational Resources Information Center
Noser, Thomas C.; Tanner, John R.; Shah, Situl
2008-01-01
The purpose of this study was to measure the comprehension of basic mathematical skills of students enrolled in statistics classes at a large regional university, and to determine if the scores earned on a basic math skills test are useful in forecasting student performance in these statistics classes, and to determine if students' basic math…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Angers, Crystal Plume; Bottema, Ryan; Buckley, Les
Purpose: Treatment unit uptime statistics are typically used to monitor radiation equipment performance. The Ottawa Hospital Cancer Centre has introduced the use of Quality Control (QC) test success as a quality indicator for equipment performance and overall health of the equipment QC program. Methods: Implemented in 2012, QATrack+ is used to record and monitor over 1100 routine machine QC tests each month for 20 treatment and imaging units ( http://qatrackplus.com/ ). Using an SQL (structured query language) script, automated queries of the QATrack+ database are used to generate program metrics such as the number of QC tests executed and the percentage of tests passing, at tolerance, or at action. These metrics are compared against machine uptime statistics already reported within the program. Results: Program metrics for 2015 show good correlation between the pass rate of QC tests and uptime for a given machine. For the nine conventional linacs, the QC test success rate was consistently greater than 97%. The corresponding uptimes for these units are better than 98%. Machines that consistently show higher failure or tolerance rates in the QC tests have lower uptimes. This points either to poor machine performance requiring corrective action or to problems with the QC program. Conclusions: QATrack+ significantly improves the organization of QC data but can also aid in overall equipment management. Complementing machine uptime statistics with QC test metrics provides a more complete picture of overall machine performance and can be used to identify areas of improvement in the machine service and QC programs.
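The per-unit metric described above can be sketched with a few lines of Python. The record structure, unit names, and counts are hypothetical, not QATrack+'s actual schema or the centre's data.

```python
# Per-unit QC pass rates from a stream of (unit, status) records,
# the kind of metric an automated database query would return.
from collections import Counter

def qc_summary(records):
    """records: iterable of (unit, status), status in
    {'pass', 'tolerance', 'action'}. Returns {unit: pass rate in %}."""
    totals, passes = Counter(), Counter()
    for unit, status in records:
        totals[unit] += 1
        if status == "pass":
            passes[unit] += 1
    return {u: 100.0 * passes[u] / totals[u] for u in totals}

records = ([("linac1", "pass")] * 97 + [("linac1", "tolerance")] * 2 +
           [("linac1", "action")] * 1 + [("linac2", "pass")] * 90 +
           [("linac2", "action")] * 10)
summary = qc_summary(records)
print(summary)   # linac1 near the 97% benchmark, linac2 well below
```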
Meta-analysis of gene-level associations for rare variants based on single-variant statistics.
Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu
2013-08-08
Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
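The key insight, that a gene-level test can be recovered from single-variant statistics plus their correlation matrix, can be sketched for the simplest case: a sum (burden-type) test. If z_j are per-variant z statistics with correlation matrix R, then Z_gene = sum(z) / sqrt(1'R1) is standard normal under the null. The z-scores and correlation matrix below are invented; the paper's method covers far more general gene-level statistics.

```python
# Recovering a burden-type gene-level z statistic from single-variant
# z-scores and the correlation matrix of those statistics (which can be
# estimated from one participating study or a reference panel).
import math

def gene_level_z(z, R):
    """Combined z for the sum test: sum(z) / sqrt(sum of all R entries)."""
    num = sum(z)
    var = sum(R[i][j] for i in range(len(z)) for j in range(len(z)))
    return num / math.sqrt(var)

z = [1.2, 0.8, 1.5]            # single-variant z statistics
R = [[1.0, 0.3, 0.1],          # correlation of the test statistics
     [0.3, 1.0, 0.2],
     [0.1, 0.2, 1.0]]
print(round(gene_level_z(z, R), 3))
```

Ignoring R (treating the variants as independent) would overstate the denominator's simplicity and miscalibrate the type I error, which is why the correlation matrix is an essential input.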
Nonparametric predictive inference for combining diagnostic tests with parametric copula
NASA Astrophysics Data System (ADS)
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while modelling their dependence structure with a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling dependence of random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density by maximum likelihood estimation (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
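The gain from combining two tests can be illustrated with the empirical AUC, computed nonparametrically as the Mann-Whitney probability that a diseased subject scores higher than a healthy one. The data and the naive sum-combination rule below are illustrative only; the paper's NPI and copula machinery is not reproduced here.

```python
# Empirical AUC of a single test versus a simple combination of two tests.
def empirical_auc(healthy, diseased):
    """Mann-Whitney estimate of P(diseased score > healthy score),
    counting ties as one half."""
    wins = sum((d > h) + 0.5 * (d == h) for d in diseased for h in healthy)
    return wins / (len(healthy) * len(diseased))

healthy_t1 = [1.0, 2.0, 3.0, 2.5]     # test 1 results, by disease status
diseased_t1 = [2.6, 3.5, 1.5, 4.0]
healthy_t2 = [0.5, 1.5, 1.0, 2.0]     # test 2 results, same subjects
diseased_t2 = [2.1, 1.8, 2.5, 0.9]

auc1 = empirical_auc(healthy_t1, diseased_t1)
combined_h = [a + b for a, b in zip(healthy_t1, healthy_t2)]
combined_d = [a + b for a, b in zip(diseased_t1, diseased_t2)]
auc_combined = empirical_auc(combined_h, combined_d)
print(auc1, auc_combined)
```

How much a combination helps depends on the dependence between the two tests, which is exactly what the copula is brought in to model.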
ERIC Educational Resources Information Center
Airola, Denise Tobin
2011-01-01
Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.
Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H
2015-01-01
Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
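Independent filtering can be shown in miniature: discard hypotheses using a filter statistic that is independent of the test statistic under the null, then apply Benjamini-Hochberg to the survivors. With fewer hypotheses tested, the BH thresholds are less stringent. The p-values and filter statistics below are invented, and the threshold of 5 is an arbitrary illustrative cutoff (SGSF's PC-based filter statistic is not reproduced here).

```python
# Independent filtering before multiple testing: fewer hypotheses enter
# the Benjamini-Hochberg procedure, so more true signals survive it.
def benjamini_hochberg(pvals, alpha=0.05):
    """Indices of hypotheses rejected by BH at FDR level alpha."""
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    m = len(pvals)
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])

pvals = [0.001, 0.012, 0.020, 0.40, 0.55, 0.62, 0.70, 0.81, 0.90, 0.95]
filter_stat = [9.0, 8.5, 7.9, 1.0, 0.8, 6.0, 0.5, 0.3, 0.2, 0.1]

# Unfiltered: all 10 hypotheses enter BH.
rejected_all = benjamini_hochberg(pvals)

# Filtered: keep only hypotheses whose filter statistic exceeds 5.
kept = [i for i, f in enumerate(filter_stat) if f > 5]
rejected_kept = [kept[j] for j in benjamini_hochberg([pvals[i] for i in kept])]
print(rejected_all, rejected_kept)
```

The type I error argument hinges on the filter being independent of the test statistic under the null; filtering on the p-values themselves would invalidate the FDR control.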
In almost all sediment toxicity assessments, the performance of organisms in control sediments is a key parameter in defining sediment toxicity, whether through direct statistical comparison to control or by normalizing to control performance to compare results across sites or batc...
USDA-ARS?s Scientific Manuscript database
Experimental and simulation uncertainties have not been included in many of the statistics used in assessing agricultural model performance. The objectives of this study were to develop an F-test that can be used to evaluate model performance considering experimental and simulation uncertainties, an...
Bayesian models based on test statistics for multiple hypothesis testing problems.
Ji, Yuan; Lu, Yiling; Mills, Gordon B
2008-04-01
We propose a Bayesian method for the problem of multiple hypothesis testing that is routinely encountered in bioinformatics research, such as the differential gene expression analysis. Our algorithm is based on modeling the distributions of test statistics under both null and alternative hypotheses. We substantially reduce the complexity of the process of defining posterior model probabilities by modeling the test statistics directly instead of modeling the full data. Computationally, we apply a Bayesian FDR approach to control the number of rejections of null hypotheses. To check if our model assumptions for the test statistics are valid for various bioinformatics experiments, we also propose a simple graphical model-assessment tool. Using extensive simulations, we demonstrate the performance of our models and the utility of the model-assessment tool. In the end, we apply the proposed methodology to an siRNA screening and a gene expression experiment.
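A minimal version of "modeling the test statistics directly" is a two-component mixture on z-scores: pi0·N(0,1) for nulls plus (1-pi0)·N(mu1,1) for alternatives. The posterior null probability of each z then yields a Bayesian FDR rule: reject the largest set whose average posterior null probability stays below the target. The mixture parameters below are fixed by assumption rather than estimated, and the z-scores are invented; the paper's models are richer.

```python
# Two-component mixture model on z statistics with a Bayesian FDR rule.
import math

def normal_pdf(x, mu=0.0, sd=1.0):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior_null(z, pi0=0.8, mu1=3.0):
    """Posterior probability that z came from the null component."""
    f0 = pi0 * normal_pdf(z)
    f1 = (1 - pi0) * normal_pdf(z, mu1)
    return f0 / (f0 + f1)

def bayesian_fdr_reject(zs, level=0.10, pi0=0.8, mu1=3.0):
    """Reject hypotheses while the running average posterior null
    probability (the Bayesian FDR of the rejection set) stays <= level."""
    probs = sorted((posterior_null(z, pi0, mu1), i) for i, z in enumerate(zs))
    rejected, running = [], 0.0
    for k, (p, i) in enumerate(probs, start=1):
        running += p
        if running / k <= level:
            rejected.append(i)
        else:
            break
    return sorted(rejected)

zs = [0.1, 3.2, -0.5, 2.8, 4.1, 0.7]
print(bayesian_fdr_reject(zs))
```

In practice pi0 and the alternative component would be estimated from the full collection of test statistics, which is where the paper's model-assessment tool becomes relevant.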
Statistical characterization of the fatigue behavior of composite lamina
NASA Technical Reports Server (NTRS)
Yang, J. N.; Jones, D. L.
1979-01-01
A theoretical model was developed to predict statistically the effects of constant and variable amplitude fatigue loadings on the residual strength and fatigue life of composite lamina. The parameters in the model were established from the results of a series of static tensile tests and a fatigue scan, and a number of verification tests were performed. Abstracts for two other papers on the effect of load sequence on the statistical fatigue of composites are also presented.
Cluster Detection Tests in Spatial Epidemiology: A Global Indicator for Performance Assessment
Guttmann, Aline; Li, Xinran; Feschet, Fabien; Gaudart, Jean; Demongeot, Jacques; Boire, Jean-Yves; Ouchchane, Lemlih
2015-01-01
In cluster detection of disease, the use of local cluster detection tests (CDTs) is common practice. These methods aim both at locating likely clusters and at testing for their statistical significance. New or improved CDTs are regularly proposed to epidemiologists and must be subjected to performance assessment. Because location accuracy has to be considered, performance assessment goes beyond the raw estimation of type I or II errors. As no consensus exists for performance evaluations, heterogeneous methods are used, and therefore studies are rarely comparable. A global indicator of performance, assessing both spatial accuracy and usual power, would facilitate the exploration of CDTs' behaviour and comparisons between studies. The Tanimoto coefficient (TC) is a well-known measure of similarity that can assess location accuracy, but only for one detected cluster. In a simulation study, performance is measured over many tests. From the TC, we propose two statistics, the averaged TC and the cumulated TC, as indicators able to provide a global overview of CDT performance for both usual power and location accuracy. We evidence the properties of these two indicators and the superiority of the cumulated TC for assessing performance. We tested these indicators by conducting a systematic spatial assessment displayed through performance maps. PMID:26086911
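The Tanimoto coefficient between a detected cluster D and the true cluster T, both viewed as sets of spatial units, is TC = |D ∩ T| / |D ∪ T|; averaging it over simulated datasets gives the averaged-TC indicator described above. The clusters below are illustrative.

```python
# Tanimoto (Jaccard) coefficient between detected and true clusters,
# averaged over several simulated datasets.
def tanimoto(detected, true):
    detected, true = set(detected), set(true)
    if not detected and not true:
        return 1.0          # both empty: perfect agreement by convention
    return len(detected & true) / len(detected | true)

true_cluster = {1, 2, 3, 4}
# One detection per simulated dataset: exact hit, partial overlap,
# no detection, and a false cluster elsewhere.
runs = [{1, 2, 3, 4}, {2, 3, 4, 5}, set(), {7, 8}]
tcs = [tanimoto(d, true_cluster) for d in runs]
print(tcs, sum(tcs) / len(tcs))    # averaged TC over the simulation
```

Note how the indicator simultaneously penalizes missed detections (TC = 0) and imprecise locations (0 < TC < 1), which is exactly what separates it from a raw power estimate.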
Díaz-González, Lorena; Quiroz-Ruiz, Alfredo
2014-01-01
Using highly precise and accurate Monte Carlo simulations of 20,000,000 replications and 102 independent simulation experiments with extremely low simulation errors and total uncertainties, we evaluated the performance of four single outlier discordancy tests (Grubbs test N2, Dixon test N8, skewness test N14, and kurtosis test N15) for normal samples of sizes 5 to 20. Statistical contaminations of a single observation resulting from parameters called δ from ±0.1 up to ±20 for modeling the slippage of central tendency or ε from ±1.1 up to ±200 for slippage of dispersion, as well as no contamination (δ = 0 and ε = ±1), were simulated. Because of the use of precise and accurate random and normally distributed simulated data, very large replications, and a large number of independent experiments, this paper presents a novel approach for precise and accurate estimations of power functions of four popular discordancy tests and, therefore, should not be considered as a simple simulation exercise unrelated to probability and statistics. From both criteria of the Power of Test proposed by Hayes and Kinsella and the Test Performance Criterion of Barnett and Lewis, Dixon test N8 performs less well than the other three tests. The overall performance of these four tests could be summarized as N2≅N15 > N14 > N8. PMID:24737992
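The Grubbs-type (N2) discordancy statistic for a single outlier is G = max|x_i - xbar| / s, compared with a tabulated critical value. The sample below is invented, and the critical value used (about 2.29 for n = 10, two-sided 5%) is quoted from standard tables as an assumption of this sketch.

```python
# Grubbs single-outlier discordancy statistic with a tabulated cutoff.
import statistics

def grubbs_statistic(x):
    """G = max absolute deviation from the mean, in units of the
    sample standard deviation (n - 1 denominator)."""
    xbar = statistics.fmean(x)
    s = statistics.stdev(x)
    return max(abs(v - xbar) for v in x) / s

sample = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.1, 9.9, 14.0]
g = grubbs_statistic(sample)
print(round(g, 2), g > 2.29)   # statistic and the discordancy decision
```

The power comparisons in the abstract amount to repeating this decision over millions of simulated contaminated samples and recording how often each test flags the discordant observation.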
Statistical modeling of software reliability
NASA Technical Reports Server (NTRS)
Miller, Douglas R.
1992-01-01
This working paper discusses the statistical simulation part of a controlled software development experiment being conducted under the direction of the System Validation Methods Branch, Information Systems Division, NASA Langley Research Center. The experiment uses guidance and control software (GCS) aboard a fictitious planetary landing spacecraft: real-time control software operating on a transient mission. Software execution is simulated to study the statistical aspects of reliability and other failure characteristics of the software during development, testing, and random usage. Quantification of software reliability is a major goal. Various reliability concepts are discussed. Experiments are described for performing simulations and collecting appropriate simulated software performance and failure data. This data is then used to make statistical inferences about the quality of the software development and verification processes as well as inferences about the reliability of software versions and reliability growth under random testing and debugging.
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics
Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.
2015-01-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564
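The idea of computing Bayes factors from marginal statistics alone has a well-known single-SNP special case, Wakefield's approximate Bayes factor, which needs only the z-score, the squared standard error V of the effect estimate, and a prior effect variance W: ABF (alternative vs null) = sqrt(V/(V+W)) · exp(z²W / (2(V+W))). The numbers below are illustrative; CAVIARBF generalizes this to multiple correlated variants.

```python
# Wakefield-style approximate Bayes factor from marginal statistics.
import math

def approximate_bayes_factor(z, V, W=0.04):
    """ABF for the alternative over the null, from the marginal z-score,
    the squared standard error V, and prior effect variance W."""
    r = W / (V + W)
    return math.sqrt(V / (V + W)) * math.exp(z * z * r / 2)

# Three SNPs with the same standard error but different z-scores:
for z in (1.0, 3.0, 5.0):
    print(round(approximate_bayes_factor(z, V=0.01), 2))
```

Evidence grows roughly exponentially in z², which is why fine-mapping posteriors concentrate so quickly on the strongest signals within a locus.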
An Algorithm to Improve Test Answer Copying Detection Using the Omega Statistic
ERIC Educational Resources Information Center
Maeda, Hotaka; Zhang, Bo
2017-01-01
The omega (ω) statistic is reputed to be one of the best indices for detecting answer copying on multiple choice tests, but its performance relies on the accurate estimation of copier ability, which is challenging because responses from the copiers may have been contaminated. We propose an algorithm that aims to identify and delete the suspected…
Student Achievement in Undergraduate Statistics: The Potential Value of Allowing Failure
ERIC Educational Resources Information Center
Ferrandino, Joseph A.
2016-01-01
This article details what resulted when I re-designed my undergraduate statistics course to allow failure as a learning strategy and focused on achievement rather than performance. A variety of within and between sample t-tests are utilized to determine the impact of unlimited test and quiz opportunities on student learning on both quizzes and…
An Empirical Investigation of Methods for Assessing Item Fit for Mixed Format Tests
ERIC Educational Resources Information Center
Chon, Kyong Hee; Lee, Won-Chan; Ansley, Timothy N.
2013-01-01
Empirical information regarding performance of model-fit procedures has been a persistent need in measurement practice. Statistical procedures for evaluating item fit were applied to real test examples that consist of both dichotomously and polytomously scored items. The item fit statistics used in this study included PARSCALE's G²,…
A Study of relationship between frailty and physical performance in elderly women.
Jeoung, Bog Ja; Lee, Yang Chool
2015-08-01
Frailty is a disorder of multiple inter-related physiological systems. It is unclear whether levels of physical performance factors can serve as markers and signs of frailty. The purpose of this study was to examine the relationship between frailty and physical performance in elderly women. One hundred fourteen elderly women aged 65 to 80 participated in this study. We measured the 6-min walk test, grip strength, 30-sec arm curl test, 30-sec chair stand test, 8-foot up-and-go, back scratch, chair sit-and-reach, unipedal stance, and BMI, and assessed frailty with a questionnaire. The collected data were analyzed by descriptive statistics, frequencies, correlation analysis, ANOVA, and simple linear regression using IBM SPSS 21. In the results, statistical tests showed significant associations between frailty and the 6-min walk test, 30-sec arm curl test, 30-sec chair stand test, grip strength, back scratch, and BMI. However, we did not find significant associations between frailty and the 8-foot up-and-go or unipedal stance. When the subjects were divided into five groups according to physical performance level, subjects with high 6-min walk, 30-sec arm curl, and chair sit-and-reach scores and high grip strength had low frailty scores. Physical performance factors were strongly associated with decreased frailty, suggesting that physical performance improvements play an important role in preventing or reducing frailty.
Testing for independence in J×K contingency tables with complex sample survey data.
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Sinha, Debajyoti; Hevelone, Nathanael; Giovannucci, Edward; Hu, Jim C
2015-09-01
The test of independence of row and column variables in a (J×K) contingency table is a widely used statistical test in many areas of application. For complex survey samples, use of the standard Pearson chi-squared test is inappropriate due to correlation among units within the same cluster. Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) proposed an approach in which the standard Pearson chi-squared statistic is multiplied by a design effect to adjust for the complex survey design. Unfortunately, this test fails to exist when one of the observed cell counts equals zero. Even with the large samples typical of many complex surveys, zero cell counts can occur for rare events, small domains, or contingency tables with a large number of cells. Here, we propose Wald and score test statistics for independence based on weighted least squares estimating equations. In contrast to the Rao-Scott test statistic, the proposed Wald and score test statistics always exist. In simulations, the score test is found to perform best with respect to type I error. The proposed method is motivated by, and applied to, post surgical complications data from the United States' Nationwide Inpatient Sample (NIS) complex survey of hospitals in 2008. © 2015, The International Biometric Society.
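The first-order Rao-Scott correction can be sketched in a few lines: the ordinary Pearson chi-squared statistic is divided by an average design effect before being referred to the usual chi-squared distribution. The table and the design effect of 1.8 below are illustrative, and the sketch also shows why a zero cell is fatal: a zero row or column total makes the expected counts degenerate.

```python
# Pearson chi-squared for a J x K table with a first-order
# Rao-Scott design-effect correction.
def pearson_chi2(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    n = sum(rows)
    return sum(
        (table[i][j] - rows[i] * cols[j] / n) ** 2 / (rows[i] * cols[j] / n)
        for i in range(len(rows)) for j in range(len(cols))
    )

def rao_scott_first_order(table, mean_design_effect):
    """Divide the iid Pearson statistic by the average design effect
    to account for within-cluster correlation in the survey design."""
    return pearson_chi2(table) / mean_design_effect

table = [[30, 20], [10, 40]]
print(round(pearson_chi2(table), 3), round(rao_scott_first_order(table, 1.8), 3))
```

The Wald and score statistics proposed in the abstract avoid this construction entirely, which is why they remain defined when a cell count is zero.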
Kopp-Schneider, Annette; Prieto, Pilar; Kinsner-Ovaskainen, Agnieszka; Stanzel, Sven
2013-06-01
In the framework of toxicology, a testing strategy can be viewed as a series of steps which are taken to come to a final prediction about a characteristic of a compound under study. The testing strategy is performed as a single-step procedure, usually called a test battery, using all information collected on different endpoints simultaneously, or as a tiered approach in which a decision tree is followed. Design of a testing strategy involves statistical considerations, such as the development of a statistical prediction model. During the EU FP6 ACuteTox project, several prediction models were proposed on the basis of statistical classification algorithms, which we illustrate here. The final choice of testing strategies was not based on statistical considerations alone. However, without thorough statistical evaluations a testing strategy cannot be identified. We present here a number of observations made from the statistical viewpoint which relate to the development of testing strategies. The points we make were derived from problems we had to deal with during the evaluation of this large research project. A central issue during the development of a prediction model is the danger of overfitting. Procedures are presented to deal with this challenge. Copyright © 2012 Elsevier Ltd. All rights reserved.
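The overfitting danger named above is exposed by evaluating a prediction model on held-out data rather than on the data used to build it. A minimal illustration, with invented one-dimensional data: a 1-nearest-neighbour classifier scores perfectly on its own training set, while leave-one-out cross-validation gives the honest estimate.

```python
# Resubstitution accuracy versus leave-one-out cross-validation for a
# 1-nearest-neighbour classifier on a single endpoint.
def nn_predict(train, x, exclude=None):
    """Label of the nearest training point, optionally excluding index
    `exclude` (used for leave-one-out evaluation)."""
    best = min((i for i in range(len(train)) if i != exclude),
               key=lambda i: abs(train[i][0] - x))
    return train[best][1]

# (endpoint value, class label) pairs, invented for illustration.
train = [(0.1, 0), (0.2, 0), (0.35, 0), (0.4, 1), (0.6, 1), (0.33, 1), (0.8, 1)]

resub = sum(nn_predict(train, x) == y for x, y in train) / len(train)
loo = sum(nn_predict(train, x, exclude=i) == y
          for i, (x, y) in enumerate(train)) / len(train)
print(resub, loo)   # resubstitution accuracy is optimistic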
Statistical Power in Meta-Analysis
ERIC Educational Resources Information Center
Liu, Jin
2015-01-01
Statistical power is important in a meta-analysis study, although few studies have examined the performance of simulated power in meta-analysis. The purpose of this study is to inform researchers about statistical power estimation on two sample mean difference test under different situations: (1) the discrepancy between the analytical power and…
Testing the Self-Efficacy-Performance Linkage of Social-Cognitive Theory.
ERIC Educational Resources Information Center
Harrison, Allison W.; Rainer, R. Kelly, Jr.; Hochwarter, Wayne A.; Thompson, Kenneth R.
1997-01-01
Briefly reviews Albert Bandura's Self-Efficacy Performance Model (ability to perform a task is influenced by an individual's belief in their capability). Tests this model with a sample of 776 university employees and computer-related knowledge and skills. Results supported Bandura's thesis. Includes statistical tables and a discussion of related…
Robustness of S1 statistic with Hodges-Lehmann for skewed distributions
NASA Astrophysics Data System (ADS)
Ahad, Nor Aishah; Yahaya, Sharipah Soaad Syed; Yin, Lee Ping
2016-10-01
Analysis of variance (ANOVA) is a commonly used parametric method to test for differences in means across more than two groups when the populations are normally distributed. ANOVA is highly inefficient under non-normal and heteroscedastic settings. When the assumptions are violated, researchers look for alternatives such as the nonparametric Kruskal-Wallis test or robust methods. This study focused on a flexible method, the S1 statistic, for comparing groups using the median as the location estimator. The S1 statistic was modified by substituting the median with the Hodges-Lehmann estimator, and the default scale estimator with the variance of the Hodges-Lehmann estimator or MADn, producing two different test statistics for comparing groups. The bootstrap method was used for testing the hypotheses, since the sampling distributions of these modified S1 statistics are unknown. The performance of the proposed statistics in terms of Type I error was measured and compared against the original S1 statistic, ANOVA, and Kruskal-Wallis. The proposed procedures show improvement over the original statistic, especially under extremely skewed distributions.
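The Hodges-Lehmann estimator that replaces the median above is the median of all pairwise (Walsh) averages. A direct computation, with an illustrative heavily skewed sample:

```python
# Hodges-Lehmann location estimator: median of all Walsh averages
# (x_i + x_j) / 2 over pairs i <= j.
import statistics

def hodges_lehmann(x):
    walsh = [(x[i] + x[j]) / 2 for i in range(len(x)) for j in range(i, len(x))]
    return statistics.median(walsh)

sample = [1, 2, 3, 4, 100]       # heavy right skew
print(statistics.fmean(sample), statistics.median(sample), hodges_lehmann(sample))
```

Unlike the mean, the estimator is barely moved by the extreme value, while it uses more of the sample's information than the plain median, which is the motivation for substituting it into the S1 statistic.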
Test anxiety and academic performance in chiropractic students.
Zhang, Niu; Henderson, Charles N R
2014-01-01
Objective: We assessed the level of students' test anxiety, and the relationship between test anxiety and academic performance. Methods: We recruited 166 third-quarter students. The Test Anxiety Inventory (TAI) was administered to all participants. Total scores from written examinations and objective structured clinical examinations (OSCEs) were used as response variables. Results: Multiple regression analysis shows that there was a modest, but statistically significant negative correlation between TAI scores and written exam scores, but not OSCE scores. Worry and emotionality were the best predictive models for written exam scores. Mean total anxiety and emotionality scores for females were significantly higher than those for males, but not worry scores. Conclusion: Moderate-to-high test anxiety was observed in 85% of the chiropractic students examined. However, total test anxiety, as measured by the TAI score, was a very weak predictive model for written exam performance. Multiple regression analysis demonstrated that replacing total anxiety (TAI) with worry and emotionality (TAI subscales) produces a much more effective predictive model of written exam performance. Sex, age, highest current academic degree, and ethnicity contributed little additional predictive power in either regression model. Moreover, TAI scores were not found to be statistically significant predictors of physical exam skill performance, as measured by OSCEs.
Development of modelling algorithm of technological systems by statistical tests
NASA Astrophysics Data System (ADS)
Shemshura, E. A.; Otrokov, A. V.; Chernyh, V. G.
2018-03-01
The paper tackles the problem of economic assessment of design efficiency for various technological systems at the stage of their operation. The modelling algorithm for a technological system, built from statistical tests and incorporating a reliability index, allows estimating the level of technical excellence of the machinery and assessing the efficiency of design reliability against its performance. The economic feasibility of its application is to be determined from the service quality of the technological system, with further forecasting of the volumes and range of spare-parts supply.
The Skillings-Mack test (Friedman test when there are missing data).
Chatfield, Mark; Mander, Adrian
2009-04-01
The Skillings-Mack statistic (Skillings and Mack, 1981, Technometrics 23: 171-177) is a general Friedman-type statistic that can be used in almost any block design with an arbitrary missing-data structure. The missing data can be either missing by design, for example, an incomplete block design, or missing completely at random. The Skillings-Mack test is equivalent to the Friedman test when there are no missing data in a balanced complete block design, and the Skillings-Mack test is equivalent to the test suggested in Durbin (1951, British Journal of Psychology, Statistical Section 4: 85-90) for a balanced incomplete block design. The Friedman test was implemented in Stata by Goldstein (1991, Stata Technical Bulletin 3: 26-27) and further developed in Goldstein (2005, Stata Journal 5: 285). This article introduces the skilmack command, which performs the Skillings-Mack test. The skilmack command is also useful when there are many ties or equal ranks (N.B. the Friedman statistic compared with the χ² distribution will give a conservative result), as well as for small samples; appropriate results can be obtained by simulating the distribution of the test statistic under the null hypothesis.
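The skilmack command is a Stata implementation; SciPy does not ship a Skillings-Mack test, but its complete-block special case, the Friedman test, is available. A minimal sketch with invented block data (rows are blocks, columns are treatments, no missing values):

```python
from scipy.stats import friedmanchisquare

# Five blocks (e.g., subjects), three treatments; values are made up.
t1 = [7.0, 9.9, 8.5, 5.1, 10.3]
t2 = [5.3, 5.7, 4.7, 3.5, 7.7]
t3 = [4.9, 7.6, 5.5, 2.8, 8.4]

# Friedman test: ranks treatments within each block, then tests whether
# the treatment rank sums differ more than chance would allow.
stat, p = friedmanchisquare(t1, t2, t3)
```

For these values the rank sums are 15, 7, and 8, giving the statistic 12/(5·3·4)·(15² + 7² + 8²) − 3·5·4 = 7.6 on 2 degrees of freedom.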
1988-06-01
and PCBs. The pilot program involved screening, testing, and repairing of EMs/PCBs for both COMNAVSEASYSCOM and Commander, Naval Electronic Systems... were chosen from the Support and Test Equipment Engineering Program (STEEP) tests performed by SIMA San Diego during 1987. A statistical analysis and a
Diestelkamp, Wiebke S; Krane, Carissa M; Pinnell, Margaret F
2011-05-20
Energy-based surgical scalpels are designed to efficiently transect and seal blood vessels using thermal energy to promote protein denaturation and coagulation. Assessment and design improvement of ultrasonic scalpel performance relies on both in vivo and ex vivo testing. The objective of this work was to design and implement a robust, experimental test matrix with randomization restrictions and predictive statistical power, which allowed for identification of those experimental variables that may affect the quality of the seal obtained ex vivo. The design of the experiment included three factors: temperature (two levels); the type of solution used to perfuse the artery during transection (three types); and artery type (two types) resulting in a total of twelve possible treatment combinations. Burst pressures of porcine carotid and renal arteries sealed ex vivo were assigned as the response variable. The experimental test matrix was designed and carried out as a split-plot experiment in order to assess the contributions of several variables and their interactions while accounting for randomization restrictions present in the experimental setup. The statistical software package SAS was utilized and PROC MIXED was used to account for the randomization restrictions in the split-plot design. The combination of temperature, solution, and vessel type had a statistically significant impact on seal quality. The design and implementation of a split-plot experimental test-matrix provided a mechanism for addressing the existing technical randomization restrictions of ex vivo ultrasonic scalpel performance testing, while preserving the ability to examine the potential effects of independent factors or variables. This method for generating the experimental design and the statistical analyses of the resulting data are adaptable to a wide variety of experimental problems involving large-scale tissue-based studies of medical or experimental device efficacy and performance.
Haghani, Fariba; Hatef Khorami, Mohammad; Fakhari, Mohammad
2016-07-01
Feedback cards are recommended as a feasible tool for structured written feedback delivery in clinical education, while the effectiveness of this tool on medical students' performance is still questionable. The purpose of this study was to compare the effects of structured written feedback by cards plus verbal feedback versus verbal feedback alone on the clinical performance of medical students at the Mini Clinical Evaluation Exercise (Mini-CEX) test in an outpatient clinic. This is a quasi-experimental study with pre- and post-test comprising four groups in two terms of medical students' externship. The students' performance was assessed through the Mini-CEX as a clinical performance evaluation tool. Structured written feedback was given to two experimental groups by designed feedback cards as well as verbal feedback, while in the two control groups feedback was delivered verbally as the routine approach in clinical education. By consecutive sampling, 62 externship students were enrolled in this study, and seven students were excluded from the final analysis due to their absence for three days. According to the ANOVA analysis and post hoc Tukey test, no statistically significant difference was observed among the four groups at the pre-test, whereas a statistically significant difference was observed between the experimental and control groups at the post-test (F = 4.023, p = 0.012). The effect size of the structured written feedback on clinical performance was 0.19. Structured written feedback by cards could improve the performance of medical students in a statistical sense. Further studies must be conducted in other clinical courses with longer durations.
ERIC Educational Resources Information Center
Tsui, Joanne M.; Mazzocco, Michele M. M.
2006-01-01
This study was designed to examine the effects of math anxiety and perfectionism on math performance, under timed testing conditions, among mathematically gifted sixth graders. We found that participants had worse math performance during timed versus untimed testing, but this difference was statistically significant only when the timed condition…
ERIC Educational Resources Information Center
Biermann, Carol
1988-01-01
Described is a study designed to introduce students to the behavior of common invertebrate animals and to the use of the chi-square statistical technique. Discusses activities with snails, pill bugs, and mealworms. Provides an abbreviated chi-square table and instructions for performing the experiments and statistical tests. (CW)
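A sketch of the chi-square goodness-of-fit computation such a classroom activity calls for (the counts are invented, not from the article):

```python
from scipy.stats import chisquare

# Observed counts of pill bugs choosing the damp vs dry side of a chamber.
observed = [18, 6]

# Null hypothesis of no preference: both sides equally likely.
expected = [12, 12]

stat, p = chisquare(observed, f_exp=expected)
```

Here the statistic is (18−12)²/12 + (6−12)²/12 = 6.0 on 1 degree of freedom, so the no-preference hypothesis would be rejected at the 5% level.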
Lee, Geunho; Lee, Hyun Beom; Jung, Byung Hwa; Nam, Hojung
2017-07-01
Mass spectrometry (MS) data are used to analyze biological phenomena based on chemical species. However, these data often contain unexpected duplicate records and missing values due to technical or biological factors. These 'dirty data' problems increase the difficulty of performing MS analyses because they lead to performance degradation when statistical or machine-learning tests are applied to the data. Thus, we have developed missing values preprocessor (mvp), an open-source software for preprocessing data that might include duplicate records and missing values. mvp uses the property of MS data in which identical chemical species present the same or similar values for key identifiers, such as the mass-to-charge ratio and intensity signal, and forms cliques via graph theory to process dirty data. We evaluated the validity of the mvp process via quantitative and qualitative analyses and compared the results from a statistical test that analyzed the original and mvp-applied data. This analysis showed that using mvp reduces problems associated with duplicate records and missing values. We also examined the effects of using unprocessed data in statistical tests and examined the improved statistical test results obtained with data preprocessed using mvp.
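The mvp software forms cliques via graph theory to group records of the same chemical species; as a loose, hypothetical stand-in (single-linkage on one key identifier, not mvp's actual algorithm), near-duplicate records whose mass-to-charge values agree within a tolerance can be merged like this:

```python
def merge_duplicates(records, mz_tol=0.01):
    """Merge (mz, intensity) records whose m/z values agree within mz_tol.

    Single-linkage sketch: sort by m/z, chain adjacent records within the
    tolerance into one group, then average each group.
    """
    records = sorted(records)            # sort by m/z
    groups, current = [], [records[0]]
    for mz, intensity in records[1:]:
        if mz - current[-1][0] <= mz_tol:
            current.append((mz, intensity))
        else:
            groups.append(current)
            current = [(mz, intensity)]
    groups.append(current)
    return [(sum(m for m, _ in g) / len(g),
             sum(i for _, i in g) / len(g)) for g in groups]

peaks = [(100.000, 10.0), (100.005, 20.0), (200.000, 5.0)]
merged = merge_duplicates(peaks)         # first two records collapse into one
```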
Predicting driving performance in older adults: we are not there yet!
Bédard, Michel; Weaver, Bruce; Darzins, Peteris; Porter, Michelle M
2008-08-01
We set up this study to determine the predictive value of approaches for which a statistical association with driving performance has been documented. We determined the statistical association (magnitude of association and probability of occurrence by chance alone) between four different predictors (the Mini-Mental State Examination, Trails A test, Useful Field of View [UFOV], and a composite measure of past driving incidents) and driving performance. We then explored the predictive value of these measures with receiver operating characteristic (ROC) curves and various cutoff values. We identified associations between the predictors and driving performance well beyond the play of chance (p < .01). Nonetheless, the predictors had limited predictive value with areas under the curve ranging from .51 to .82. Statistical associations are not sufficient to infer adequate predictive value, especially when crucial decisions such as whether one can continue driving are at stake. The predictors we examined have limited predictive value if used as stand-alone screening tests.
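The areas under the ROC curve reported above can be computed directly from the rank relation between cases and non-cases (the Mann-Whitney formulation); a minimal sketch with hypothetical names:

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney relation:
    AUC = P(score of a random positive > score of a random negative),
    with ties counted as 1/2."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=bool)
    pos, neg = s[y], s[~y]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Perfect separation gives AUC = 1.0; reversed ordering gives 0.0;
# an AUC near 0.5 means the predictor is no better than chance.
perfect = auc([3, 4, 1, 2], [1, 1, 0, 0])
reversed_ = auc([1, 2, 3, 4], [1, 1, 0, 0])
```

An AUC of .51, as at the low end of the range above, is essentially chance-level discrimination, which is why statistical association alone does not guarantee useful prediction.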
Multiple Phenotype Association Tests Using Summary Statistics in Genome-Wide Association Studies
Liu, Zhonghua; Lin, Xihong
2017-01-01
We study in this paper jointly testing the associations of a genetic variant with correlated multiple phenotypes using the summary statistics of individual phenotype analysis from Genome-Wide Association Studies (GWASs). We estimated the between-phenotype correlation matrix using the summary statistics of individual phenotype GWAS analyses, and developed genetic association tests for multiple phenotypes by accounting for between-phenotype correlation without the need to access individual-level data. Since genetic variants often affect multiple phenotypes differently across the genome and the between-phenotype correlation can be arbitrary, we proposed robust and powerful multiple phenotype testing procedures by jointly testing a common mean and a variance component in linear mixed models for summary statistics. We computed the p-values of the proposed tests analytically. This computational advantage makes our methods practically appealing in large-scale GWASs. We performed simulation studies to show that the proposed tests maintained correct type I error rates, and to compare their powers in various settings with the existing methods. We applied the proposed tests to a GWAS Global Lipids Genetics Consortium summary statistics data set and identified additional genetic variants that were missed by the original single-trait analysis.
Huber, Stefan; Klein, Elise; Moeller, Korbinian; Willmes, Klaus
2015-10-01
In neuropsychological research, single-cases are often compared with a small control sample. Crawford and colleagues developed inferential methods (i.e., the modified t-test) for such a research design. In the present article, we suggest an extension of the methods of Crawford and colleagues employing linear mixed models (LMM). We first show that a t-test for the significance of a dummy coded predictor variable in a linear regression is equivalent to the modified t-test of Crawford and colleagues. As an extension to this idea, we then generalized the modified t-test to repeated measures data by using LMMs to compare the performance difference in two conditions observed in a single participant to that of a small control group. The performance of LMMs regarding Type I error rates and statistical power were tested based on Monte-Carlo simulations. We found that starting with about 15-20 participants in the control sample Type I error rates were close to the nominal Type I error rate using the Satterthwaite approximation for the degrees of freedom. Moreover, statistical power was acceptable. Therefore, we conclude that LMMs can be applied successfully to statistically evaluate performance differences between a single-case and a control sample. Copyright © 2015 Elsevier Ltd. All rights reserved.
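The modified t-test of Crawford and colleagues referred to above compares a single case to a small control sample by inflating the standard error for the extra uncertainty of one new observation. A sketch of that standard formula (the control scores below are invented):

```python
import numpy as np
from scipy import stats

def crawford_howell_t(case, controls):
    """Modified t-test comparing one case to a small control sample.

    t = (case - control mean) / (control SD * sqrt((n + 1) / n)),
    referred to Student's t with n - 1 degrees of freedom (two-tailed).
    """
    c = np.asarray(controls, dtype=float)
    n = len(c)
    t = (case - c.mean()) / (c.std(ddof=1) * np.sqrt((n + 1) / n))
    p = 2 * stats.t.sf(abs(t), df=n - 1)
    return t, p

controls = [100, 103, 97, 101, 99, 102, 98, 100, 104, 96]  # n = 10
t, p = crawford_howell_t(80, controls)   # case far below the controls
```

As the article notes, the same test can be obtained as the t-test on a dummy-coded case indicator in a linear regression, which is what motivates the LMM extension to repeated measures.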
ERIC Educational Resources Information Center
Dahabreh, Issa J.; Chung, Mei; Kitsios, Georgios D.; Terasawa, Teruhiko; Raman, Gowri; Tatsioni, Athina; Tobar, Annette; Lau, Joseph; Trikalinos, Thomas A.; Schmid, Christopher H.
2013-01-01
We performed a survey of meta-analyses of test performance to describe the evolution in their methods and reporting. Studies were identified through MEDLINE (1966-2009), reference lists, and relevant reviews. We extracted information on clinical topics, literature review methods, quality assessment, and statistical analyses. We reviewed 760…
The Performance of Methods to Test Upper-Level Mediation in the Presence of Nonnormal Data
ERIC Educational Resources Information Center
Pituch, Keenan A.; Stapleton, Laura M.
2008-01-01
A Monte Carlo study compared the statistical performance of standard and robust multilevel mediation analysis methods to test indirect effects for a cluster randomized experimental design under various departures from normality. The performance of these methods was examined for an upper-level mediation process, where the indirect effect is a fixed…
Designing Intervention Studies: Selected Populations, Range Restrictions, and Statistical Power
ERIC Educational Resources Information Center
Miciak, Jeremy; Taylor, W. Pat; Stuebing, Karla K.; Fletcher, Jack M.; Vaughn, Sharon
2016-01-01
An appropriate estimate of statistical power is critical for the design of intervention studies. Although the inclusion of a pretest covariate in the test of the primary outcome can increase statistical power, samples selected on the basis of pretest performance may demonstrate range restriction on the selection measure and other correlated…
Loring, David W; Larrabee, Glenn J
2006-06-01
The Halstead-Reitan Battery has been instrumental in the development of neuropsychological practice in the United States. Although Reitan administered both the Wechsler-Bellevue Intelligence Scale and Halstead's test battery when evaluating Halstead's theory of biologic intelligence, the relative sensitivity of each test battery to brain damage continues to be an area of controversy. Because Reitan did not perform direct parametric analysis to contrast group performances, we reanalyze Reitan's original validation data from both Halstead (Reitan, 1955) and Wechsler batteries (Reitan, 1959a) and calculate effect sizes and probability levels using traditional parametric approaches. Eight of the 10 tests comprising Halstead's original Impairment Index, as well as the Impairment Index itself, statistically differentiated patients with unequivocal brain damage from controls. In addition, 13 of 14 Wechsler measures including Full-Scale IQ also differed statistically between groups (Brain Damage Full-Scale IQ = 96.2; Control Group Full Scale IQ = 112.6). We suggest that differences in the statistical properties of each battery (e.g., raw scores vs. standardized scores) likely contribute to classification characteristics including test sensitivity and specificity.
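A sketch of the kind of parametric contrast described, a pooled-SD Cohen's d alongside an independent-samples t-test (the IQ values below are invented for illustration, not Reitan's data):

```python
import numpy as np
from scipy import stats

def cohens_d(a, b):
    """Cohen's d effect size with a pooled standard deviation."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_sd = np.sqrt(((na - 1) * a.var(ddof=1) +
                         (nb - 1) * b.var(ddof=1)) / (na + nb - 2))
    return (a.mean() - b.mean()) / pooled_sd

controls     = [110, 115, 108, 120, 112, 113]   # hypothetical Full-Scale IQs
brain_damage = [90, 85, 100, 95, 88, 92]

d = cohens_d(controls, brain_damage)            # standardized group difference
t, p = stats.ttest_ind(controls, brain_damage)  # traditional parametric test
```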
A nonparametric spatial scan statistic for continuous data.
Jung, Inkyung; Cho, Ho Jin
2015-10-20
Spatial scan statistics are widely used for spatial cluster detection, and several parametric models exist. For continuous data, a normal-based scan statistic can be used. However, the performance of the model has not been fully evaluated for non-normal data. We propose a nonparametric spatial scan statistic based on the Wilcoxon rank-sum test statistic and compare its performance with parametric models via a simulation study under various scenarios. The nonparametric method outperforms the normal-based scan statistic in terms of power and accuracy in almost all cases considered in the simulation study. The proposed nonparametric spatial scan statistic is therefore an excellent alternative to the normal model for continuous data and is especially useful for data following skewed or heavy-tailed distributions.
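The window-level comparison underlying such a statistic is a Wilcoxon rank-sum test of values inside versus outside a candidate window; the full scan statistic maximizes over many windows and calibrates by Monte Carlo, which is not shown here. A minimal sketch with simulated skewed data:

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(1)

# Values at locations inside a candidate circular window vs outside it;
# lognormal draws mimic the skewed data the method targets.
inside = rng.lognormal(mean=1.0, sigma=0.5, size=25)    # elevated cluster
outside = rng.lognormal(mean=0.0, sigma=0.5, size=100)  # background

# Rank-sum test: distribution-free comparison of the two sets of values.
stat, p = ranksums(inside, outside)
```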
McMullan, Miriam; Jones, Ray; Lea, Susan
2010-04-01
This paper is a report of a correlational study of the relations of age, status, experience and drug calculation ability to numerical ability of nursing students and Registered Nurses. Competent numerical and drug calculation skills are essential for nurses as mistakes can put patients' lives at risk. A cross-sectional study was carried out in 2006 in one United Kingdom university. Validated numerical and drug calculation tests were given to 229 second year nursing students and 44 Registered Nurses attending a non-medical prescribing programme. The numeracy test was failed by 55% of students and 45% of Registered Nurses, while 92% of students and 89% of nurses failed the drug calculation test. Independent of status or experience, older participants (> or = 35 years) were statistically significantly more able to perform numerical calculations. There was no statistically significant difference between nursing students and Registered Nurses in their overall drug calculation ability, but nurses were statistically significantly more able than students to perform basic numerical calculations and calculations for solids, oral liquids and injections. Both nursing students and Registered Nurses were statistically significantly more able to perform calculations for solids, liquid oral and injections than calculations for drug percentages, drip and infusion rates. To prevent deskilling, Registered Nurses should continue to practise and refresh all the different types of drug calculations as often as possible with regular (self)-testing of their ability. Time should be set aside in curricula for nursing students to learn how to perform basic numerical and drug calculations. This learning should be reinforced through regular practice and assessment.
Does sensitivity measured from screening test-sets predict clinical performance?
NASA Astrophysics Data System (ADS)
Soh, BaoLin P.; Lee, Warwick B.; Mello-Thoms, Claudia R.; Tapia, Kriscia A.; Ryan, John; Hung, Wai Tak; Thompson, Graham J.; Heard, Rob; Brennan, Patrick C.
2014-03-01
Aim: To examine the relationship between sensitivity measured from the BREAST test-set and clinical performance. Background: Although the UK and Australian national breast screening programs have regarded the PERFORMS and BREAST test-set strategies as possible methods of estimating readers' clinical efficacy, the relationship between test-set and real-life performance results has never been satisfactorily understood. Methods: Forty-one radiologists from BreastScreen New South Wales participated in this study. Each reader interpreted a BREAST test-set comprising sixty de-identified mammographic examinations sourced from the BreastScreen Digital Imaging Library. Spearman's rank correlation coefficient was used to compare the sensitivity measured from the BREAST test-set with screen readers' clinical audit data. Results: Statistically significant moderate positive correlations were found between test-set sensitivity and each of the following metrics: rate of invasive cancer per 10 000 reads (r=0.495; p < 0.01); rate of small invasive cancer per 10 000 reads (r=0.546; p < 0.001); detection rate of all invasive cancers and DCIS per 10 000 reads (r=0.444; p < 0.01). Conclusion: The comparison between sensitivity measured from the BREAST test-set and real-life detection rates demonstrated statistically significant moderate positive correlations, validating that such test-set strategies can reflect readers' clinical performance and be used as a quality assurance tool. The strength of correlation demonstrated in this study was higher than previously found by others.
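The Spearman rank correlations reported above can be reproduced in miniature; the reader-level values below are hypothetical, not the study's data:

```python
from scipy.stats import spearmanr

# Hypothetical per-reader data: test-set sensitivity vs clinical audit metric.
sensitivity = [0.60, 0.72, 0.55, 0.80, 0.65, 0.70, 0.58, 0.75]
detection_rate = [48, 60, 45, 66, 55, 52, 50, 63]  # cancers per 10 000 reads

# Spearman's rho: Pearson correlation of the rank-transformed values,
# so it captures any monotone (not just linear) association.
rho, p = spearmanr(sensitivity, detection_rate)
```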
Yang, Yang; DeGruttola, Victor
2016-01-01
Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients.
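The Bartlett statistic discussed above generalizes the classical univariate Bartlett test of equal variances across groups; the multivariate, robust-resampling version of the paper is not reproduced here, but the univariate null being tested can be illustrated with simulated data:

```python
import numpy as np
from scipy.stats import bartlett

rng = np.random.default_rng(2)

# Three groups; the third has an inflated variance, violating homogeneity.
g1 = rng.normal(0, 1.0, 40)
g2 = rng.normal(0, 1.0, 40)
g3 = rng.normal(0, 3.0, 40)

# Classical Bartlett test of equal variances (known to be sensitive to
# non-normality, which motivates the robust resampling approach above).
stat, p = bartlett(g1, g2, g3)
```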
Performance of digital RGB reflectance color extraction for plaque lesion
NASA Astrophysics Data System (ADS)
Hashim, Hadzli; Taib, Mohd Nasir; Jailani, Rozita; Sulaiman, Saadiah; Baba, Roshidah
2005-01-01
Several clinical psoriasis lesion groups were studied for digital RGB color feature extraction. Previous work used a sample size that included all outliers lying beyond standard-deviation factors from the peak histograms. This paper describes the statistical performance of the RGB model with and without these outliers removed. Plaque lesions are compared with other types of psoriasis. The statistical tests are compared across three sample sizes: the original 90 samples, a first reduction removing outliers beyond two standard deviations (2SD), and a second reduction removing outliers beyond one standard deviation (1SD). Quantification of image data through the normal/direct and differential variants of the conventional reflectance method is considered. Performance is assessed from error plots with 95% confidence intervals and from the inference T-tests applied. The statistical test outcomes show that the B component of the conventional differential method can distinctively classify plaque from the other psoriasis groups, consistent with the error-plot findings, with an improvement in p-value greater than 0.5.
ENHANCING TEST SENSITIVITY IN TOXICITY TESTING BY USING A STATISTICAL PERFORMANCE STANDARD
Previous reports have shown that within-test sensitivity can vary markedly among laboratories. Experts have advocated an empirical approach to controlling test variability based on the MSD, control means, and other test acceptability criteria. (The MSD represents the smallest dif...
Performance of Reclassification Statistics in Comparing Risk Prediction Models
Paynter, Nina P.
2012-01-01
Concerns have been raised about the use of traditional measures of model fit in evaluating risk prediction models for clinical use, and reclassification tables have been suggested as an alternative means of assessing the clinical utility of a model. Several measures based on the table have been proposed, including the reclassification calibration (RC) statistic, the net reclassification improvement (NRI), and the integrated discrimination improvement (IDI), but the performance of these in practical settings has not been fully examined. We used simulations to estimate the type I error and power for these statistics in a number of scenarios, as well as the impact of the number and type of categories, when adding a new marker to an established or reference model. The type I error was found to be reasonable in most settings, and power was highest for the IDI, which was similar to the test of association. The relative power of the RC statistic, a test of calibration, and the NRI, a test of discrimination, varied depending on the model assumptions. These tools provide unique but complementary information.
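The categorical NRI mentioned above has a simple closed form: the net proportion of events reclassified upward plus the net proportion of non-events reclassified downward. A sketch with a toy reclassification table (all values invented):

```python
import numpy as np

def net_reclassification_improvement(old_cat, new_cat, event):
    """Categorical NRI when moving from an old to a new risk model.

    old_cat/new_cat: integer risk category per subject (higher = riskier).
    event: 1 if the subject experienced the outcome, else 0.
    NRI = [P(up|event) - P(down|event)] + [P(down|nonevent) - P(up|nonevent)].
    """
    old_cat, new_cat = np.asarray(old_cat), np.asarray(new_cat)
    event = np.asarray(event, dtype=bool)
    up, down = new_cat > old_cat, new_cat < old_cat
    nri_events = up[event].mean() - down[event].mean()
    nri_nonevents = down[~event].mean() - up[~event].mean()
    return nri_events + nri_nonevents

# Toy example: 4 events, 4 non-events, risk categories 0 (low) to 2 (high).
old = [0, 1, 1, 2, 0, 1, 1, 2]
new = [1, 2, 1, 2, 0, 0, 1, 1]
ev  = [1, 1, 1, 1, 0, 0, 0, 0]
nri = net_reclassification_improvement(old, new, ev)
```

Here half the events move up and half the non-events move down, so the NRI is 0.5 + 0.5 = 1.0.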
Gene Level Meta-Analysis of Quantitative Traits by Functional Linear Models.
Fan, Ruzong; Wang, Yifan; Boehnke, Michael; Chen, Wei; Li, Yun; Ren, Haobo; Lobach, Iryna; Xiong, Momiao
2015-08-01
Meta-analysis of genetic data must account for differences among studies including study designs, markers genotyped, and covariates. The effects of genetic variants may differ from population to population, i.e., heterogeneity. Thus, meta-analysis of combining data of multiple studies is difficult. Novel statistical methods for meta-analysis are needed. In this article, functional linear models are developed for meta-analyses that connect genetic data to quantitative traits, adjusting for covariates. The models can be used to analyze rare variants, common variants, or a combination of the two. Both likelihood-ratio test (LRT) and F-distributed statistics are introduced to test association between quantitative traits and multiple variants in one genetic region. Extensive simulations are performed to evaluate empirical type I error rates and power performance of the proposed tests. The proposed LRT and F-distributed statistics control the type I error very well and have higher power than the existing methods of the meta-analysis sequence kernel association test (MetaSKAT). We analyze four blood lipid levels in data from a meta-analysis of eight European studies. The proposed methods detect more significant associations than MetaSKAT and the P-values of the proposed LRT and F-distributed statistics are usually much smaller than those of MetaSKAT. The functional linear models and related test statistics can be useful in whole-genome and whole-exome association studies. Copyright © 2015 by the Genetics Society of America.
Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.
Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J
2015-07-01
Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.
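CAVIARBF's use of marginal test statistics echoes simpler single-variant approximate Bayes factors computed from summary data. As a loose illustration of that idea (this is Wakefield's single-SNP approximate Bayes factor, a much simpler relative of CAVIARBF, and prior_sd is an assumed prior scale, not a CAVIARBF default):

```python
import math

def wakefield_abf(beta, se, prior_sd=0.2):
    """Wakefield's approximate Bayes factor for one variant, computed
    from marginal summary statistics (effect estimate and its SE).

    Returns the Bayes factor in favour of association over the null,
    assuming a normal prior N(0, prior_sd**2) on the true effect.
    """
    v, w = se ** 2, prior_sd ** 2
    z2 = (beta / se) ** 2
    return math.sqrt(v / (v + w)) * math.exp((z2 / 2) * w / (v + w))

strong = wakefield_abf(0.5, 0.1)   # |z| = 5: strong evidence for association
null   = wakefield_abf(0.0, 0.1)  # z = 0: evidence favours the null
```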
A Study of relationship between frailty and physical performance in elderly women
Jeoung, Bog Ja; Lee, Yang Chool
2015-01-01
Frailty is a disorder of multiple interrelated physiological systems. It is unclear whether physical performance factors can serve as markers and signs of frailty. The purpose of this study was to examine the relationship between frailty and physical performance in elderly women. One hundred fourteen elderly women aged 65 to 80 participated in this study. We measured the 6-min walk test, grip strength, 30-sec arm curl test, 30-sec chair stand test, 8-foot up-and-go, back scratch, chair sit-and-reach, unipedal stance, BMI, and frailty by questionnaire. The collected data were analyzed by descriptive statistics, frequencies, correlation analysis, ANOVA, and simple linear regression using the IBM SPSS 21 program. Statistical tests showed significant differences between frailty and the 6-min walk test, 30-sec arm curl test, 30-sec chair stand test, grip strength, back scratch, and BMI. However, we did not find significant differences between frailty and the 8-foot up-and-go or unipedal stance. When the subjects were divided into five groups according to physical performance level, subjects with high 6-min walk, 30-sec arm curl, chair sit-and-reach, and grip strength results had low frailty scores. Physical performance factors were strongly associated with decreased frailty, suggesting that physical performance improvements play an important role in preventing or reducing frailty.
SMALL COLOUR VISION VARIATIONS AND THEIR EFFECT IN VISUAL COLORIMETRY,
COLOR VISION, PERFORMANCE(HUMAN), TEST EQUIPMENT, CORRELATION TECHNIQUES, STATISTICAL PROCESSES, COLORS, ANALYSIS OF VARIANCE, AGING(MATERIALS), COLORIMETRY, BRIGHTNESS, ANOMALIES, PLASTICS, UNITED KINGDOM.
Statistical assessment of speech system performance
NASA Technical Reports Server (NTRS)
Moshier, Stephen L.
1977-01-01
Methods for the normalization of performance test results of speech recognition systems are presented. Technological accomplishments in speech recognition systems, as well as planned research activities, are described.
Alles, Susan; Peng, Linda X; Mozola, Mark A
2009-01-01
A modification to Performance-Tested Method (PTM) 070601, Reveal Listeria Test (Reveal), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30 degrees C and environmental samples for 24-48 h at 30 degrees C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there was a statistically significant difference in performance between the Reveal and reference culture [U.S. Food and Drug Administration's Bacteriological Analytical Manual (FDA/BAM) or U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS)] methods for only a single food in one trial (pasteurized crab meat) at the 27 h enrichment time point, with more positive results obtained with the FDA/BAM reference method. No foods showed statistically significant differences in method performance at the 30 h time point. Independent laboratory testing of 3 foods again produced a statistically significant difference in results for crab meat at the 27 h time point; otherwise results of the Reveal and reference methods were statistically equivalent. Overall, considering both internal and independent laboratory trials, sensitivity of the Reveal method relative to the reference culture procedures in testing of foods was 85.9% at 27 h and 97.1% at 30 h. Results from 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the Reveal method was more productive than the reference USDA-FSIS culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. 
monocytogenes confirmed the effectiveness of the Reveal method at the 24 h time point. Overall, sensitivity of the Reveal method at 24 h relative to that of the USDA-FSIS method was 153%. The Reveal method exhibited extremely high specificity, with only a single false-positive result in all trials combined for overall specificity of 99.5%.
Uncertainties in Estimates of Fleet Average Fuel Economy : A Statistical Evaluation
DOT National Transportation Integrated Search
1977-01-01
Research was performed to assess the current Federal procedure for estimating the average fuel economy of each automobile manufacturer's new car fleet. Test vehicle selection and fuel economy estimation methods were characterized statistically and so...
Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis
NASA Astrophysics Data System (ADS)
Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan
2017-10-01
This research paper proposes a hybrid of the ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for selecting relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper also discusses the significance tests used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. The study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using parametric statistical significance tests. The evaluation process statistically demonstrated that the ACO-KNN algorithm performs significantly better than the baseline algorithms. In addition, the experimental results showed that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a quality, optimal feature subset that represents the actual data in customer review datasets.
Computerized Classification Testing with the Rasch Model
ERIC Educational Resources Information Center
Eggen, Theo J. H. M.
2011-01-01
If classification in a limited number of categories is the purpose of testing, computerized adaptive tests (CATs) with algorithms based on sequential statistical testing perform better than estimation-based CATs (e.g., Eggen & Straetmans, 2000). In these computerized classification tests (CCTs), the Sequential Probability Ratio Test (SPRT) (Wald,…
Yang, Hyeri; Na, Jihye; Jang, Won-Hee; Jung, Mi-Sook; Jeon, Jun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Lim, Kyung-Min; Bae, SeungJin
2015-05-05
The mouse local lymph node assay (LLNA, OECD TG429) is an alternative test replacing conventional guinea pig tests (OECD TG406) for skin sensitization, but its use of a radioisotopic agent, (3)H-thymidine, deters active dissemination. A new non-radioisotopic LLNA, LLNA:BrdU-FCM, employs a non-radioisotopic analog, 5-bromo-2'-deoxyuridine (BrdU), and flow cytometry. For an analogous method, the OECD TG429 performance standard (PS) advises that two reference compounds be tested repeatedly and that the ECt (threshold) values obtained fall within acceptable ranges to prove within- and between-laboratory reproducibility. However, these criteria are somewhat arbitrary and the sample size for ECt is less than 5, raising concerns about insufficient reliability. Here, we explored various statistical methods to evaluate the reproducibility of LLNA:BrdU-FCM with the stimulation index (SI), the raw data for ECt calculation, produced by 3 laboratories. Descriptive statistics along with graphical representation of SI are presented. For inferential statistics, parametric and non-parametric methods were applied to test the reproducibility of the SI of a concurrent positive control, and the robustness of the results was investigated. Descriptive statistics and graphical representation of SI alone could illustrate the within- and between-laboratory reproducibility. Inferential statistics employing parametric and nonparametric methods drew similar conclusions. While all labs passed the within- and between-laboratory reproducibility criteria given by the OECD TG429 PS based on ECt values, statistical evaluation based on SI values showed that only two labs succeeded in achieving within-laboratory reproducibility. For the two labs that satisfied within-laboratory reproducibility, between-laboratory reproducibility could also be attained based on inferential as well as descriptive statistics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Assessment of statistical significance and clinical relevance.
Kieser, Meinhard; Friede, Tim; Gondan, Matthias
2013-05-10
In drug development, it is well accepted that a successful study will demonstrate not only a statistically significant result but also a clinically relevant effect size. Whereas standard hypothesis tests are used to demonstrate the former, it is less clear how the latter should be established. In the first part of this paper, we consider the responder analysis approach and study the performance of locally optimal rank tests when the outcome distribution is a mixture of responder and non-responder distributions. We find that these tests are quite sensitive to their planning assumptions and therefore have no real advantage over standard tests such as the t-test and the Wilcoxon-Mann-Whitney test, which perform well overall and can be recommended for applications. In the second part, we present a new approach to the assessment of clinical relevance based on the so-called relative effect (or probabilistic index) and derive appropriate sample size formulae for the design of studies aiming to demonstrate both a statistically significant and a clinically relevant effect. Referring to recent studies in multiple sclerosis, we discuss potential issues in the application of this approach. Copyright © 2012 John Wiley & Sons, Ltd.
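The relative effect (probabilistic index) mentioned above can be estimated nonparametrically as the proportion of pairwise "wins" between two samples; it equals the Mann-Whitney U statistic divided by the product of the sample sizes. A minimal sketch with hypothetical data:

```python
import itertools

def relative_effect(x, y):
    """Estimate p = P(X < Y) + 0.5 * P(X = Y) by comparing every pair
    (xi, yj); this equals the Mann-Whitney U statistic divided by n*m."""
    wins = sum(1.0 if xi < yj else 0.5 if xi == yj else 0.0
               for xi, yj in itertools.product(x, y))
    return wins / (len(x) * len(y))

# Hypothetical outcome scores; p > 0.5 suggests the treated group tends higher.
control = [3, 5, 4, 6, 2]
treated = [7, 6, 8, 5, 9]
p = relative_effect(control, treated)
```

A value of p = 0.5 corresponds to no tendency in either direction, so a clinical-relevance threshold can be expressed directly on this scale.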
Tankevicius, Gediminas; Lankaite, Doanata; Krisciunas, Aleksandras
2013-08-01
The lack of knowledge about isometric ankle testing indicates the need for research in this area. The objectives were to assess test-retest reliability and to determine the optimal position for isometric ankle-eversion and -inversion testing, in a test-retest reliability study conducted at a university hospital. Isometric ankle eversion and inversion were assessed in 3 different dynamometer foot-plate positions: 0°, 7°, and 14° of inversion. Two maximal repetitions were performed at each angle. Both limbs were tested (40 ankles in total). The test was performed twice, with a period of 7 d between tests. The study was carried out on 20 healthy athletes with no history of ankle sprains. Reliability was assessed using the intraclass correlation coefficient (ICC2,1); minimal detectable change (MDC) was calculated using a 95% confidence interval. A paired t test was used to assess statistically significant changes, with P < .05 considered statistically significant. Eversion and inversion peak torques showed high ICCs at all 3 angles (ICC values .87-.96, MDC values 3.09-6.81 Nm). Eversion peak torque was smallest when testing at the 0° angle and gradually increased, reaching maximum values at the 14° angle. The increase in eversion peak torque was statistically significant at 7° and 14° of inversion. Inversion peak torque showed the opposite pattern: it was smallest when measured at the 14° angle and increased at the other 2 angles; statistically significant changes were seen only between measures taken at 0° and 14°. Isometric eversion and inversion testing using the Biodex 4 Pro system is a reliable method. The authors suggest that the angle of 7° of inversion is best for isometric eversion and inversion testing.
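The ICC2,1 reliability coefficient used above can be computed from the mean squares of a two-way layout (subjects by sessions/raters). A self-contained sketch, using the classic Shrout and Fleiss example data rather than this study's own measurements:

```python
def icc_2_1(data):
    """Two-way random-effects, absolute-agreement, single-measure ICC(2,1).
    data: list of subjects, each a list of k ratings (one per rater/session)."""
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    row_means = [sum(row) / k for row in data]
    col_means = [sum(data[i][j] for i in range(n)) / n for j in range(k)]
    # mean squares for rows (subjects), columns (raters), and residual error
    msr = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msc = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    sse = sum((data[i][j] - row_means[i] - col_means[j] + grand) ** 2
              for i in range(n) for j in range(k))
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Shrout & Fleiss (1979) example: 6 subjects rated by 4 judges; ICC(2,1) ~ 0.29.
ratings = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
           [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
icc = icc_2_1(ratings)
```

Unlike ICC(3,1), this form penalizes systematic differences between sessions, which is why it suits test-retest designs like the one described.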
Expected p-values in light of an ROC curve analysis applied to optimal multiple testing procedures.
Vexler, Albert; Yu, Jihnhee; Zhao, Yang; Hutson, Alan D; Gurevich, Gregory
2017-01-01
Many statistical studies report p-values for inferential purposes. In several scenarios, the stochastic aspect of p-values is neglected, which may contribute to drawing wrong conclusions in real data experiments. The stochastic nature of p-values makes their use to examine the performance of given testing procedures or associations between investigated factors to be difficult. We turn our focus on the modern statistical literature to address the expected p-value (EPV) as a measure of the performance of decision-making rules. During the course of our study, we prove that the EPV can be considered in the context of receiver operating characteristic (ROC) curve analysis, a well-established biostatistical methodology. The ROC-based framework provides a new and efficient methodology for investigating and constructing statistical decision-making procedures, including: (1) evaluation and visualization of properties of the testing mechanisms, considering, e.g. partial EPVs; (2) developing optimal tests via the minimization of EPVs; (3) creation of novel methods for optimally combining multiple test statistics. We demonstrate that the proposed EPV-based approach allows us to maximize the integrated power of testing algorithms with respect to various significance levels. In an application, we use the proposed method to construct the optimal test and analyze a myocardial infarction disease dataset. We outline the usefulness of the "EPV/ROC" technique for evaluating different decision-making procedures, their constructions and properties with an eye towards practical applications.
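As a minimal illustration of the EPV idea (not the ROC-based framework of the paper), the expected p-value of a two-sided one-sample z-test under a shifted-normal alternative can be estimated by Monte Carlo; a better-performing test yields a smaller EPV:

```python
import math
import random

def epv_one_sample_z(effect, n, n_sim=20000, seed=7):
    """Monte Carlo estimate of the expected p-value (EPV) of a two-sided
    one-sample z-test under the alternative mean `effect` (unit variance).
    A smaller EPV indicates a better-performing test/alternative pair."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        xbar = rng.gauss(effect, 1.0 / math.sqrt(n))  # sample mean under H1
        z = xbar * math.sqrt(n)
        total += math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
    return total / n_sim

epv_strong = epv_one_sample_z(effect=0.5, n=50)  # well-separated alternative
epv_weak = epv_one_sample_z(effect=0.1, n=50)    # near-null alternative
```

Under an exact null the EPV is 0.5, so the gap below 0.5 plays the role of a performance summary, much as the area under an ROC curve summarizes a classifier.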
Luo, Li; Zhu, Yun; Xiong, Momiao
2012-06-01
The genome-wide association studies (GWAS) designed for next-generation sequencing data involve testing association of genomic variants, including common, low frequency, and rare variants. The current strategies for association studies are well developed for identifying association of common variants with the common diseases, but may be ill-suited when large amounts of allelic heterogeneity are present in sequence data. Recently, group tests that analyze their collective frequency differences between cases and controls shift the current variant-by-variant analysis paradigm for GWAS of common variants to the collective test of multiple variants in the association analysis of rare variants. However, group tests ignore differences in genetic effects among SNPs at different genomic locations. As an alternative to group tests, we developed a novel genome-information content-based statistics for testing association of the entire allele frequency spectrum of genomic variation with the diseases. To evaluate the performance of the proposed statistics, we use large-scale simulations based on whole genome low coverage pilot data in the 1000 Genomes Project to calculate the type 1 error rates and power of seven alternative statistics: a genome-information content-based statistic, the generalized T(2), collapsing method, multivariate and collapsing (CMC) method, individual χ(2) test, weighted-sum statistic, and variable threshold statistic. Finally, we apply the seven statistics to published resequencing dataset from ANGPTL3, ANGPTL4, ANGPTL5, and ANGPTL6 genes in the Dallas Heart Study. We report that the genome-information content-based statistic has significantly improved type 1 error rates and higher power than the other six statistics in both simulated and empirical datasets.
Shifflett, Benjamin; Huang, Rong; Edland, Steven D
2017-01-01
Genotypic association studies are prone to inflated type I error rates if multiple hypothesis testing is performed, e.g., sequentially testing for recessive, multiplicative, and dominant risk. Alternatives to multiple hypothesis testing include the model-independent genotypic χ2 test, the efficiency robust MAX statistic, which corrects for multiple comparisons but with some loss of power, or a single Armitage test for multiplicative trend, which has optimal power when the multiplicative model holds but some loss of power when dominant or recessive models underlie the genetic association. We used Monte Carlo simulations to describe the relative performance of these three approaches under a range of scenarios. All three approaches maintained their nominal type I error rates. The genotypic χ2 and MAX statistics were more powerful when testing a strictly recessive genetic effect or when testing a dominant effect when the allele frequency was high. The Armitage test for multiplicative trend was most powerful for the broad range of scenarios where heterozygote risk is intermediate between recessive and dominant risk. Moreover, all tests had limited power to detect recessive genetic risk unless the sample size was large, and conversely all tests were relatively well powered to detect dominant risk. Taken together, these results suggest the general utility of the multiplicative trend test when the underlying genetic model is unknown.
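The Armitage test for multiplicative trend discussed above can be computed in a few lines from a 2x3 genotype table; the counts below are hypothetical, chosen only to show a trend enriched in cases:

```python
import math

def armitage_trend_z(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test z-statistic for a 2x3 genotype table.
    cases/controls: counts per genotype (aa, aA, AA); the default scores
    encode the multiplicative (additive-in-score) model."""
    n = sum(cases) + sum(controls)
    r = sum(cases)
    tot = [c + d for c, d in zip(cases, controls)]
    # observed score total among cases vs. its expectation under the null
    t = sum(s * c for s, c in zip(scores, cases))
    s1 = sum(s * m for s, m in zip(scores, tot))
    s2 = sum(s * s * m for s, m in zip(scores, tot))
    e = r * s1 / n
    # null variance, conditioning on the table margins
    var = r * (n - r) * (n * s2 - s1 * s1) / (n * n * (n - 1))
    return (t - e) / math.sqrt(var)

# Hypothetical counts with the risk allele enriched in cases:
z = armitage_trend_z(cases=(20, 50, 30), controls=(40, 45, 15))
```

Replacing the scores with (0, 0, 1) or (0, 1, 1) yields the recessive and dominant versions, which is exactly the sequential-testing practice whose multiplicity the abstract warns about.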
NASA Technical Reports Server (NTRS)
Alston, D. W.
1981-01-01
The objective of this research was to design a statistical model that could perform an error analysis of curve fits of wind tunnel test data using analysis of variance and regression analysis techniques. Four related subproblems were defined, and by solving each of these a solution to the general research problem was obtained. The capabilities of the resulting statistical model are considered. The least squares fit is used to determine the nature of the force, moment, and pressure data. The order of the curve fit is increased to remove the quadratic effect in the residuals. The analysis of variance is used to determine the magnitude and effect of the error factor associated with the experimental data.
Race, Socioeconomic Status, and Implicit Bias: Implications for Closing the Achievement Gap
NASA Astrophysics Data System (ADS)
Schlosser, Elizabeth Auretta Cox
This study assessed the relationship between race, socioeconomic status, age, and the racial implicit bias held by middle and high school science teachers in the Mobile and Baldwin County Public School Systems. Seventy-nine participants were administered the race Implicit Association Test (race IAT), created by Greenwald, A. G., Nosek, B. A., & Banaji, M. R. (2003), and a demographic survey. Quantitative analysis using analysis of variance (ANOVA) and t-tests was used in this study. An ANOVA was performed comparing the race IAT scores of African American science teachers and their Caucasian counterparts. A statistically significant difference was found (F = 4.56, p = .01). An ANOVA was also performed on the race IAT scores comparing the age of the participants; the analysis yielded no statistical difference based on age. A t-test was performed comparing the race IAT scores of African American teachers who taught at either Title I or non-Title I schools; no statistical difference was found between groups (t = -17.985, p < .001). A t-test was also performed comparing the race IAT scores of Caucasian teachers who taught at either Title I or non-Title I schools; a statistically significant difference was found between groups (t = 2.44, p > .001). This research examines the implications of the achievement gap between African American and Caucasian students in science.
Estimation of diagnostic test accuracy without full verification: a review of latent class methods
Collins, John; Huynh, Minh
2014-01-01
The performance of a diagnostic test is best evaluated against a reference test that is without error. For many diseases, this is not possible, and an imperfect reference test must be used. However, diagnostic accuracy estimates may be biased if inaccurately verified status is used as the truth. Statistical models have been developed to handle this situation by treating disease as a latent variable. In this paper, we conduct a systematized review of statistical methods using latent class models for estimating test accuracy and disease prevalence in the absence of complete verification. PMID:24910172
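The latent class idea can be sketched with an EM algorithm for the simplest identifiable one-population model: three conditionally independent binary tests and no gold standard. This is an illustrative sketch with simulated data and assumed parameter values; the review surveys far more general models.

```python
import random

def simulate(n, prev, se, sp, rng):
    """Simulate n subjects tested by len(se) imperfect binary tests."""
    data = []
    for _ in range(n):
        d = rng.random() < prev  # latent (unobserved) disease status
        data.append(tuple(int(rng.random() < (se[t] if d else 1.0 - sp[t]))
                          for t in range(len(se))))
    return data

def em_latent_class(data, n_iter=200):
    """EM estimation of prevalence, sensitivities, and specificities of T
    conditionally independent tests with latent disease status."""
    t_n = len(data[0])
    prev, se, sp = 0.5, [0.8] * t_n, [0.8] * t_n
    for _ in range(n_iter):
        # E-step: posterior probability of disease for each result pattern
        w = []
        for y in data:
            l1, l0 = prev, 1.0 - prev
            for t, yt in enumerate(y):
                l1 *= se[t] if yt else 1.0 - se[t]
                l0 *= (1.0 - sp[t]) if yt else sp[t]
            w.append(l1 / (l1 + l0))
        # M-step: weighted maximum-likelihood updates
        sw = sum(w)
        prev = sw / len(data)
        se = [sum(wi for wi, y in zip(w, data) if y[t]) / sw
              for t in range(t_n)]
        sp = [sum(1.0 - wi for wi, y in zip(w, data) if not y[t]) / (len(data) - sw)
              for t in range(t_n)]
    return prev, se, sp

rng = random.Random(42)
data = simulate(2000, prev=0.3, se=[0.9, 0.85, 0.92], sp=[0.95, 0.9, 0.93], rng=rng)
prev_hat, se_hat, sp_hat = em_latent_class(data)
```

With three tests the model is just identified (seven parameters against seven degrees of freedom in the 2x2x2 table), which is why conditional independence and the number of tests matter so much in this literature.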
Experimental control in software reliability certification
NASA Technical Reports Server (NTRS)
Trammell, Carmen J.; Poore, Jesse H.
1994-01-01
There is growing interest in software 'certification', i.e., confirmation that software has performed satisfactorily under a defined certification protocol. Regulatory agencies, customers, and prospective reusers all want assurance that a defined product standard has been met. In other industries, products are typically certified under protocols in which random samples of the product are drawn, tests characteristic of operational use are applied, analytical or statistical inferences are made, and products meeting a standard are 'certified' as fit for use. A warranty statement is often issued upon satisfactory completion of a certification protocol. This paper outlines specific engineering practices that must be used to preserve the validity of the statistical certification testing protocol. The assumptions associated with a statistical experiment are given, and their implications for statistical testing of software are described.
A Comparison of Student Understanding of Seasons Using Inquiry and Didactic Teaching Methods
NASA Astrophysics Data System (ADS)
Ashcraft, Paul G.
2006-02-01
Student performance on open-ended questions concerning seasons in a university physical science content course was examined to note differences between classes that experienced inquiry using a 5-E lesson planning model and those that experienced the same content with a traditional, didactic lesson. The class examined is a required content course for elementary education majors and understanding the seasons is part of the university's state's elementary science standards. The two self-selected groups of students showed no statistically significant differences in pre-test scores, while there were statistically significant differences between the groups' post-test scores with those who participated in inquiry-based activities scoring higher. There were no statistically significant differences between the pre-test and the post-test for the students who experienced didactic teaching, while there were statistically significant improvements for the students who experienced the 5-E lesson.
Statistical Analysis Tools for Learning in Engineering Laboratories.
ERIC Educational Resources Information Center
Maher, Carolyn A.
1990-01-01
Described are engineering programs that have used automated data acquisition systems to implement data collection and analyze experiments. Applications include a biochemical engineering laboratory, heat transfer performance, engineering materials testing, mechanical system reliability, statistical control laboratory, thermo-fluid laboratory, and a…
Accelerated battery-life testing - A concept
NASA Technical Reports Server (NTRS)
Mccallum, J.; Thomas, R. E.
1971-01-01
Test program, employing empirical, statistical and physical methods, determines service life and failure probabilities of electrochemical cells and batteries, and is applicable to testing mechanical, electrical, and chemical devices. Data obtained aids long-term performance prediction of battery or cell.
An entropy-based statistic for genomewide association studies.
Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao
2005-07-01
Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard chi2 statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard chi2 statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard chi2 statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard chi2 statistic.
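The flavor of a statistic that is nonlinear in the allele frequencies can be sketched with a Jensen-Shannon-divergence statistic comparing case and control frequencies. This entropy-based form is illustrative only and is not the exact statistic proposed in the paper:

```python
import math

def shannon_entropy(p):
    """Shannon entropy (nats) of a probability vector."""
    return -sum(q * math.log(q) for q in p if q > 0.0)

def entropy_stat(case_counts, control_counts):
    """Illustrative entropy-based association statistic: 2N times the
    Jensen-Shannon divergence between case and control allele frequencies."""
    n1, n2 = sum(case_counts), sum(control_counts)
    p = [c / n1 for c in case_counts]
    q = [c / n2 for c in control_counts]
    m = [(a + b) / 2.0 for a, b in zip(p, q)]
    jsd = shannon_entropy(m) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))
    return 2.0 * (n1 + n2) * jsd

stat_assoc = entropy_stat((70, 30), (50, 50))  # allele frequencies differ
stat_null = entropy_stat((50, 50), (50, 50))   # identical frequencies
```

The statistic is zero exactly when the frequency vectors coincide and grows with the entropy gap between them, mirroring the "amplification" of frequency differences the abstract attributes to the nonlinear entropy function.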
JAN transistor and diode characterization test program
NASA Technical Reports Server (NTRS)
Takeda, H.
1977-01-01
A statistical summary of electrical characterization was performed on JAN diodes and transistors. Parameters are presented with test conditions, mean, standard deviation, lowest reading, 10% point, 90% point and highest reading.
Flight tests for the assessment of task performance and control activity
NASA Technical Reports Server (NTRS)
Pausder, H. J.; Hummes, D.
1982-01-01
The tests were performed with the helicopters BO 105 and UH-1D. Closely connected with tactical demands, the six test pilots' task was to minimize the time and the altitude over the obstacles. The data reduction yields statistical evaluation parameters describing the control activity of the pilots and the achieved task performance. The results are shown in the form of evaluation diagrams. Additionally, dolphin tests with varied control strategy were performed to gain more insight into the influence of control techniques. From these test results, recommendations can be derived to emphasize direct force control and to reduce the collective-to-pitch crosscoupling for the dolphin maneuver.
Vasconcellos, Luiz Felipe; Pereira, João Santos; Adachi, Marcelo; Greca, Denise; Cruz, Manuela; Malak, Ana Lara; Charchat-Fichman, Helenice; Spitz, Mariana
2017-01-01
Few studies have evaluated magnetic resonance imaging (MRI) visual scales in Parkinson's disease-Mild Cognitive Impairment (PD-MCI). We selected 79 PD patients and 92 controls (CO) to perform neurologic and neuropsychological evaluation. Brain MRI was performed to evaluate the following scales: Global Cortical Atrophy (GCA), Fazekas, and medial temporal atrophy (MTA). The analysis revealed that both PD groups (amnestic and nonamnestic) showed worse performance on several tests when compared to CO. Memory, executive function, and attention impairment were more severe in amnestic PD-MCI group. Overall analysis of frequency of MRI visual scales by MCI subtype did not reveal any statistically significant result. Statistically significant inverse correlation was observed between GCA scale and Mini-Mental Status Examination (MMSE), Montreal Cognitive Assessment (MoCA), semantic verbal fluency, Stroop test, figure memory test, trail making test (TMT) B, and Rey Auditory Verbal Learning Test (RAVLT). The MTA scale correlated with Stroop test and Fazekas scale with figure memory test, digit span, and Stroop test according to the subgroup evaluated. Visual scales by MRI in MCI should be evaluated by cognitive domain and might be more useful in more severely impaired MCI or dementia patients.
A novel measure and significance testing in data analysis of cell image segmentation.
Wu, Jin Chu; Halter, Michael; Kacker, Raghu N; Elliott, John T; Plant, Anne L
2017-03-14
Cell image segmentation (CIS) is an essential part of quantitative imaging of biological cells. Designing a performance measure and conducting significance testing are critical for evaluating and comparing CIS algorithms for image-based cell assays in cytometry. Many measures and methods have been proposed and implemented to evaluate segmentation methods. However, methods for computing the standard errors (SE) of the measures and their correlation coefficient have not been described, and thus the statistical significance of performance differences between CIS algorithms cannot be assessed. We propose the total error rate (TER), a novel performance measure for segmenting all cells in the supervised evaluation. The TER statistically aggregates all misclassification error rates (MER), taking cell sizes as weights. The MERs are for segmenting each single cell in the population. The TER is fully supported by pairwise comparisons of MERs using 106 manually segmented ground-truth cells of different sizes and seven CIS algorithms taken from ImageJ. Further, the SE and 95% confidence interval (CI) of the TER are computed based on the SE of the MER, which is calculated using the bootstrap method. An algorithm for computing the correlation coefficient of TERs between two CIS algorithms is also provided. Hence, the 95% CI error bars can be used to classify CIS algorithms. The SEs of TERs and their correlation coefficient can be employed to conduct hypothesis testing, when the CIs overlap, to determine the statistical significance of the performance differences between CIS algorithms. In summary, a novel measure, the TER, of CIS is proposed, its SEs and correlation coefficient are computed, and CIS algorithms can thereafter be evaluated and compared statistically by conducting significance testing.
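The size-weighted aggregation behind the TER, together with a bootstrap standard error over the cell population, can be sketched as follows; the cell sizes and per-cell MERs below are hypothetical values, not data from the paper:

```python
import random

def total_error_rate(cell_sizes, cell_errors):
    """Aggregate per-cell misclassification error rates (MER) into a total
    error rate (TER), using cell sizes (e.g., pixel counts) as weights."""
    total = sum(cell_sizes)
    return sum(s * e for s, e in zip(cell_sizes, cell_errors)) / total

def bootstrap_se(cell_sizes, cell_errors, n_boot=2000, seed=1):
    """Bootstrap standard error of the TER, resampling cells with replacement."""
    rng = random.Random(seed)
    pairs = list(zip(cell_sizes, cell_errors))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(pairs) for _ in pairs]
        s, e = zip(*sample)
        stats.append(total_error_rate(s, e))
    mean = sum(stats) / n_boot
    var = sum((t - mean) ** 2 for t in stats) / (n_boot - 1)
    return var ** 0.5

sizes = [120, 300, 80, 500, 220]          # hypothetical cell areas (pixels)
errors = [0.05, 0.02, 0.10, 0.01, 0.04]   # hypothetical per-cell MERs
ter = total_error_rate(sizes, errors)
se = bootstrap_se(sizes, errors)
```

Because the weights are cell sizes, large cells dominate the TER, and the bootstrap SE supports the CI-based comparisons of algorithms described in the abstract.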
Correlation between Na/K ratio and electron densities in blood samples of breast cancer patients.
Topdağı, Ömer; Toker, Ozan; Bakırdere, Sezgin; Bursalıoğlu, Ertuğrul Osman; Öz, Ersoy; Eyecioğlu, Önder; Demir, Mustafa; İçelli, Orhan
2018-05-31
The main purpose of this study was to investigate the relationship between electron densities and the Na/K ratio, which plays an important role in breast cancer disease. Determinations of sodium and potassium concentrations in blood samples were performed with inductively coupled plasma-atomic emission spectrometry. Electron density values of the blood samples were determined via ZXCOM. Statistical analyses were performed for electron densities and the Na/K ratio, including Kolmogorov-Smirnov normality tests, Spearman's rank correlation test, and the Mann-Whitney U test. It was found that the electron densities differ significantly between the control and breast cancer groups. In addition, a statistically significant positive correlation was found between the electron density and Na/K ratios in the breast cancer group.
Quiet eye training facilitates competitive putting performance in elite golfers.
Vine, Samuel J; Moore, Lee J; Wilson, Mark R
2011-01-01
The aim of this study was to examine the effectiveness of a brief quiet eye (QE) training intervention aimed at optimizing the visuomotor control and putting performance of elite golfers under pressure and in real competition. Twenty-two elite golfers (mean handicap 2.7) recorded putting statistics over 10 rounds of competitive golf before attending training individually. Having been randomly assigned to either a QE-training or Control group, participants were fitted with an Applied Science Laboratories Mobile Eye tracker and performed 20 baseline (pre-test) putts from 10 ft. Training consisted of video feedback of their gaze behavior while they completed 20 putts; however, the QE-trained group received additional instructions related to maintaining a longer QE period. Participants then recorded their putting statistics over a further 10 competitive rounds and revisited the laboratory for retention and pressure tests of their visuomotor control and putting performance. Overall, the results supported the efficacy of the QE training intervention. QE duration predicted 43% of the variance in putting performance, underlining its critical role in the visuomotor control of putting. The QE-trained group maintained their optimal QE under pressure conditions, whereas the Control group experienced reductions in QE when anxious, with subsequent effects on performance. Although their performance was similar in the pre-test, the QE-trained group holed more putts and left the ball closer to the hole on missed putts than their Control group counterparts in the pressure test. Importantly, these advantages transferred to the golf course, where QE-trained golfers made 1.9 fewer putts per round compared to pre-training, whereas the Control group showed no change in their putting statistics. These results reveal that QE training, incorporated into a pre-shot routine, is an effective intervention to help golfers maintain control when anxious.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stallmann, F.W.
1984-08-01
A statistical analysis of Charpy test results from the two-year Pressure Vessel Simulation metallurgical irradiation experiment was performed. Transition temperatures and upper-shelf energies derived from computer fits compare well with eyeball fits, and uncertainties for all results can be obtained with the computer fits. The results were compared with predictions in Regulatory Guide 1.99 and other irradiation damage models.
A Note on Comparing the Power of Test Statistics at Low Significance Levels.
Morris, Nathan; Elston, Robert
2011-01-01
It is an obvious fact that the power of a test statistic depends on the significance (alpha) level at which the test is performed. It is perhaps a less obvious fact that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as α = 5 × 10⁻⁸, which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies which use alpha levels that will not be used in practice.
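Power comparisons of this kind can be computed directly from central and noncentral chi-square distributions; a minimal sketch (the noncentrality value of 15 and the 1-df vs 10-df pairing are illustrative assumptions, not values from the paper):

```python
from scipy import stats

def chi2_power(ncp, df, alpha):
    """Power of a chi-square test with noncentrality ncp at significance level alpha."""
    crit = stats.chi2.ppf(1 - alpha, df)   # central chi-square critical value
    return stats.ncx2.sf(crit, df, ncp)    # P(statistic exceeds crit under the alternative)

# The *relative* power of a 1-df and a 10-df test shifts with alpha:
for alpha in (0.05, 5e-8):
    ratio = chi2_power(15, 10, alpha) / chi2_power(15, 1, alpha)
    print(f"alpha = {alpha:g}: power(10 df) / power(1 df) = {ratio:.3f}")
```

Running the loop at both alpha levels makes the note's point concrete: the power ratio between the two tests is not a constant of the pair, it moves with alpha.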
Repeatability of Cryogenic Multilayer Insulation
NASA Technical Reports Server (NTRS)
Johnson, W. L.; Vanderlaan, M.; Wood, J. J.; Rhys, N. O.; Guo, W.; Van Sciver, S.; Chato, D. J.
2017-01-01
Due to the variety of requirements across aerospace platforms and one-off projects, the repeatability of cryogenic multilayer insulation (MLI) has never been fully established. The objective of this test program is to provide a more basic understanding of the thermal performance repeatability of MLI systems that are applicable to large-scale tanks. Several different types of repeatability can be accounted for: repeatability between multiple identical blankets, repeatability of installation of the same blanket, and repeatability of a test apparatus. The focus of the work in this report is on the first two types. Statistically, repeatability can mean many different things. In its simplest form, it refers to the range of performance that a population exhibits around the average of the population. However, as more and more identical components are made (i.e., as the population of concern grows), the simple range morphs into a standard deviation from an average performance. Initial repeatability testing on MLI blankets has been completed at Florida State University. Repeatability of five GRC-provided coupons with 25 layers was shown to be +/- 8.4%, whereas repeatability of repeatedly installing a single coupon was shown to be +/- 8.0%. A second group of 10 coupons has been fabricated by Yetispace and tested by Florida State University; through the first 4 tests, the repeatability has been shown to be +/- 15-25%. Based on detailed statistical analysis, the data have been shown to be statistically significant.
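The shift the abstract describes, from a simple range to a standard deviation around the mean, is often reported as a percentage of the mean; a minimal sketch with hypothetical heat-flux readings (the values are invented for illustration, not GRC data):

```python
import statistics

def repeatability_pct(measurements):
    """Sample standard deviation expressed as a percentage of the mean."""
    mean = statistics.mean(measurements)
    return 100 * statistics.stdev(measurements) / mean

# hypothetical heat-flux results (W/m^2) for five nominally identical 25-layer coupons
q = [0.52, 0.49, 0.55, 0.50, 0.53]
print(f"repeatability: +/- {repeatability_pct(q):.1f}%")
```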
Goodness-of-Fit Tests for Generalized Normal Distribution for Use in Hydrological Frequency Analysis
NASA Astrophysics Data System (ADS)
Das, Samiran
2018-04-01
The use of the three-parameter generalized normal (GNO) as a hydrological frequency distribution is well recognized, but its application is limited by the unavailability of popular goodness-of-fit (GOF) test statistics. This study develops popular empirical distribution function (EDF)-based test statistics to investigate the goodness of fit of the GNO distribution. The focus is on the case most relevant to the hydrologist, namely, that in which the parameter values are unknown and estimated from a sample using the method of L-moments. The widely used EDF tests such as Kolmogorov-Smirnov, Cramer-von Mises, and Anderson-Darling (AD) are considered in this study. A modified version of AD, namely, the Modified Anderson-Darling (MAD) test, is also considered, and its performance is assessed against the other EDF tests using a power study that incorporates six specific Wakeby distributions (WA-1, WA-2, WA-3, WA-4, WA-5, and WA-6) as the alternative distributions. The critical values of the proposed test statistics are approximated using Monte Carlo techniques and are summarized in chart and regression-equation form to show their dependence on the shape parameter and sample size. The performance results obtained from the power study suggest that the AD and a variant of the MAD (MAD-L) are the most powerful tests. Finally, the study performs case studies involving annual maximum flow data of selected gauged sites from Irish and US catchments to show the application of the derived critical values and recommends further assessments to be carried out on flow data sets of rivers with various hydrological regimes.
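Monte Carlo approximation of EDF-test critical values follows a standard recipe: simulate from the hypothesized family, re-estimate the parameters from each simulated sample, and take an upper quantile of the simulated statistics. Since the GNO distribution and L-moment estimation are not available in scipy, the sketch below substitutes a normal null with moment estimates (a Lilliefors-style stand-in, not the paper's procedure):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def ks_with_estimated_params(x):
    """KS statistic when mu and sigma are estimated from the same sample."""
    mu, sigma = x.mean(), x.std(ddof=1)
    return stats.kstest(x, stats.norm(mu, sigma).cdf).statistic

def mc_critical_value(n, alpha=0.05, reps=2000):
    # simulate under the null, re-fitting parameters each time, and
    # take the (1 - alpha) quantile of the resulting statistics
    sims = [ks_with_estimated_params(rng.normal(size=n)) for _ in range(reps)]
    return float(np.quantile(sims, 1 - alpha))

crit = mc_critical_value(n=50)
```

Because the parameters are re-estimated inside each replicate, the resulting critical value is smaller than the classical KS table value, which is exactly why estimation-aware tables like those in the paper are needed.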
Specialized data analysis of SSME and advanced propulsion system vibration measurements
NASA Technical Reports Server (NTRS)
Coffin, Thomas; Swanson, Wayne L.; Jong, Yen-Yi
1993-01-01
The basic objectives of this contract were to perform detailed analysis and evaluation of dynamic data obtained during Space Shuttle Main Engine (SSME) test and flight operations, including analytical/statistical assessment of component dynamic performance, and to continue the development and implementation of analytical/statistical models to effectively define nominal component dynamic characteristics, detect anomalous behavior, and assess machinery operational conditions. This study was to provide timely assessment of engine component operational status, identify probable causes of malfunction, and define feasible engineering solutions. The work was performed under three broad tasks: (1) Analysis, Evaluation, and Documentation of SSME Dynamic Test Results; (2) Data Base and Analytical Model Development and Application; and (3) Development and Application of Vibration Signature Analysis Techniques.
Saadati, Farzaneh; Ahmad Tarmizi, Rohani; Mohd Ayub, Ahmad Fauzi; Abu Bakar, Kamariah
2015-01-01
Because students' ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding the pedagogical characteristics of learning within an e-learning system adds value by facilitating the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, i-CAM significantly promoted students' problem-solving performance at the end of each phase. In addition, the differences in students' test scores remained statistically significant after controlling for pre-test scores. The findings conveyed in this paper confirm the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students.
Atmospheric statistics for aerospace vehicle operations
NASA Technical Reports Server (NTRS)
Smith, O. E.; Batts, G. W.
1993-01-01
Statistical analysis of atmospheric variables was performed for the Shuttle Transportation System (STS) design trade studies and the establishment of launch commit criteria. Atmospheric constraint statistics have been developed for the NASP test flight, the Advanced Launch System, and the National Launch System. The concepts and analysis techniques discussed in the paper are applicable to the design and operations of any future aerospace vehicle.
ERIC Educational Resources Information Center
Hood, Michelle; Creed, Peter A.; Neumann, David L.
2012-01-01
We tested a model of the relationship between attitudes toward statistics and achievement based on Eccles' Expectancy Value Model (1983). Participants (n = 149; 83% female) were second-year Australian university students in a psychology statistics course (mean age = 23.36 years, SD = 7.94 years). We obtained demographic details, past performance,…
The Effect of Project Based Learning on the Statistical Literacy Levels of Student 8th Grade
ERIC Educational Resources Information Center
Koparan, Timur; Güven, Bülent
2014-01-01
This study examines the effect of project based learning on 8th grade students' statistical literacy levels. A performance test was developed for this aim. Quasi-experimental research model was used in this article. In this context, the statistics were taught with traditional method in the control group and it was taught using project based…
The Effects of Pre-Lecture Quizzes on Test Anxiety and Performance in a Statistics Course
ERIC Educational Resources Information Center
Brown, Michael J.; Tallon, Jennifer
2015-01-01
The purpose of our study was to examine the effects of pre-lecture quizzes in a statistics course. Students (N = 70) from 2 sections of an introductory statistics course served as participants in this study. One section completed pre-lecture quizzes whereas the other section did not. Completing pre-lecture quizzes was associated with improved exam…
Yang, Yi; Tokita, Midori; Ishiguchi, Akira
2018-01-01
A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been studied well in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollypops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks with size and orientation domains. We found that the performances in the mean and variance discrimination tasks were not correlated with each other and demonstrated that extractions of the mean and variance are mediated by different representation mechanisms. In addition, we tested the relationship between performances in size and orientation domains for each summary statistic (i.e. mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed.
NASA Technical Reports Server (NTRS)
Oravec, Heather Ann; Daniels, Christopher C.
2014-01-01
The National Aeronautics and Space Administration has been developing a novel docking system to meet the requirements of future exploration missions to low-Earth orbit and beyond. A dynamic gas pressure seal is located at the main interface between the active and passive mating components of the new docking system. This seal is designed to operate in the harsh space environment, but it must also perform within strict loading requirements while maintaining an acceptable leak rate. In this study, a candidate silicone elastomer seal was designed, and multiple subscale test articles were manufactured for evaluation purposes. The force required to fully compress each test article at room temperature was quantified and found to be below the maximum allowable load for the docking system. However, a significant amount of scatter was observed in the test results. Due to the stochastic nature of the mechanical performance of this candidate docking seal, a statistical process control technique was implemented to isolate unusual compression behavior from typical mechanical performance. The results of this statistical analysis indicated a lack of process control, suggesting a variation in the manufacturing phase of the process. Further investigation revealed that changes in the manufacturing molding process had occurred which may have influenced the mechanical performance of the seal. This knowledge improves the chances that this and future space seals will satisfy or exceed design specifications.
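A statistical process control check of the kind described can be sketched with a Shewhart individuals chart, where sigma is estimated from the average moving range; the load values below are invented for illustration, not measurements from the seal program:

```python
import statistics

def individuals_limits(samples, k=3.0):
    """Shewhart individuals-chart limits; sigma estimated from the
    average moving range divided by the d2 constant for n=2 (1.128)."""
    center = statistics.mean(samples)
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    sigma = statistics.mean(moving_ranges) / 1.128
    return center - k * sigma, center + k * sigma

# hypothetical full-compression loads (N) for successive seal test articles
loads = [102, 99, 101, 104, 100, 98, 103, 130, 101, 100]
lcl, ucl = individuals_limits(loads)
out_of_control = [x for x in loads if not lcl <= x <= ucl]
```

Here the single 130 N article falls outside the 3-sigma limits and would be isolated as unusual compression behavior, separate from the typical scatter.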
Analysis of visual quality improvements provided by known tools for HDR content
NASA Astrophysics Data System (ADS)
Kim, Jaehwan; Alshina, Elena; Lee, JongSeok; Park, Youngo; Choi, Kwang Pyo
2016-09-01
In this paper, the visual quality of different solutions for high dynamic range (HDR) compression is analyzed using MPEG test contents. We also simulate a method for efficient HDR compression that is based on statistical properties of the signal. The method is compliant with the HEVC specification and is also easily compatible with alternative methods that might require HEVC specification changes. It was subjectively tested on commercial TVs and compared with alternative solutions for HDR coding. Subjective visual quality tests were performed on an SUHD TV (SAMSUNG JS9500 model) with maximum luminance up to 1000 nit. The solution based on statistical properties shows not only improved objective performance but also improved visual quality compared to other HDR solutions, while remaining compatible with the HEVC specification.
Global Active Stretching (SGA®) Practice for Judo Practitioners’ Physical Performance Enhancement
ALMEIDA, HELENO; DE SOUZA, RAPHAEL F.; AIDAR, FELIPE J.; DA SILVA, ALISSON G.; REGI, RICARDO P.; BASTOS, AFRÂNIO A.
2018-01-01
In order to analyze the effect of Global Active Stretching (SGA®) practice on physical performance enhancement in judo competitors, 12 male athletes from the Judo Federation of Sergipe (Federação Sergipana de Judô) were divided into two groups: Experimental Group (EG) and Control Group (CG). For 10 weeks, the EG practiced SGA® self-postures and the CG practiced assorted calisthenic exercises. All of them were submitted to a variety of tests (before and after): handgrip strength, flexibility, upper limbs' muscle power, isometric pull-up force, lower limbs' muscle power (squat jump (SJ) and countermovement jump (CMJ)), and the Tokui Waza test. Due to the small sample size, the data were considered non-parametric and the Wilcoxon test was applied using the software R version 3.3.2 (R Development Core Team, Austria). The effect size was calculated, and values of p ≤ 0.05 were considered statistically significant. The EG showed statistically significant before-after differences in flexibility, upper limbs' muscle power, and lower limbs' muscle power (CMJ), with gains of 3.00 ± 1.09 cm, 0.42 ± 0.51 m, and 2.49 ± 0.63 cm, respectively. The CG presented a statistical difference only in the lower limbs' CMJ test, with a gain of 0.55 ± 2.28 cm. The regular 10-week practice of SGA® self-postures increased judo practitioners' posterior chain flexibility and vertical jumping (CMJ) performance. PMID:29795746
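With six athletes per group, the paired before/after comparison reduces to a Wilcoxon signed-rank test, as used in the study; a minimal sketch with invented flexibility scores (not the study's data):

```python
from scipy import stats

# hypothetical pre/post flexibility scores (cm) for the six EG athletes
pre  = [24.0, 26.5, 22.0, 25.0, 23.5, 27.0]
post = [27.1, 29.1, 25.4, 27.9, 25.7, 31.0]

res = stats.wilcoxon(pre, post)  # paired, two-sided by default
significant = res.pvalue <= 0.05
```

With all six differences in the same direction, the exact two-sided p-value is 2/2⁶ ≈ 0.031, which shows how such a small paired sample can still reach the p ≤ 0.05 criterion.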
Weigh-in-Motion Sensor and Controller Operation and Performance Comparison
DOT National Transportation Integrated Search
2018-01-01
This research project utilized statistical inference and comparison techniques to compare the performance of different Weigh-in-Motion (WIM) sensors. First, we analyzed test-vehicle data to perform an accuracy check of the results reported by the sen...
[Sem: a suitable statistical software adapted for research in oncology].
Kwiatkowski, F; Girard, M; Hacene, K; Berlie, J
2000-10-01
Many software packages have been adapted for medical use, but they rarely enable convenient data management and statistics together. A recent cooperative effort produced a new package, Sem (Statistics Epidemiology Medicine), which allows both data management for trials and statistical treatment of the data. Very convenient, it can be used by non-professionals in statistics (biologists, doctors, researchers, data managers), since usually (except with multivariate models) the software performs the most adequate test by itself, after which complementary tests can be requested if needed. The Sem database manager (DBM) is not compatible with usual DBMs: this constitutes a first protection of privacy. Other shields (passwords, encryption...) strengthen data security, all the more necessary today since Sem can be run on computer networks. The data organization allows multiplicity: forms can be duplicated per patient. Dates are treated in a special but transparent manner (sorting, date and delay calculations...). Sem communicates with common desktop software, often with a simple copy/paste, so statistics can be easily performed on data stored in external spreadsheets, and slides can be made by pasting graphs with a single mouse click (survival curves...). Already in daily use at over fifty sites in different hospitals, this product, combining data management and statistics, appears to be a convenient and innovative solution.
Atopy patch test reactions to house dust mites in patients with scabies.
Taşkapan, Oktay; Harmanyeri, Yavuz
2005-01-01
It is well known that house dust mites and the scabies mite are phylogenetically related. We therefore performed atopy patch tests with house dust mite antigens (Dermatophagoides pteronyssinus (Dp) and/or Dermatophagoides farinae (Df)) in scabies patients without atopy and in healthy controls. We studied 25 men with active scabies and 25 healthy controls. Skin prick tests with standardized house dust mite extract were performed for all patients and controls. An intradermal test was carried out in skin prick test-negative patients and in controls showing a positive atopy patch test to Dp and/or Df. While atopy patch tests were performed directly in all healthy controls, patients with scabies were first treated, and the atopy patch tests were performed the next day. Twenty-two of 25 patients with scabies (88%) had skin prick test and/or intradermal test positivity against house dust mites, whereas 17/25 patients (68%) had atopy patch test positivity against house dust mites (Dp and/or Df). There was no statistically significant difference between skin prick test and/or intradermal test positivity and atopy patch test positivity in a regression analysis (p=0.222). The only statistically significant correlation was between atopy patch test positivity and the extent of scabies involvement (p<0.05). Only a few of the healthy controls had positive tests. In this study, we have shown that a positive atopy patch test to house dust mite antigens is not specific to patients with atopic dermatitis but also occurs in scabies patients without a history of atopic dermatitis.
The influence of various test plans on mission reliability. [for Shuttle Spacelab payloads
NASA Technical Reports Server (NTRS)
Stahle, C. V.; Gongloff, H. R.; Young, J. P.; Keegan, W. B.
1977-01-01
Methods have been developed for the evaluation of cost-effective vibroacoustic test plans for Shuttle Spacelab payloads. The shock and vibration environments of components were represented statistically, and statistical decision theory was used to evaluate the cost effectiveness of five basic test plans, with structural test options for two of the plans. Component, subassembly, and payload testing were considered for each plan, along with calculations of optimum test levels and expected costs. The plans were ranked according to both minimizing expected project costs and maximizing vibroacoustic reliability. It was found that optimum costs may vary by up to $6 million, with the lowest-cost plan eliminating component testing and maintaining flight vibration reliability via subassembly tests at high acoustic levels.
The effect of inclusion classrooms on the science achievement of general education students
NASA Astrophysics Data System (ADS)
Dodd, Matthew Robert
General education and special education students from three high schools in Rutherford County were sampled to determine the effect of Inclusion classrooms on their academic achievement on the Tennessee Biology I Gateway Exam. Each student's predicted and actual Gateway Exam scores from the 2006-2007 academic year were used to determine the effect the student's classroom had on their academic achievement. Independent variables used in the study were gender, ethnicity, socioeconomic level, grade point average, type of classroom (general or Inclusion), and type of student (general education or special education). The statistical tests used in this study were a t-test and a Mann-Whitney U test. In this study, the effect of the Inclusion classroom on general education students was not statistically significant. Although the Inclusion classroom allows special education students to succeed in the classroom, the effect on general education students is negligible. This study also provided statistical evidence that the Inclusion classroom did not improve the special education students' academic performance on the Gateway Exam. Students in a general education classroom with a GPA above 3.000, and those from households without a low socioeconomic status, performed at a statistically different level in this study.
Detecting Test Tampering Using Item Response Theory
ERIC Educational Resources Information Center
Wollack, James A.; Cohen, Allan S.; Eckerly, Carol A.
2015-01-01
Test tampering, especially on tests for educational accountability, is an unfortunate reality, necessitating that the state (or its testing vendor) perform data forensic analyses, such as erasure analyses, to look for signs of possible malfeasance. Few statistical approaches exist for detecting fraudulent erasures, and those that do largely do not…
The Real World Significance of Performance Prediction
ERIC Educational Resources Information Center
Pardos, Zachary A.; Wang, Qing Yang; Trivedi, Shubhendu
2012-01-01
In recent years, the educational data mining and user modeling communities have been aggressively introducing models for predicting student performance on external measures such as standardized tests as well as within-tutor performance. While these models have brought statistically reliable improvement to performance prediction, the real world…
Berry, Christopher M; Zhao, Peng
2015-01-01
Predictive bias studies have generally suggested that cognitive ability test scores overpredict job performance of African Americans, meaning these tests are not predictively biased against African Americans. However, at least 2 issues call into question existing over-/underprediction evidence: (a) a bias identified by Aguinis, Culpepper, and Pierce (2010) in the intercept test typically used to assess over-/underprediction and (b) a focus on the level of observed validity instead of operational validity. The present study developed and utilized a method of assessing over-/underprediction that draws on the math of subgroup regression intercept differences, does not rely on the biased intercept test, allows for analysis at the level of operational validity, and can use meta-analytic estimates as input values. Therefore, existing meta-analytic estimates of key parameters, corrected for relevant statistical artifacts, were used to determine whether African American job performance remains overpredicted at the level of operational validity. African American job performance was typically overpredicted by cognitive ability tests across levels of job complexity and across conditions wherein African American and White regression slopes did and did not differ. Because the present study does not rely on the biased intercept test and because appropriate statistical artifact corrections were carried out, the present study's results are not affected by the 2 issues mentioned above. The present study represents strong evidence that cognitive ability tests generally overpredict job performance of African Americans. (c) 2015 APA, all rights reserved.
2011-01-01
Background Energy-based surgical scalpels are designed to efficiently transect and seal blood vessels using thermal energy to promote protein denaturation and coagulation. Assessment and design improvement of ultrasonic scalpel performance relies on both in vivo and ex vivo testing. The objective of this work was to design and implement a robust, experimental test matrix with randomization restrictions and predictive statistical power, which allowed for identification of those experimental variables that may affect the quality of the seal obtained ex vivo. Methods The design of the experiment included three factors: temperature (two levels); the type of solution used to perfuse the artery during transection (three types); and artery type (two types) resulting in a total of twelve possible treatment combinations. Burst pressures of porcine carotid and renal arteries sealed ex vivo were assigned as the response variable. Results The experimental test matrix was designed and carried out as a split-plot experiment in order to assess the contributions of several variables and their interactions while accounting for randomization restrictions present in the experimental setup. The statistical software package SAS was utilized and PROC MIXED was used to account for the randomization restrictions in the split-plot design. The combination of temperature, solution, and vessel type had a statistically significant impact on seal quality. Conclusions The design and implementation of a split-plot experimental test-matrix provided a mechanism for addressing the existing technical randomization restrictions of ex vivo ultrasonic scalpel performance testing, while preserving the ability to examine the potential effects of independent factors or variables. 
This method for generating the experimental design and the statistical analyses of the resulting data are adaptable to a wide variety of experimental problems involving large-scale tissue-based studies of medical or experimental device efficacy and performance. PMID:21599963
Relative Performance of HPV and Cytology Components of Cotesting in Cervical Screening.
Schiffman, Mark; Kinney, Walter K; Cheung, Li C; Gage, Julia C; Fetterman, Barbara; Poitras, Nancy E; Lorey, Thomas S; Wentzensen, Nicolas; Befano, Brian; Schussler, John; Katki, Hormuzd A; Castle, Philip E
2018-05-01
The main goal of cervical screening programs is to detect and treat precancer before cancer develops. Human papillomavirus (HPV) testing is more sensitive than cytology for detecting precancer. However, reports of rare HPV-negative, cytology-positive cancers are motivating continued use of both tests (cotesting) despite increased testing costs. We quantified the detection of cervical precancer and cancer by cotesting compared with HPV testing alone at Kaiser Permanente Northern California (KPNC), where 1 208 710 women age 30 years and older have undergone triennial cervical cotesting since 2003. Screening histories preceding cervical cancers (n = 623) and precancers (n = 5369) were examined to assess the relative contribution of the cytology and HPV test components in identifying cases. The performances of HPV testing and cytology were compared using contingency table methods, general estimating equation models, and nonparametric statistics; all statistical tests were two-sided. HPV testing identified more women subsequently diagnosed with cancer (P < .001) and precancer (P < .001) than cytology. HPV testing was statistically significantly more likely to be positive for cancer at any time point (P < .001), except within 12 months (P = .10). HPV-negative/cytology-positive results preceded only small fractions of cases of precancer (3.5%) and cancer (5.9%); these cancers were more likely to be regional or distant stage with squamous histopathology than other cases. Given the rarity of cancers among screened women, the contribution of cytology to screening translated to earlier detection of at most five cases per million women per year. Two-thirds (67.9%) of women found to have cancer during 10 years of follow-up at KPNC were detected by the first cotest performed. The added sensitivity of cotesting vs HPV alone for detection of treatable cancer affected extremely few women.
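The paired comparison of HPV and cytology positivity among the same cases is the natural setting for McNemar's test, which depends only on the discordant pairs; the sketch below uses an exact binomial form of that test with hypothetical counts (not the KPNC data, and not the paper's estimating-equation models):

```python
from scipy import stats

# hypothetical paired screening results among women later diagnosed with precancer:
# b = HPV-positive / cytology-negative, c = HPV-negative / cytology-positive
b, c = 1900, 190

# exact McNemar test: under H0 the discordant pairs split 50/50
p = stats.binomtest(min(b, c), b + c, 0.5).pvalue
hpv_detects_more = b > c and p < 0.05
```

Only the discordant counts b and c enter the test, which is why the rare HPV-negative/cytology-positive cases dominate the debate over cotesting even though they are a small fraction of all cases.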
Visual acuity in young elite motorsport athletes: a preliminary report.
Schneiders, Anthony G; Sullivan, S John; Rathbone, Emma J; Louise Thayer, A; Wallis, Laura M; Wilson, Alexandra E
2010-05-01
To determine whether elite motorsport athletes demonstrate superior levels of visual acuity compared with age- and sex-matched controls. A cross-sectional observational study. A university vision and balance laboratory. Young male motorsport athletes from the New Zealand Elite Motorsport Academy and healthy age- and sex-matched controls. Vision performance tests comprising: Static Visual Acuity (SVA), Dynamic Visual Acuity (DVA), the Gaze Stabilization Test (GST), and the Perception Time Test (PTT). Motorsport athletes demonstrated superior visual acuity compared to age- and sex-matched controls for all measures; while this was not statistically significant for SVA, GST and DVA, it reached statistical significance for the PTT (p
Relation between arithmetic performance and phonological working memory in children.
Silva, Kelly da; Zuanetti, Patrícia Aparecida; Borcat, Vanessa Trombini Ribeiro; Guedes-Granzotti, Raphaela Barroso; Kuroishi, Rita Cristina Sadako; Domenis, Daniele Ramos; Fukuda, Marisa Tomoe Hebihara
2017-08-17
To compare performance on Loop Phonological Working Memory (LPWM) tasks in children without global learning alterations who showed lower versus average/higher arithmetic performance. The study was conducted with 30 children, between the ages of seven and nine years, who attended the second or third grade of elementary school in the public network. Children with signs suggestive of hearing loss or neurological disorders, poor performance on the reading comprehension test, or enrollment in speech therapy were excluded. The children included in the study completed the arithmetic subtest of the Academic Achievement Test for division into two groups: G1, composed of children with low performance in arithmetic, and G2, composed of children with average/higher performance in arithmetic. All children underwent LPWM assessment through a pseudoword repetition test. Statistical analysis was performed using the Mann-Whitney test, and a p-value <0.05 was considered significant. The study included 20 girls and 10 boys, with a mean age of 8.7 years. G1 comprised 17 children and G2 comprised 13 children. There was a statistically significant difference between the groups for the repetition of pseudowords with three and four syllables. The results of this study support the hypothesis that alterations in phonological working memory are related to difficulties in arithmetic tests.
ERIC Educational Resources Information Center
Soyibo, Kola; Pinnock, Jacqueline
2005-01-01
This study aimed at establishing if the level of performance of 500 Jamaican Grade 11 students on an achievement test on the concept of respiration was satisfactory (mean = 28 or 70% and above) or not (less than 70%); if there were statistically significant differences in their performance on the concept linked to their gender, cognitive abilities…
NASA Astrophysics Data System (ADS)
Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir
2016-03-01
The U-statistic method is one of the most important structural methods for separating anomaly from background. It considers the location of samples, carries out statistical analysis of the data without judging from a geochemical point of view, and tries to separate subpopulations and determine anomalous areas. In the present study, to use the U-statistic method in three-dimensional (3D) conditions, the U-statistic is applied to the grades of two ideal test examples, taking sample Z values (elevation) into account. To our knowledge, this is the first time the method has been applied in a 3D setting. To evaluate the performance of the 3D U-statistic method, and to compare the U-statistic with a non-structural method, the threshold-assessment method based on the median and standard deviation (MSD method) is applied to the same two test examples. Results show that the samples flagged as anomalous by the U-statistic method are more regular and less dispersed than those flagged by the MSD method, so that, based on the locations of the anomalous samples, their denser areas can be delineated as promising zones. Moreover, results show that at a threshold of U = 0, the total misclassification error of the U-statistic method is much smaller than that of the x̄ + n × s criterion. Finally, a 3D model of the two test examples, separating anomaly from background using the 3D U-statistic method, is provided. The source code of a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all geochemical varieties and can be used in similar exploration projects.
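The non-structural MSD criterion that the abstract compares against is simple enough to sketch. Below is a minimal, hypothetical Python illustration (the function name and data are ours, not from the paper's MATLAB code) of flagging samples whose grade exceeds the mean plus n standard deviations:

```python
import statistics

def msd_anomaly_threshold(grades, n=2):
    """Non-structural anomaly separation: flag samples whose grade
    exceeds mean + n * standard deviation (the x-bar + n*s criterion).
    Unlike the U-statistic, sample locations are ignored entirely."""
    mean = statistics.mean(grades)
    s = statistics.stdev(grades)
    threshold = mean + n * s
    return [g for g in grades if g > threshold]

grades = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 1.3, 0.8]
print(msd_anomaly_threshold(grades))  # [5.0]
```

Because the criterion looks only at grade values, two spatially scattered samples with the same grade are treated identically, which is why the abstract finds the flagged samples more dispersed than those of the location-aware U-statistic.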
Self tuning system for industrial surveillance
Stephan, Wegerich W; Jarman, Kristin K.; Gross, Kenneth C.
2000-01-01
A method and system for automatically establishing operational parameters of a statistical surveillance system. The method and system perform a frequency-domain transformation on time-dependent data; a first Fourier composite is formed, serial correlation is removed, a series of Gaussian whiteness tests is performed along with an autocorrelation test, Fourier coefficients are stored, and a second Fourier composite is formed. Pseudorandom noise is added, and a Monte Carlo simulation is performed to establish SPRT missed-alarm probabilities, which are tested with a synthesized signal. A false-alarm test is then empirically evaluated and, if the result is less than a desired target value, the SPRT probabilities are used for performing surveillance.
40 CFR 80.47 - Performance-based Analytical Test Method Approach.
Code of Federal Regulations, 2014 CFR
2014-07-01
... chemistry and statistics, or at least a bachelor's degree in chemical engineering, from an accredited... be compensated for any known chemical interferences using good laboratory practices. (3) The test... section, individual test results shall be compensated for any known chemical interferences using good...
Performance map of a cluster detection test using extended power
2013-01-01
Background Conventional power studies possess limited ability to assess the performance of cluster detection tests. In particular, they cannot evaluate the accuracy of the cluster location, which is essential in such assessments. Furthermore, they usually estimate power for one or a few particular alternative hypotheses and thus cannot assess performance over an entire region. Takahashi and Tango developed the concept of extended power, which indicates both the rate of null hypothesis rejection and the accuracy of the cluster location. We propose a systematic assessment method, based on extended power, to produce a map showing the performance of cluster detection tests over an entire region. Methods To explore the behavior of a cluster detection test on identical cluster types at any possible location, we successively applied four different spatial and epidemiological parameters. These parameters determined four cluster collections, each covering the entire study region. We simulated 1,000 datasets for each cluster and analyzed them with Kulldorff’s spatial scan statistic. From the area under the extended power curve, we constructed a map for each parameter set showing the performance of the test across the entire region. Results Consistent with previous studies, the performance of the spatial scan statistic increased with the baseline incidence of disease, the size of the at-risk population, and the strength of the cluster (i.e., the relative risk). Performance was heterogeneous, however, even for very similar clusters (i.e., similar with respect to the aforementioned factors), suggesting the influence of other factors. Conclusions The area under the extended power curve is a single measure of performance and, although it needs further exploration, it is suitable for conducting a systematic spatial evaluation of performance. The performance map we propose enables epidemiologists to assess cluster detection tests across an entire study region. PMID:24156765
Biofeedback-assisted relaxation training to decrease test anxiety in nursing students.
Prato, Catherine A; Yucha, Carolyn B
2013-01-01
Nursing students experiencing debilitating test anxiety may be unable to demonstrate their knowledge and are at risk for poor academic performance. A biofeedback-assisted relaxation training program was created to reduce test anxiety. Anxiety was measured using Spielberger's Test Anxiety Inventory (TAI) and by monitoring peripheral skin temperature, pulse, and respiration rates during the training. Participants were introduced to diaphragmatic breathing, progressive muscle relaxation, and autogenic training. Statistically significant changes occurred in respiratory rates and skin temperatures during the diaphragmatic breathing sessions; in respiratory rates and peripheral skin temperatures during the progressive muscle relaxation sessions; and in respiratory and pulse rates and peripheral skin temperatures during the autogenic sessions. No statistically significant difference was noted between the first and second TAI administrations. Subjective test anxiety scores of the students did not decrease by the end of training. The autogenic training session was most effective, showing statistically significant decreases in respiratory and pulse rates and an increase in peripheral skin temperature.
Statistical validation of normal tissue complication probability models.
Xu, Cheng-Jian; van der Schaaf, Arjen; Van't Veld, Aart A; Langendijk, Johannes A; Schilstra, Cornelis
2012-09-01
To investigate the applicability and value of double cross-validation and permutation tests as established statistical approaches in the validation of normal tissue complication probability (NTCP) models. A penalized regression method, LASSO (least absolute shrinkage and selection operator), was used to build NTCP models for xerostomia after radiation therapy treatment of head-and-neck cancer. Model assessment was based on the likelihood function and the area under the receiver operating characteristic curve. Repeated double cross-validation showed the uncertainty and instability of the NTCP models and indicated that the statistical significance of model performance can be obtained by permutation testing. Repeated double cross-validation and permutation tests are recommended to validate NTCP models before clinical use. Copyright © 2012 Elsevier Inc. All rights reserved.
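The permutation-testing idea recommended above can be sketched generically. The snippet below is an illustrative Python example, not the authors' code: all names are ours, and a simple AUC stands in for the model performance measure. The null distribution is built by shuffling outcome labels and recomputing the metric.

```python
import random

def auc(y_true, y_score):
    """Probability that a random positive outranks a random negative
    (ties count half) -- the Wilcoxon/Mann-Whitney form of the AUC."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permutation_pvalue(y_true, y_score, n_perm=2000, seed=0):
    """Shuffle the outcome labels, recompute the AUC each time, and
    report the fraction of permuted AUCs at least as large as the
    observed one (with the +1 correction for a valid p value)."""
    rng = random.Random(seed)
    observed = auc(y_true, y_score)
    labels = list(y_true)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if auc(labels, y_score) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

In the NTCP setting, the scores would come from the fitted model on held-out patients; a small permutation p value indicates performance beyond what label-shuffled noise produces.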
Ferguson, John; Wheeler, William; Fu, YiPing; Prokunina-Olsson, Ludmila; Zhao, Hongyu; Sampson, Joshua
2013-01-01
With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave). PMID:23092956
NASA Astrophysics Data System (ADS)
Colone, L.; Hovgaard, M. K.; Glavind, L.; Brincker, R.
2018-07-01
A method for mass change detection on wind turbine blades using natural frequencies is presented. The approach is based on two statistical tests. The first test decides if there is a significant mass change and the second test is a statistical group classification based on Linear Discriminant Analysis. The frequencies are identified by means of Operational Modal Analysis using natural excitation. Based on the assumption of Gaussianity of the frequencies, a multi-class statistical model is developed by combining finite element model sensitivities in 10 classes of change location on the blade, the smallest area being 1/5 of the span. The method is experimentally validated for a full scale wind turbine blade in a test setup and loaded by natural wind. Mass change from natural causes was imitated with sand bags and the algorithm was observed to perform well with an experimental detection rate of 1, localization rate of 0.88 and mass estimation rate of 0.72.
Time series, periodograms, and significance
NASA Astrophysics Data System (ADS)
Hernandez, G.
1999-05-01
The geophysical literature shows wide and conflicting usage of the methods employed to extract meaningful information on coherent oscillations from measurements. This makes it difficult, if not impossible, to relate the findings reported by different authors. Therefore, we have undertaken a critical investigation of the tests and methodology used for determining the presence of statistically significant coherent oscillations in periodograms derived from time series. Statistical significance tests are only valid when performed on the independent frequencies present in a measurement. Both the number of possible independent frequencies in a periodogram and the significance tests are determined by the number of degrees of freedom, which is the number of truly independent measurements present in the time series, rather than the number of sample points in the measurement. The number of degrees of freedom is an intrinsic property of the data, and it must be determined from the serial coherence of the time series. As part of this investigation, a detailed study has been performed which clearly illustrates the deleterious effects that the apparently innocent and commonly used processes of filtering, de-trending, and tapering of data have on periodogram analysis, and the consequent difficulties in the interpretation of the statistical significance thus derived. For the sake of clarity, a specific example of actual field measurements containing unevenly spaced measurements, gaps, etc., as well as synthetic examples, has been used to illustrate the periodogram approach, and its pitfalls, leading to the (statistical) significance tests for the presence of coherent oscillations.
Among the insights of this investigation are: (1) the concept of a time series being (statistically) band limited by its own serial coherence and thus having a critical sampling rate which defines one of the necessary requirements for the proper statistical design of an experiment; (2) the design of a critical test for the maximum number of significant frequencies which can be used to describe a time series, while retaining intact the variance of the test sample; (3) a demonstration of the unnecessary difficulties that manipulation of the data brings into the statistical significance interpretation of said data; and (4) the resolution and correction of the apparent discrepancy in significance results obtained by the use of the conventional Lomb-Scargle significance test, when compared with the long-standing Schuster-Walker and Fisher tests.
Factors related to student performance in statistics courses in Lebanon
NASA Astrophysics Data System (ADS)
Naccache, Hiba Salim
The purpose of the present study was to identify factors that may contribute to business students in Lebanese universities having difficulty in introductory and advanced statistics courses. Two statistics courses are required for business majors at Lebanese universities, and students are not required to complete any math courses before taking them. Drawing on recent educational research, this dissertation attempted to identify the relationships between (1) students’ scores on Lebanese university math admissions tests; (2) students’ scores on a test of very basic mathematical concepts; (3) students’ scores on the Survey of Attitudes Toward Statistics (SATS); (4) course performance as measured by students’ final scores in the course; and (5) their scores on the final exam. Data were collected from 561 students enrolled in multiple sections of two courses: 307 students in the introductory statistics course and 260 in the advanced statistics course, across seven campuses in Lebanon over one semester. The multiple regression results revealed four significant relationships at the introductory level: between students’ scores on the math quiz and (1) their final exam scores and (2) their final averages, and between the Cognitive subscale of the SATS and (3) their final exam scores and (4) their final averages. These four significant relationships were also found at the advanced level. In addition, two more significant relationships were found between students’ final averages and the Effort and Affect subscales. No relationship was found between students’ scores on the admissions math tests and either their final exam scores or their final averages, in both the introductory and the advanced courses.
Although these results were consistent across course formats and instructors, they may encourage Lebanese universities to assess the effectiveness of prerequisite math courses. Moreover, these findings may lead the Lebanese Ministry of Education to make changes to the admissions exams, course prerequisites, and course content. Finally, to enhance students’ attitudes, new learning techniques, such as group work during class meetings, can be helpful, and future research should aim to test the effectiveness of these pedagogical techniques on students’ attitudes toward statistics.
JAN transistor and diode characterization test program, JANTX diode 1N5619
NASA Technical Reports Server (NTRS)
Takeda, H.
1977-01-01
A statistical summary of electrical characterization was performed on JANTX 1N5619 silicon diodes. Parameters are presented with test conditions, mean, standard deviation, lowest reading, 10% point, 90% point, and highest reading.
Statistical analysis of target acquisition sensor modeling experiments
NASA Astrophysics Data System (ADS)
Deaver, Dawne M.; Moyer, Steve
2015-05-01
The U.S. Army RDECOM CERDEC NVESD Modeling and Simulation Division is charged with the development and advancement of military target acquisition models to estimate expected soldier performance when using all types of imaging sensors. Two elements of sensor modeling are (1) laboratory-based psychophysical experiments used to measure task performance and calibrate the various models and (2) field-based experiments used to verify the model estimates for specific sensors. In both types of experiments, it is common practice to control or measure environmental, sensor, and target physical parameters in order to minimize the uncertainty of the physics-based modeling. Predicting the minimum number of test subjects required to calibrate or validate the model should be, but is not always, done during test planning. The objective of this analysis is to develop guidelines for test planners that recommend the number and types of test samples required to yield a statistically significant result.
NASA Technical Reports Server (NTRS)
Sprowls, D. O.; Bucci, R. J.; Ponchel, B. M.; Brazill, R. L.; Bretz, P. E.
1984-01-01
A technique is demonstrated for accelerated stress corrosion testing of high-strength aluminum alloys. The method offers better precision and shorter exposure times than traditional pass/fail procedures. The approach uses data from tension tests performed on replicate groups of smooth specimens after various lengths of exposure to static stress. The breaking strength measures the degradation in the test specimen's load-carrying ability due to environmental attack. Analysis of breaking load data by extreme value statistics enables the calculation of survival probabilities and a statistically defined threshold stress applicable to the specific test conditions. A fracture mechanics model is given which quantifies the depth of attack in the stress-corroded specimen by an effective flaw size calculated from the breaking stress and the material's strength and fracture toughness properties. Comparisons are made with experimental results from three tempers of 7075 alloy plate tested by the breaking load method and by traditional tests of statically loaded smooth tension bars and conventional precracked specimens.
An operational definition of a statistically meaningful trend.
Bryhn, Andreas C; Dimberg, Peter H
2011-04-28
Linear trend analysis of time series is standard procedure in many scientific disciplines. If the number of data points is large, a trend may be statistically significant even if the data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends, referred to as statistical meaningfulness, which is a stricter quality criterion than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r² and p values are calculated from regressions of the interval mean values on time. If r² ≥ 0.65 at p ≤ 0.05 in any of these regressions, the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
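The interval-mean step of the criterion is concrete enough to sketch. Below is a minimal Python illustration (the helper names are ours; the p ≤ 0.05 part of the published rule is omitted for brevity) of splitting a series into intervals, averaging each, and computing r² on the interval means:

```python
import statistics

def pearson_r(xs, ys):
    """Plain Pearson correlation, written out with stdlib tools only."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def meaningful_trend_r2(times, values, n_intervals=5):
    """Split the series into n_intervals equal chunks, average each
    chunk, and return r^2 from correlating the interval mean values
    with the interval mean times. The published criterion would call
    the trend statistically meaningful if this r^2 >= 0.65 (and the
    accompanying p <= 0.05, not computed here)."""
    k = len(values) // n_intervals
    xs = [statistics.mean(times[i * k:(i + 1) * k]) for i in range(n_intervals)]
    ys = [statistics.mean(values[i * k:(i + 1) * k]) for i in range(n_intervals)]
    r = pearson_r(xs, ys)
    return r * r
```

Averaging within intervals suppresses scatter around the trend line, which is what makes the criterion stricter than a significance test on the raw points.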
Teacher Effects, Value-Added Models, and Accountability
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2014-01-01
Background: In the last decade, the effects of teachers on student performance (typically manifested as state-wide standardized tests) have been re-examined using statistical models that are known as value-added models. These statistical models aim to compute the unique contribution of the teachers in promoting student achievement gains from grade…
Application of Transformations in Parametric Inference
ERIC Educational Resources Information Center
Brownstein, Naomi; Pensky, Marianna
2008-01-01
The objective of the present paper is to provide a simple approach to statistical inference using the method of transformations of variables. We demonstrate performance of this powerful tool on examples of constructions of various estimation procedures, hypothesis testing, Bayes analysis and statistical inference for the stress-strength systems.…
Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan
2006-07-15
ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and, alternatively, by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristic (ROC) curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by quantitative PCR (qPCR) and with IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting IgV(H) mutation status. Applying the KS test is reproducible, simple, and straightforward, and it overcomes a number of difficulties encountered with the Crespo method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
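The two-sample KS statistic underlying this approach is straightforward to compute: it is the largest vertical gap between the two empirical CDFs. A minimal Python sketch follows (illustrative only; it computes the D statistic itself, not the clinical cut-off, and all names are ours — in the ZAP-70 setting the two samples would be per-cell fluorescence intensities of T cells and B cells):

```python
import bisect

def ks_two_sample_d(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the maximum vertical
    distance between the empirical CDFs of the two samples. D near 0
    means the distributions overlap; D near 1 means they separate."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, v) / len(a)
        cdf_b = bisect.bisect_right(b, v) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Because D compares whole distributions rather than a single marker-positivity percentage, it sidesteps the gating choices that complicate the percentage-based analysis described above.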
On the assessment of the added value of new predictive biomarkers.
Chen, Weijie; Samuelson, Frank W; Gallas, Brandon D; Kang, Le; Sahiner, Berkman; Petrick, Nicholas
2013-07-29
The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. 
Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
49 CFR Appendix A to Part 665 - Tests To Be Performed at the Bus Testing Facility
Code of Federal Regulations, 2010 CFR
2010-10-01
.... Because the operator will not become familiar with the detailed design of all new bus models that are tested, tests to determine the time and skill required to remove and reinstall an engine, a transmission... feasible to conduct statistical reliability tests. The detected bus failures, repair time, and the actions...
49 CFR Appendix A to Part 665 - Tests To Be Performed at the Bus Testing Facility
Code of Federal Regulations, 2011 CFR
2011-10-01
.... Because the operator will not become familiar with the detailed design of all new bus models that are tested, tests to determine the time and skill required to remove and reinstall an engine, a transmission... feasible to conduct statistical reliability tests. The detected bus failures, repair time, and the actions...
49 CFR Appendix A to Part 665 - Tests To Be Performed at the Bus Testing Facility
Code of Federal Regulations, 2013 CFR
2013-10-01
.... Because the operator will not become familiar with the detailed design of all new bus models that are tested, tests to determine the time and skill required to remove and reinstall an engine, a transmission... feasible to conduct statistical reliability tests. The detected bus failures, repair time, and the actions...
Rojas, Jorge A; Bernal, Jaime E; García, Mary A; Zarante, Ignacio; Ramírez, Natalia; Bernal, Constanza; Gelvez, Nancy; Tamayo, Marta L
2014-10-01
The aim of this study was to investigate the characteristics and performance of transient evoked oto-acoustic emission (TEOAE) hearing screening in newborns in Colombia, and to analyze the variables and factors affecting the results. An observational, descriptive, and retrospective study with bivariate analysis was performed. The study population consisted of 56,822 newborns evaluated at the private institution PREGEN. TEOAE testing was carried out as a pediatric hearing screening test from December 2003 to March 2012. The PREGEN database was reviewed, and the evaluation protocol included the same screening test performed twice. Demographic characteristics were recorded and each newborn's background was evaluated. Descriptive statistics were computed for the qualitative and quantitative variables, and statistical associations were assessed using the chi-square test. Of the 56,822 records examined, 0.28% were classed as abnormal, corresponding to a prevalence of 1 in 350. Among the screened newborns, 0.08% had a major abnormality or other clinical condition diagnosed, and 0.29% reported a family history of hearing loss. A prevalence of 6.7 in 10,000 was obtained for microtia, which is similar to the 6.4 in 10,000 previously reported in Colombia (database of the Latin-American Collaborative Study of Congenital Malformations - ECLAMC). Statistical analysis demonstrated an association between presenting with a major anomaly and a higher frequency of abnormal results on both TEOAE tests. Newborns in Colombia do not currently undergo screening for the early detection of hearing impairment. The results from this study suggest TEOAE screening tests, when performed twice, are able to detect hearing abnormalities in newborns. This highlights the need to improve the long-term evaluation and monitoring of patients in Colombia through diagnostic tests, and to provide tests that are both sensitive and specific.
Furthermore, the use of TEOAE screening is justified by the favorable cost-benefit ratio demonstrated in many countries worldwide. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Robust inference for group sequential trials.
Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei
2017-03-01
For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is a loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of two P-value combining methods for group sequential trials. The emphasis is on time-to-event trials, although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analyses at different times. The robustness of the results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.
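The abstract does not pin down its combiner, so as an illustration only, Fisher's classical method for combining P values can be sketched in a few lines. It assumes independent P values, an assumption a real group sequential analysis would need to address; under H0, X = -2 Σ ln(p) follows a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form for even degrees of freedom:

```python
import math

def fisher_combined_p(pvalues):
    """Fisher's method for combining k independent P values.
    X = -2 * sum(ln p) ~ chi-square with 2k df under H0; for even df
    the survival function is exp(-x/2) * sum_{i<k} (x/2)^i / i!."""
    x = -2.0 * sum(math.log(p) for p in pvalues)
    k = len(pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total
```

Combining a single P value returns it unchanged, and two concordantly small P values yield a combined value smaller than either, which is the diversification effect the abstract's investment analogy describes.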
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mahfuz, H.; Maniruzzaman, M.; Vaidya, U.
1997-04-01
Monotonic tensile and fatigue response of continuous silicon carbide fiber reinforced silicon nitride (SiC{sub f}/Si{sub 3}N{sub 4}) composites has been investigated. The monotonic tensile tests were performed at room and elevated temperatures. Fatigue tests were conducted at room temperature (RT), at a stress ratio R = 0.1 and a frequency of 5 Hz. It is observed during the monotonic tests that the composite retains only 30% of its room-temperature strength at 1,600 C, suggesting substantial chemical degradation of the matrix at that temperature. The softening of the matrix at elevated temperature also causes a reduction in tensile modulus; the total reduction in modulus is around 45%. Fatigue data have been generated at three load levels, and the fatigue strength of the composite has been found to be considerably high, about 75% of its ultimate room-temperature strength. Extensive statistical analysis has been performed to understand the degree of scatter in the fatigue as well as in the static test data. Weibull shape factors and characteristic values have been determined for each set of tests, and their relationship with the response of the composites is discussed. A statistical fatigue life prediction method developed from the Weibull distribution is also presented. A maximum likelihood estimator with censoring techniques and data pooling schemes has been employed to determine the distribution parameters for the statistical analysis. These parameters have been used to generate the S-N diagram with the desired level of reliability. Details of the statistical analysis and discussion of the static and fatigue behavior of the composites are presented in this paper.
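One standard use of the fitted Weibull parameters mentioned above — drawing an S-N diagram at a chosen reliability level — follows directly from inverting the two-parameter Weibull survival function. A minimal, hypothetical Python sketch (function and parameter names are ours, not from the paper):

```python
import math

def weibull_life_at_reliability(shape, char_life, reliability):
    """Invert the two-parameter Weibull survival function
    R(t) = exp(-(t / eta)**beta) to obtain the life t with a desired
    survival probability R: t = eta * (-ln R)**(1 / beta). Here
    `shape` plays the role of the Weibull shape factor (beta) and
    `char_life` the characteristic value (eta)."""
    return char_life * (-math.log(reliability)) ** (1.0 / shape)
```

By construction, the life at reliability R = e⁻¹ equals the characteristic life, and demanding higher reliability shortens the predicted life — which is how a reliability-labeled S-N curve sits below the mean-life curve.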
Arsic, S; Konstantinovic, Lj; Eminovic, F; Pavlovic, D; Popovic, M B; Arsic, V
2015-01-01
Cognitive function and attention are thought to affect walking, motion control, and proper conduct during gait. To determine whether stroke patients differ from neurologically intact persons of similar age and education in the quality of attention and cognitive ability, and whether the link between attention and cognition affects motor skills, the sample comprised 50 stroke patients with hemiparesis undergoing rehabilitation and 50 randomly chosen persons without neurological damage. The survey used the following tests: the Trail Making Test (TMT A&B) to assess flexibility of attention; the Mini-Mental State Examination (MMSE) for cognitive status; the Functional Ambulation Category (FAC) test to assess functional status and gait parameters (speed, stride frequency, and stride length); and the STEP test to assess precision of movement and balance. In stroke patients, the relationship between age and MMSE performance was marginally significant; the relationship between TMT A&B performance and age was not statistically significant, whereas the relationship between MMSE performance and education was. In stroke patients, MMSE performance correlated with stride frequency and stride length. The quality of cognitive function and attention is associated with motor skills but differs between stroke patients and people without neurological damage of similar age. This correlation can inform research in neurorehabilitation, improve the quality of medical rehabilitation, and contribute to efficient recovery of these patients.
Repeatability of Cryogenic Multilayer Insulation
NASA Astrophysics Data System (ADS)
Johnson, W. L.; Vanderlaan, M.; Wood, J. J.; Rhys, N. O.; Guo, W.; Van Sciver, S.; Chato, D. J.
2017-12-01
Due to the variety of requirements across aerospace platforms and one-off projects, the repeatability of cryogenic multilayer insulation (MLI) has never been fully established. The objective of this test program is to provide a more basic understanding of the thermal performance repeatability of MLI systems that are applicable to large-scale tanks. There are several different types of repeatability that can be accounted for: these include repeatability between identical blankets, repeatability of installation of the same blanket, and repeatability of a test apparatus. The focus of the work in this report is on the first two types of repeatability. Statistically, repeatability can mean many different things. In its simplest form, it refers to the range of performance that a population exhibits and the average of the population. However, as more and more identical components are made (i.e., as the population of concern grows), the simple range morphs into a standard deviation from an average performance. Initial repeatability testing on MLI blankets has been completed at Florida State University. Repeatability of five Glenn Research Center (GRC)-provided coupons with 25 layers was shown to be +/- 8.4%, whereas repeatability of repeatedly installing a single coupon was shown to be +/- 8.0%. A second group of 10 coupons has been fabricated by Yetispace and tested by Florida State University; the repeatability between coupons has been shown to be +/- 15-25%. Based on detailed statistical analysis, the data have been shown to be statistically significant.
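The two statistical views of repeatability mentioned above (half-range for a small population, standard deviation for a larger one) can be computed directly. The heat-flux numbers below are hypothetical, not the GRC or Yetispace data.

```python
import numpy as np

# Hypothetical heat-flux measurements (W/m^2) from five nominally identical
# 25-layer MLI coupons
q = np.array([1.02, 0.95, 1.10, 0.98, 1.05])

mean_q = q.mean()
# Small-population view: half the observed range, as a fraction of the mean
range_repeat = 100 * (q.max() - q.min()) / 2 / mean_q
# Larger-population view: sample standard deviation as a fraction of the mean
std_repeat = 100 * q.std(ddof=1) / mean_q

print(f"mean = {mean_q:.3f} W/m^2")
print(f"repeatability (half-range) = +/-{range_repeat:.1f}%")
print(f"repeatability (1-sigma)    = +/-{std_repeat:.1f}%")
```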
Yang, Yi; Tokita, Midori; Ishiguchi, Akira
2018-01-01
A number of studies revealed that our visual system can extract different types of summary statistics, such as the mean and variance, from sets of items. Although the extraction of such summary statistics has been well studied in isolation, the relationship between these statistics remains unclear. In this study, we explored this issue using an individual differences approach. Observers viewed illustrations of strawberries and lollipops varying in size or orientation and performed four tasks in a within-subject design, namely mean and variance discrimination tasks in the size and orientation domains. We found that performances in the mean and variance discrimination tasks were not correlated with each other, demonstrating that extraction of the mean and of the variance is mediated by different representation mechanisms. In addition, we tested the relationship between performances in the size and orientation domains for each summary statistic (i.e., mean and variance) and examined whether each summary statistic has distinct processes across perceptual domains. The results illustrated that statistical summary representations of size and orientation may share a common mechanism for representing the mean and possibly for representing variance. Introspections for each observer performing the tasks were also examined and discussed. PMID:29399318
NASA Technical Reports Server (NTRS)
Tripp, John S.; Tcheng, Ping
1999-01-01
Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.
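The confidence and prediction intervals described above can be sketched for a simple calibration fit. This assumes a hypothetical linear sensor model and simulated replicated calibration data; the actual analysis used multivariate nonlinear least squares across pitch and roll.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical calibration: sensor output vs applied pitch angle (deg),
# replicated over three runs
angle = np.tile(np.linspace(-10, 10, 9), 3)
output = 0.05 + 1.98 * angle + rng.normal(0, 0.1, angle.size)

def model(x, a, b):
    return a + b * x

popt, pcov = curve_fit(model, angle, output)
resid = output - model(angle, *popt)
dof = angle.size - 2
s2 = np.sum(resid**2) / dof          # residual variance

# 95% intervals of the calibrated response at a new angle
x_new = 5.0
J = np.array([1.0, x_new])           # gradient of the model w.r.t. (a, b)
var_fit = J @ pcov @ J               # uncertainty of the fitted curve itself
t_crit = stats.t.ppf(0.975, dof)
ci = t_crit * np.sqrt(var_fit)       # confidence interval (fitted curve)
pi = t_crit * np.sqrt(var_fit + s2)  # prediction interval (adds measurement scatter)
print(f"y({x_new}) = {model(x_new, *popt):.3f}  CI +/-{ci:.3f}  PI +/-{pi:.3f}")
```

The replication across runs is what makes the residual variance, and hence the prediction interval, an honest estimate of calibration precision.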
An Adaptive Association Test for Multiple Phenotypes with GWAS Summary Statistics.
Kim, Junghi; Bai, Yun; Pan, Wei
2015-12-01
We study the problem of testing for single marker-multiple phenotype associations based on genome-wide association study (GWAS) summary statistics, without access to individual-level genotype and phenotype data. Because obtaining summary data for most published GWASs is substantially easier than accessing individual-level phenotype and genotype data, and because multiple correlated traits are often collected, the problem studied here has become increasingly important. We propose a powerful adaptive test and compare its performance with some existing tests. We illustrate its applications to analyses of a meta-analyzed GWAS dataset with three blood lipid traits and another with sex-stratified anthropometric traits, and further demonstrate its potential power gain over some existing methods through realistic simulation studies. We start from the situation with only one set of (possibly meta-analyzed) genome-wide summary statistics, then extend the method to meta-analysis of multiple sets of genome-wide summary statistics, each from one GWAS. We expect the proposed test to be useful in practice as more powerful than or complementary to existing methods. © 2015 WILEY PERIODICALS, INC.
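A rough sketch of an adaptive multi-phenotype test on summary statistics follows, in the spirit of a sum-of-powered-score test. The Z values and trait correlation matrix are hypothetical, and taking the minimum P over the power index is a crude stand-in for the full adaptive recalibration used in practice.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical Z statistics for one SNP across three correlated traits,
# plus the trait correlation matrix (in practice estimated from
# genome-wide summary data under the null)
z = np.array([2.1, 1.8, -0.3])
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

# Sum of powered scores SPU(gamma); larger gamma emphasizes the strongest trait
gammas = [1, 2, 4, 8]

# Monte Carlo null: draw Z vectors from N(0, R)
null_z = rng.multivariate_normal(np.zeros(3), R, size=20000)

p_gamma = []
for g in gammas:
    obs = abs(np.sum(z**g))
    null_stats = np.abs(np.sum(null_z**g, axis=1))
    p_gamma.append(np.mean(null_stats >= obs))

# Adaptive summary: the smallest p value over gamma (the real aSPU test
# recalibrates this minimum with a second Monte Carlo layer)
print(dict(zip(gammas, p_gamma)))
print("min p over gamma:", min(p_gamma))
```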
Intranasal Rapamycin Rescues Mice from Staphylococcal Enterotoxin B-Induced Shock
2012-09-18
[Standard report documentation page omitted.] ... Student's t-test. Statistical comparisons of survival data were performed by Fisher's exact test with Stata software (Stata Corp., College Station, TX).
Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods
NASA Technical Reports Server (NTRS)
Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan
2016-01-01
The goal of this work is, through computational simulations, to provide statistically based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.
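The clustered-versus-distributed comparison can be illustrated with a small simulation: both designs spend the same 24 test points, but only the distributed design samples many conditions. The response function, noise level, and quadratic response-surface model are all invented here, so the conclusion mirrors the abstract only for this toy setup.

```python
import numpy as np

rng = np.random.default_rng(2)

# A "true" response with curvature that a quadratic model cannot fully capture
def truth(x):
    return x + 0.8 * np.sin(3 * x)

noise = 0.05
n_points = 24

# Clustered design: many replicates at only three conditions
x_clustered = np.repeat([0.0, 1.0, 2.0], n_points // 3)
# Distributed design: one point at each of many different conditions
x_distributed = np.linspace(0.0, 2.0, n_points)

def fit_and_score(x):
    y = truth(x) + rng.normal(0, noise, x.size)
    coef = np.polyfit(x, y, deg=2)          # quadratic response surface
    grid = np.linspace(0.0, 2.0, 201)
    return np.sqrt(np.mean((np.polyval(coef, grid) - truth(grid)) ** 2))

rmse_c = fit_and_score(x_clustered)
rmse_d = fit_and_score(x_distributed)
print(f"clustered RMSE = {rmse_c:.3f}, distributed RMSE = {rmse_d:.3f}")
```

When the assumed model form is imperfect, the distributed design averages the lack of fit over the whole range, while the clustered fit effectively interpolates three condition means and errs badly in between.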
Testing independence of bivariate interval-censored data using modified Kendall's tau statistic.
Kim, Yuneung; Lim, Johan; Park, DoHwan
2015-11-01
In this paper, we study a nonparametric procedure to test independence of bivariate interval-censored data, for both current status data (case 1 interval-censored data) and case 2 interval-censored data. To do so, we propose a score-based modification of Kendall's tau statistic for bivariate interval-censored data. Our modification defines the Kendall's tau statistic in terms of the expected numbers of concordant and discordant pairs of data. The performance of the modified approach is illustrated by simulation studies and an application to the AIDS study. We compare our method to alternative approaches such as the two-stage estimation method by Sun et al. (Scandinavian Journal of Statistics, 2006) and the multiple imputation method by Betensky and Finkelstein (Statistics in Medicine, 1999b). © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
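As a naive point of comparison for the score-based modification, one can impute interval midpoints and apply the classical Kendall's tau. The simulated censoring scheme below is hypothetical; the paper's statistic instead works with the expected concordant and discordant pair counts under the censoring.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(3)

# Hypothetical bivariate event times, positively dependent
n = 200
x = rng.exponential(1.0, n)
y = 0.7 * x + rng.exponential(0.5, n)

# Case 2 interval censoring: each time is observed only up to a width-0.5 interval
def censor(t):
    left = np.floor(t / 0.5) * 0.5
    return left, left + 0.5

xl, xr = censor(x)
yl, yr = censor(y)

# Naive baseline: impute midpoints and apply classical Kendall's tau (tau-b,
# which handles the ties the coarse intervals create)
tau, p = kendalltau((xl + xr) / 2, (yl + yr) / 2)
print(f"midpoint-imputed tau = {tau:.3f}, p = {p:.2e}")
```

Midpoint imputation ignores the within-interval uncertainty, which is exactly what the expected-count modification is designed to account for.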
Abstracts of ARI Research Publications, FY 1978
1980-09-01
initial item pool, 49 items were identified as having significant item-to-total-score correlations and were statistically determined to address a...failing. Differences among the three groups on main gun performance measures and the previous experience of gunners were not statistically significant...forms of the noncognitive coding speed test; and (d) a second field administration to derive norms and other statistical characteristics of the new
The Shock and Vibration Digest. Volume 14, Number 12
1982-12-01
to evaluate the uses of statistical energy analysis for determining sound transmission performance. Coupling loss factors were measured and compared...measurements for the artificial (Also see No. 2623) cracks in mild-steel test pieces. 82-2676 Improvement of the Method of Statistical Energy Analysis for...eters, using a large number of free-response time histories In the application of the statistical energy analysis theory simultaneously in one analysis
GPR-Based Water Leak Models in Water Distribution Systems
Ayala-Cabrera, David; Herrera, Manuel; Izquierdo, Joaquín; Ocaña-Levario, Silvia J.; Pérez-García, Rafael
2013-01-01
This paper addresses the problem of leakage in water distribution systems through the use of ground penetrating radar (GPR) as a nondestructive method. Laboratory tests are performed to extract features of water leakage from the obtained GPR images. Moreover, a test in a real-world urban system under real conditions is performed. Feature extraction is performed by interpreting GPR images with the support of a pre-processing methodology based on an appropriate combination of statistical methods and multi-agent systems. The results of these tests are presented, interpreted, analyzed and discussed in this paper.
NASA Astrophysics Data System (ADS)
Maries, Alexandru; Singh, Chandralekha
2015-12-01
It has been found that activation of a stereotype, for example by indicating one's gender before a test, typically alters performance in a way consistent with the stereotype, an effect called "stereotype threat." On a standardized conceptual physics assessment, we found that asking test takers to indicate their gender right before taking the test did not degrade performance compared to an equivalent group who did not provide gender information. Although a statistically significant gender gap was present on the standardized test whether or not students indicated their gender, no gender gap was observed on the multiple-choice final exam students took, which included both quantitative and conceptual questions on similar topics.
Saadati, Farzaneh; Ahmad Tarmizi, Rohani
2015-01-01
Because students’ ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding the pedagogical characteristics of learning within an e-learning system is ‘value added’, as it facilitates the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, the i-CAM could significantly promote students’ problem-solving performance at the end of each phase. In addition, the differences in students' test scores were statistically significant after controlling for the pre-test scores. The findings conveyed in this paper confirmed the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students. PMID:26132553
el Galta, Rachid; Uitte de Willige, Shirley; de Visser, Marieke C H; Helmer, Quinta; Hsu, Li; Houwing-Duistermaat, Jeanine J
2007-09-24
In this paper, we propose a one degree of freedom test for association between a candidate gene and a binary trait. This method is a generalization of Terwilliger's likelihood ratio statistic and is especially powerful for the situation of one associated haplotype. As an alternative to the likelihood ratio statistic, we derive a score statistic, which has a tractable expression. For haplotype analysis, we assume that phase is known. By means of a simulation study, we compare the performance of the score statistic to Pearson's chi-square statistic and the likelihood ratio statistic proposed by Terwilliger. We illustrate the method on three candidate genes studied in the Leiden Thrombophilia Study. We conclude that the statistic follows a chi square distribution under the null hypothesis and that the score statistic is more powerful than Terwilliger's likelihood ratio statistic when the associated haplotype has frequency between 0.1 and 0.4 and has a small impact on the studied disorder. With regard to Pearson's chi-square statistic, the score statistic has more power when the associated haplotype has frequency above 0.2 and the number of variants is above five.
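For reference, Pearson's chi-square statistic on a haplotype-by-status contingency table, one of the comparators above, can be computed directly. The counts are invented for illustration, with one haplotype enriched in cases (the "one associated haplotype" situation).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical haplotype counts among cases and controls
#             H1   H2   H3   H4
cases    = [ 90,  60,  30,  20]
controls = [ 60,  70,  40,  30]

table = np.array([cases, controls])
chi2, p, dof, expected = chi2_contingency(table)
print(f"Pearson chi-square = {chi2:.2f}, dof = {dof}, p = {p:.4f}")
```

With k haplotypes this test spends k - 1 degrees of freedom, whereas the one-degree-of-freedom score statistic concentrates its power on the single associated haplotype.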
Environmental Health Practice: Statistically Based Performance Measurement
Enander, Richard T.; Gagnon, Ronald N.; Hanumara, R. Choudary; Park, Eugene; Armstrong, Thomas; Gute, David M.
2007-01-01
Objectives. State environmental and health protection agencies have traditionally relied on a facility-by-facility inspection-enforcement paradigm to achieve compliance with government regulations. We evaluated the effectiveness of a new approach that uses a self-certification random sampling design. Methods. Comprehensive environmental and occupational health data from a 3-year statewide industry self-certification initiative were collected from representative automotive refinishing facilities located in Rhode Island. Statistical comparisons between baseline and postintervention data facilitated a quantitative evaluation of statewide performance. Results. The analysis of field data collected from 82 randomly selected automotive refinishing facilities showed statistically significant improvements (P<.05, Fisher exact test) in 4 major performance categories: occupational health and safety, air pollution control, hazardous waste management, and wastewater discharge. Statistical significance was also shown when a modified Bonferroni adjustment for multiple comparisons was performed. Conclusions. Our findings suggest that the new self-certification approach to environmental and worker protection is effective and can be used as an adjunct to further enhance state and federal enforcement programs. PMID:17267709
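The per-category Fisher exact tests with a Bonferroni-style adjustment described above can be sketched as follows, with hypothetical compliance counts in place of the Rhode Island data.

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 counts per performance category:
# rows = (baseline, post-intervention), columns = (noncompliant, compliant)
categories = {
    "occupational health": ([32, 50], [12, 70]),
    "air pollution":       ([37, 45], [19, 63]),
    "hazardous waste":     ([42, 40], [22, 60]),
    "wastewater":          ([27, 55], [14, 68]),
}

alpha, m = 0.05, len(categories)
pvals = {}
for name, (baseline, post) in categories.items():
    _, p = fisher_exact([baseline, post])
    pvals[name] = p
    # Bonferroni adjustment: judge each test against alpha / m
    verdict = "significant" if p < alpha / m else "not significant"
    print(f"{name:20s} p = {p:.4f} -> {verdict} at the adjusted level")
```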
Visual-Motor Test Performance: Race and Achievement Variables.
ERIC Educational Resources Information Center
Fuller, Gerald B.; Friedrich, Douglas
1979-01-01
Rural Black and White children of variant academic achievement were tested on the Minnesota Percepto-Diagnostic Test, which consists of six gestalt designs for the subject to copy. Analyses resulted only in a significant achievement effect; when intellectual level was statistically controlled, race was not a significant variable. (Editor/SJL)
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variant tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by instead calculating score statistics, which only require fitting the null model for each study, and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
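The aggregation step above, summing per-study score statistics and their variances, can be sketched for a single burden statistic. The numbers are invented; the full method also carries between-variant covariance (linkage disequilibrium) matrices per gene.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical per-study summary statistics for one gene: a burden score
# statistic U_k and its variance V_k from each study's null-model fit
U = np.array([12.4, 8.1, 15.7])   # score statistics, three studies
V = np.array([30.2, 22.5, 41.0])  # corresponding variances

# Meta-analysis burden test: sum scores and variances across studies,
# asymptotically equivalent to pooling individual-level data
U_meta = U.sum()
V_meta = V.sum()
z = U_meta / np.sqrt(V_meta)
p = 2 * norm.sf(abs(z))
print(f"meta burden Z = {z:.3f}, p = {p:.4f}")
```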
The Differential Effect of Sustained Operations on Psychomotor Skills of Helicopter Pilots.
McMahon, Terry W; Newman, David G
2018-06-01
Flying a helicopter is a complex psychomotor skill requiring constant control inputs from pilots. A deterioration in the psychomotor performance of a helicopter pilot may be detrimental to operational safety. The aim of this study was to test the hypothesis that psychomotor performance deteriorates over time during sustained operations and that the effect is more pronounced in the feet than the hands. The subjects were helicopter pilots conducting sustained multicrew offshore flight operations in a demanding environment. The remote flight operations involved constant workload in hot environmental conditions with complex operational tasking. Over a period of 6 d, 10 helicopter pilots were tested. At the completion of daily flying duties, a helicopter-specific screen-based compensatory tracking task measuring tracking accuracy (over a 5-min period) tested both hands and feet. Data were compared over time and tested for statistical significance for both deterioration and differential effect. A statistically significant deterioration of psychomotor performance was evident in the pilots over time for both hands and feet. There was also a statistically significant differential effect between the hands and the feet in terms of tracking accuracy. The hands recorded a 22.6% decrease in tracking accuracy, while the feet recorded a 39.9% decrease in tracking accuracy. The differential effect may be due to prioritization of limb movement by the motor cortex due to factors such as workload-induced cognitive fatigue. This may result in a greater reduction in performance in the feet than the hands, posing a significant risk to operational safety. McMahon TW, Newman DG. The differential effect of sustained operations on psychomotor skills of helicopter pilots. Aerosp Med Hum Perform. 2018; 89(6):496-502.
Rare-Variant Association Analysis: Study Designs and Statistical Tests
Lee, Seunggeung; Abecasis, Gonçalo R.; Boehnke, Michael; Lin, Xihong
2014-01-01
Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions. PMID:24995866
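The contrast between burden and variance-component tests compared above can be sketched on simulated data. Everything here is a simplified illustration: weights are uniform, the null covariance uses a crude plug-in, and the SKAT-style null distribution is moment-matched to a scaled chi-square rather than the exact chi-square mixture.

```python
import numpy as np
from scipy.stats import norm, chi2

rng = np.random.default_rng(4)

# Simulated data: n subjects, m rare variants, effects in the same direction
n, m = 1000, 10
maf = rng.uniform(0.005, 0.02, m)
G = rng.binomial(2, maf, size=(n, m)).astype(float)
beta = np.full(m, 0.4)                 # all causal, same sign
y = G @ beta + rng.normal(0, 1, n)

yc = y - y.mean()
U = G.T @ yc                           # per-variant score statistics
Gc = G - G.mean(axis=0)
V = (Gc.T @ Gc) * yc.var()             # their null covariance (plug-in)

# Burden test: collapse variants into one weighted sum; powerful when
# effects share a direction
w = np.ones(m)
z_burden = (w @ U) / np.sqrt(w @ V @ w)
p_burden = 2 * norm.sf(abs(z_burden))

# Variance-component (SKAT-style) test: sum of squared scores; null is a
# chi-square mixture, approximated here by matching the first two moments
Q = U @ U
mu, var = np.trace(V), 2 * np.trace(V @ V)
scale, df = var / (2 * mu), 2 * mu**2 / var
p_skat = chi2.sf(Q / scale, df)
print(f"burden p = {p_burden:.2e}, SKAT-like p = {p_skat:.2e}")
```

With mixed-sign effects the burden statistic would cancel toward zero while the squared-score statistic would not, which is the trade-off the review describes.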
Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi
2015-01-01
Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that it requires the simultaneous analysis of a pair of outcome measures, such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and could be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models, including the bivariate model and the hierarchical summary receiver operating characteristic model, are increasingly being accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107
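To make the pairing of sensitivity and specificity concrete, here is a deliberately simplified univariate pooling on the logit scale with hypothetical study counts. The bivariate and HSROC models recommended above additionally model the between-study correlation and the threshold effect, both of which this sketch ignores.

```python
import numpy as np

# Hypothetical per-study 2x2 counts: (TP, FN, TN, FP)
studies = np.array([
    [45,  5, 80, 20],
    [30, 10, 60, 15],
    [55,  8, 90, 12],
    [20,  6, 40, 10],
])
tp, fn, tn, fp = studies.T

def logit(p):
    return np.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + np.exp(-x))

# Study-level sensitivity and specificity with a 0.5 continuity correction
sens = (tp + 0.5) / (tp + fn + 1)
spec = (tn + 0.5) / (tn + fp + 1)

def pool(p, n):
    # Fixed-effect inverse-variance pooling on the logit scale;
    # var(logit(p)) is approximately 1 / (n * p * (1 - p))
    w = n * p * (1 - p)
    return inv_logit(np.sum(w * logit(p)) / np.sum(w))

pooled_sens = pool(sens, tp + fn)
pooled_spec = pool(spec, tn + fp)
print(f"pooled sensitivity = {pooled_sens:.3f}")
print(f"pooled specificity = {pooled_spec:.3f}")
```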
Gene-Based Association Analysis for Censored Traits Via Fixed Effect Functional Regressions.
Fan, Ruzong; Wang, Yifan; Yan, Qi; Ding, Ying; Weeks, Daniel E; Lu, Zhaohui; Ren, Haobo; Cook, Richard J; Xiong, Momiao; Swaroop, Anand; Chew, Emily Y; Chen, Wei
2016-02-01
Genetic studies of survival outcomes have been proposed and conducted recently, but statistical methods for identifying genetic variants that affect disease progression are rarely developed. Motivated by our ongoing real studies, here we develop Cox proportional hazard models using functional regression (FR) to perform gene-based association analysis of survival traits while adjusting for covariates. The proposed Cox models are fixed effect models where the genetic effects of multiple genetic variants are assumed to be fixed. We introduce likelihood ratio test (LRT) statistics to test for associations between the survival traits and multiple genetic variants in a genetic region. Extensive simulation studies demonstrate that the proposed Cox FR LRT statistics have well-controlled type I error rates. To evaluate power, we compare the Cox FR LRT with the previously developed burden test (BT) in a Cox model and sequence kernel association test (SKAT), which is based on mixed effect Cox models. The Cox FR LRT statistics have higher power than or similar power as Cox SKAT LRT except when 50%/50% causal variants had negative/positive effects and all causal variants are rare. In addition, the Cox FR LRT statistics have higher power than Cox BT LRT. The models and related test statistics can be useful in whole-genome and whole-exome association studies. An age-related macular degeneration dataset was analyzed as an example. © 2016 WILEY PERIODICALS, INC.
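The likelihood ratio test mechanics underlying the Cox FR LRT can be illustrated with a much simpler parametric survival model: two groups under an exponential hazard, where the maximum likelihood estimates are closed-form. The event counts and follow-up times below are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical survival data for non-carriers vs carriers of a variant:
# number of events d and total person-time at risk T per group
d = np.array([30, 55])
T = np.array([400.0, 350.0])

# Exponential model: the MLE of the hazard is events / person-time, and the
# maximized log-likelihood is sum(d * log(rate) - rate * T)
def loglik(d, T, rate):
    return np.sum(d * np.log(rate) - rate * T)

rate_alt = d / T                  # separate hazard per group (alternative)
rate_null = d.sum() / T.sum()     # common hazard (null)

lrt = 2 * (loglik(d, T, rate_alt) - loglik(d, T, rate_null))
p = chi2.sf(lrt, df=1)            # one extra free parameter under the alternative
print(f"LRT = {lrt:.2f}, p = {p:.4f}")
```

The Cox FR LRT follows the same recipe, with the functional-regression genetic effect playing the role of the extra parameters and the chi-square degrees of freedom set accordingly.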
Furlan, Leonardo; Sterr, Annette
2018-01-01
Motor learning studies face the challenge of differentiating between real changes in performance and random measurement error. While the traditional p-value-based analyses of difference (e.g., t-tests, ANOVAs) provide information on the statistical significance of a reported change in performance scores, they do not inform as to the likely cause or origin of that change, that is, the contribution of both real modifications in performance and random measurement error to the reported change. One way of differentiating between real change and random measurement error is through the utilization of the statistics of standard error of measurement (SEM) and minimal detectable change (MDC). SEM is estimated from the standard deviation of a sample of scores at baseline and a test-retest reliability index of the measurement instrument or test employed. MDC, in turn, is estimated from SEM and a degree of confidence, usually 95%. The MDC value might be regarded as the minimum amount of change that needs to be observed for it to be considered a real change, or a change to which the contribution of real modifications in performance is likely to be greater than that of random measurement error. A computer-based motor task was designed to illustrate the applicability of SEM and MDC to motor learning research. Two studies were conducted with healthy participants. Study 1 assessed the test-retest reliability of the task and Study 2 consisted of a typical motor learning study, where participants practiced the task for five consecutive days. In Study 2, the data were analyzed with a traditional p-value-based analysis of difference (ANOVA) and also with SEM and MDC.
The findings showed good test-retest reliability for the task and that the p-value-based analysis alone identified statistically significant improvements in performance over time even when the observed changes could in fact have been smaller than the MDC and thereby caused mostly by random measurement error, as opposed to by learning. We suggest therefore that motor learning studies could complement their p-value-based analyses of difference with statistics such as SEM and MDC in order to inform as to the likely cause or origin of any reported changes in performance.
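The SEM and MDC estimates described above follow directly from two quantities: SEM = SD × sqrt(1 − reliability), and MDC95 = 1.96 × sqrt(2) × SEM, where sqrt(2) accounts for measurement error in both the pre and post scores. The baseline standard deviation and reliability index below are hypothetical.

```python
import math

# Hypothetical values: baseline standard deviation of performance scores and
# the test-retest reliability (e.g., an ICC) of the motor task
sd_baseline = 12.0
icc = 0.85

# Standard error of measurement
sem = sd_baseline * math.sqrt(1 - icc)
# Minimal detectable change at 95% confidence
mdc95 = 1.96 * math.sqrt(2) * sem

print(f"SEM   = {sem:.2f} points")
print(f"MDC95 = {mdc95:.2f} points")
# An observed improvement smaller than mdc95 could plausibly be
# measurement error rather than real learning
```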
Preparing for the first meeting with a statistician.
De Muth, James E
2008-12-15
Practical statistical issues that should be considered when performing data collection and analysis are reviewed. The meeting with a statistician should take place early in the research development before any study data are collected. The process of statistical analysis involves establishing the research question, formulating a hypothesis, selecting an appropriate test, sampling correctly, collecting data, performing tests, and making decisions. Once the objectives are established, the researcher can determine the characteristics or demographics of the individuals required for the study, how to recruit volunteers, what type of data are needed to answer the research question(s), and the best methods for collecting the required information. There are two general types of statistics: descriptive and inferential. Presenting data in a more palatable format for the reader is called descriptive statistics. Inferential statistics involve making an inference or decision about a population based on results obtained from a sample of that population. In order for the results of a statistical test to be valid, the sample should be representative of the population from which it is drawn. When collecting information about volunteers, researchers should only collect information that is directly related to the study objectives. Important information that a statistician will require first is an understanding of the type of variables involved in the study and which variables can be controlled by researchers and which are beyond their control. Data can be presented in one of four different measurement scales: nominal, ordinal, interval, or ratio. Hypothesis testing involves two mutually exclusive and exhaustive statements related to the research question. Statisticians should not be replaced by computer software, and they should be consulted before any research data are collected. 
When preparing to meet with a statistician, the pharmacist researcher should be familiar with the steps of statistical analysis and consider several questions related to the study to be conducted.
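The descriptive-versus-inferential distinction above can be illustrated with a minimal Python sketch; the sample values below are invented, and Welch's two-sample t is used only as one example of an inferential test:

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's two-sample t statistic: an inferential test of whether two
    population means differ, without assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    return (mean_a - mean_b) / math.sqrt(var_a / na + var_b / nb)

# Descriptive statistics summarize the samples themselves ...
control = [5.1, 4.8, 5.5, 5.0, 4.9, 5.2]
treated = [5.9, 6.1, 5.7, 6.3, 5.8, 6.0]
print("means:", statistics.fmean(control), statistics.fmean(treated))

# ... while the inferential statistic supports a decision about the
# populations the two samples were drawn from.
print("Welch t:", welch_t(treated, control))
```

The means describe only the samples in hand; the t statistic is what licenses an inference about the underlying populations.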
Quantifying Contextual Interference and Its Effect on Skill Transfer in Skilled Youth Tennis Players
Buszard, Tim; Reid, Machar; Krause, Lyndon; Kovalchik, Stephanie; Farrow, Damian
2017-01-01
The contextual interference effect is a well-established motor learning phenomenon. Most of the contextual interference effect literature has addressed simple skills, while less is known about the role of contextual interference in complex sport skill practice, particularly with respect to skilled performers. The purpose of this study was to assess contextual interference when practicing the tennis serve. Study 1 evaluated tennis serve practice of nine skilled youth tennis players using a novel statistical metric developed specifically to measure between-skill and within-skill variability as sources of contextual interference. This metric highlighted that skilled tennis players typically engaged in serve practice that featured low contextual interference. In Study 2, 16 skilled youth tennis players participated in 10 practice sessions that aimed to improve serving “down the T.” Participants were stratified into a low contextual interference practice group (Low CI) and a moderate contextual interference practice group (Moderate CI). Pre- and post-tests were conducted 1 week before and 1 week after the practice period. Testing involved a skill test, which assessed serving performance in a closed setting, and a transfer test, which assessed serving performance in a match-play setting. No significant contextual interference differences were observed with respect to practice performance. However, analysis of pre- and post-test serve performance revealed significant Group × Time interactions. The Moderate CI group showed no change in serving performance (service displacement from the T) from pre- to post-test in the skill test, but did display improvements in the transfer test. Conversely, the Low CI group improved serving performance (service displacement from the T) in the skill test but not the transfer test. 
Results suggest that the typical contextual interference effect is less clear when practicing a complex motor skill, at least with the tennis serve skill evaluated here. We encourage researchers and applied sport scientists to use our statistical metric to measure contextual interference. PMID:29163306
Henderson, Joseph W; Kane, Sarah M; Mangel, Jeffrey M; Kikano, Elias G; Garibay, Jorge A; Pollard, Robert R; Mahajan, Sangeeta T; Debanne, Sara M; Hijaz, Adonis K
2018-06-01
The cough stress test is a common and accepted tool to evaluate stress urinary incontinence but there is no agreement on how the test should be performed. We assessed the diagnostic ability of different cough stress tests performed when varying patient position and bladder volume using urodynamic stress urinary incontinence as the gold standard. The 24-hour pad test was also evaluated. We recruited women who presented to specialty outpatient clinics with the complaint of urinary incontinence and who were recommended to undergo urodynamic testing. A total of 140 patients were randomized to 4 cough stress test groups, including group 1-a comfortably full bladder, group 2-an empty bladder, group 3- a bladder infused with 200 cc saline and group 4-a bladder filled to half functional capacity. The sequence of standing and sitting was randomly assigned. The groups were compared by 1-way ANOVA or the generalized Fisher exact test. The κ statistic was used to evaluate agreement between the sitting and standing positions. The 95% CIs of sensitivity and specificity were calculated using the Wilson method. ROC analysis was done to evaluate the performance of the 24-hour pad test. The cough stress test performed with a bladder filled to half functional capacity was the best performing test with 83% sensitivity and 90% specificity. There was no statistically significant evidence that the sensitivity or specificity of 1 cough stress test differed from that of the others. The pad test had no significant predictive ability to diagnose urodynamic stress urinary incontinence (AUC 0.60, p = 0.08). Cough stress tests were accurate to diagnose urodynamic stress urinary incontinence. The 24-hour pad test was not predictive of urodynamic stress urinary incontinence and not helpful when used in conjunction with the cough stress test. Copyright © 2018 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.
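The Wilson method used above for the confidence intervals of sensitivity and specificity can be sketched in a few lines of Python; the counts in the example are illustrative, not taken from the study:

```python
import math

def wilson_ci(successes, n, z=1.96):
    """95% Wilson score confidence interval for a proportion, e.g. the
    sensitivity or specificity of a diagnostic test."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Hypothetical counts: 29 of 35 true positives detected (~83% sensitivity).
lo, hi = wilson_ci(29, 35)
print(round(lo, 3), round(hi, 3))
```

Unlike the simple normal ("Wald") interval, the Wilson interval stays inside [0, 1] and behaves well at small n, which is why it is preferred for test-accuracy proportions.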
permGPU: Using graphics processing units in RNA microarray association studies.
Shterev, Ivo D; Jung, Sin-Ho; George, Stephen L; Owzar, Kouros
2010-06-16
Many analyses of microarray association studies involve permutation, bootstrap resampling, and cross-validation, which are ideally formulated as embarrassingly parallel computing problems. Given that these analyses are computationally intensive, scalable approaches that can take advantage of multi-core processor systems need to be developed. We have developed a CUDA-based implementation, permGPU, that employs graphics processing units in microarray association studies. We illustrate the performance and applicability of permGPU within the context of permutation resampling for a number of test statistics. An extensive simulation study demonstrates a dramatic increase in performance when using permGPU on an NVIDIA GTX 280 card compared to an optimized C/C++ solution running on a conventional Linux server. permGPU is available as an open-source stand-alone application and as an extension package for the R statistical environment. It provides a dramatic increase in performance for permutation resampling analysis in the context of microarray association studies. The current version offers six test statistics for carrying out permutation resampling analyses for binary, quantitative and censored time-to-event traits.
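A CPU-bound sketch of the permutation resampling scheme that permGPU parallelizes may clarify the computation; the expression values, group sizes, and permutation count below are arbitrary, and the mean-difference statistic stands in for whichever of the six test statistics is chosen:

```python
import random
import statistics

def permutation_pvalue(expr_group_a, expr_group_b, n_perm=2000, seed=7):
    """Two-group permutation test on a mean-difference statistic.
    Each permutation is independent, which is what makes the problem
    embarrassingly parallel on a GPU."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(expr_group_a) - statistics.fmean(expr_group_b))
    pooled = list(expr_group_a) + list(expr_group_b)
    n_a = len(expr_group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of group membership
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

print(permutation_pvalue([1.2, 1.4, 1.1, 1.3], [2.0, 2.2, 1.9, 2.1]))
```

In a microarray study this inner loop runs once per probe over thousands of probes, which is why the GPU speedup reported above matters.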
SPReM: Sparse Projection Regression Model for High-Dimensional Linear Regression
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. SPReM is devised specifically to address the low statistical power of many standard statistical approaches, such as Hotelling's T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844
NASA Astrophysics Data System (ADS)
Otake, Y.; Murphy, R. J.; Grupp, R. B.; Sato, Y.; Taylor, R. H.; Armand, M.
2015-03-01
A robust atlas-to-subject registration using a statistical deformation model (SDM) is presented. The SDM uses statistics of voxel-wise displacement learned from pre-computed deformation vectors of a training dataset. This allows an atlas instance to be directly translated into an intensity volume and compared with a patient's intensity volume. Rigid and nonrigid transformation parameters were simultaneously optimized via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), with image similarity used as the objective function. The algorithm was tested on CT volumes of the pelvis from 55 female subjects. A performance comparison of the CMA-ES and Nelder-Mead downhill simplex optimization algorithms with the mutual information and normalized cross correlation similarity metrics was conducted. Simulation studies using synthetic subjects were performed, as well as leave-one-out cross validation studies. Both studies suggested that mutual information and CMA-ES achieved the best performance. The leave-one-out test demonstrated 4.13 mm error with respect to the true displacement field, and 26,102 function evaluations in 180 seconds, on average.
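The mutual information similarity metric that performed best can be sketched as a joint-histogram estimator; this is a generic version operating on flattened intensity lists scaled to [0, 1], not the registration pipeline's actual code:

```python
import math
from collections import Counter

def mutual_information(img_a, img_b, bins=8):
    """Histogram-based mutual information between two equal-length intensity
    lists with values in [0, 1]. Higher values mean the intensities of one
    image are more predictable from the other."""
    def bin_of(v):
        return min(int(v * bins), bins - 1)

    n = len(img_a)
    joint = Counter((bin_of(a), bin_of(b)) for a, b in zip(img_a, img_b))
    marg_a = Counter(a for a, _ in joint.elements())
    marg_b = Counter(b for _, b in joint.elements())
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), in nats
        mi += (count / n) * math.log(count * n / (marg_a[a] * marg_b[b]))
    return mi

# An image compared with itself carries maximal shared information;
# compared with a constant image it carries none.
x = [0.1, 0.9] * 50
print(mutual_information(x, x), mutual_information(x, [0.5] * 100))
```

An optimizer such as CMA-ES would maximize this quantity between the deformed atlas rendering and the patient volume.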
Time-compressed speech test in the elderly.
Arceno, Rayana Silva; Scharlach, Renata Coelho
2017-09-28
The present study aimed to evaluate the performance of elderly people on the time-compressed speech test according to the variables ear and order of presentation, and to analyze the types of errors made by the volunteers. This was an observational, descriptive, quantitative, analytical, primary cross-sectional study involving 22 elderly individuals aged 60 to 80 with normal hearing or mild sensorineural hearing loss. The participants took the time-compressed speech test at a compression ratio of 60%, generated by the electromechanical time-compression method. A list of 50 disyllables was applied to each ear, with the starting side chosen at random. Regarding test performance, the elderly fell short relative to adults, and there was no statistical difference between the ears; there was, however, statistical evidence of better performance for the second ear tested. The most frequently missed words were those beginning with the phonemes /p/ and /d/, and the presence of a consonant cluster in a word also increased the occurrence of errors. The elderly perform worse on the auditory closure ability when assessed by the time-compressed speech test compared with adults. This result suggests that elderly people have difficulty recognizing speech produced at faster rates. Strategies should therefore be used to facilitate the communicative process, regardless of the presence of hearing loss.
Model-Free CUSUM Methods for Person Fit
ERIC Educational Resources Information Center
Armstrong, Ronald D.; Shi, Min
2009-01-01
This article demonstrates the use of a new class of model-free cumulative sum (CUSUM) statistics to detect person fit given the responses to a linear test. The fundamental statistic being accumulated is the likelihood ratio of two probabilities. The detection performance of this CUSUM scheme is compared to other model-free person-fit statistics…
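The accumulation described above, a CUSUM over log-likelihood ratios with a reflecting barrier at zero, can be sketched as follows; the threshold and ratio values are illustrative, not the article's calibrated values:

```python
def cusum_path(log_likelihood_ratios, threshold=3.0):
    """Classic one-sided CUSUM: accumulate per-item log-likelihood ratios
    (aberrant vs. normal response probability), floor the sum at zero, and
    flag the first index at which the statistic crosses the threshold."""
    s, path, alarm = 0.0, [], None
    for i, z in enumerate(log_likelihood_ratios):
        s = max(0.0, s + z)  # reflecting barrier at zero
        path.append(s)
        if alarm is None and s > threshold:
            alarm = i
    return path, alarm

# Normal responding (negative ratios) keeps the sum pinned at zero;
# a run of aberrant responses (positive ratios) trips the alarm.
path, alarm = cusum_path([-0.5] * 5 + [1.5] * 4)
print(path, alarm)
```

The floor at zero is what gives the CUSUM its sensitivity to sustained runs of misfit rather than isolated aberrant responses.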
ERIC Educational Resources Information Center
Altonji, Joseph G.; Pierret, Charles R.
A statistical analysis was performed to test the hypothesis that, if profit-maximizing firms have limited information about the general productivity of new workers, they may choose to use easily observable characteristics such as years of education to discriminate statistically among workers. Information about employer learning was obtained by…
Performance of the S-χ² Statistic for Full-Information Bifactor Models
ERIC Educational Resources Information Center
Li, Ying; Rupp, Andre A.
2011-01-01
This study investigated the Type I error rate and power of the multivariate extension of the S-χ² statistic using unidimensional and multidimensional item response theory (UIRT and MIRT, respectively) models as well as full-information bifactor (FI-bifactor) models through simulation. Manipulated factors included test length, sample…
Comparing the Lifetimes of Two Brands of Batteries
ERIC Educational Resources Information Center
Dunn, Peter K.
2013-01-01
In this paper, we report a case study that illustrates the importance in interpreting the results from statistical tests, and shows the difference between practical importance and statistical significance. This case study presents three sets of data concerning the performance of two brands of batteries. The data are easy to describe and…
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends strongly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are examined using several medical examples. We present two clustering examples on ordinal variables, the more challenging variable type to analyze, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed with two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold-standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively; their specificity was comparable. Conclusion: using a clustering algorithm appropriate to the measurement scale of the variables in a study grants high performance. Moreover, descriptive and inferential statistics, as well as the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Central Tire Inflation: Demonstration Tests in the South
R.B. Rummer; C. Ashmore; D.L. Sirois; C.L. Rawlins
1990-01-01
Tests of prototype Central Tire Inflation (CTI) systems were conducted to quantify CTI performance, road wear, and truck vibration. The CTI systems were tested in both experimental and operational settings. Changes in the road surface that occurred during the tests could not be statistically attributed to reduced tire pressure. Vibration at the seat base, however,...
Bull, Leona
2007-02-01
The aim of the study was to determine the clinical and perceived effectiveness of the Sunflower therapy in the treatment of childhood dyslexia. The Sunflower therapy includes applied kinesiology, physical manipulation, massage, homeopathy, herbal remedies and neuro-linguistic programming. A multi-centred, randomised controlled trial was undertaken with 70 dyslexic children aged 6-13 years. The study tested the hypothesis that dyslexic children 'feel better' and 'perform better' as a result of treatment by the Sunflower therapy. Children in the treatment group and the control group were assessed using a battery of standardised cognitive, literacy and self-esteem tests before and after the intervention. Parents of children in the treatment group gave feedback on their experience of the Sunflower therapy. Test scores were compared using the Mann-Whitney and Wilcoxon statistical tests. While both groups of children improved on some of their test scores over time, there were no statistically significant improvements in cognitive or literacy test performance associated with the treatment. There were, however, statistically significant improvements in academic self-esteem and reading self-esteem for the treatment group. The majority of parents (57.13%) felt that the Sunflower therapy was effective in the treatment of learning difficulties. Further research is required to verify these findings and should include a control group receiving a dummy treatment to exclude placebo effects.
Lu, Fletcher; Lemonde, Manon
2013-12-01
The objective of this study was to assess if online teaching delivery produces comparable student test performance as the traditional face-to-face approach irrespective of academic aptitude. This study involves a quasi-experimental comparison of student performance in an undergraduate health science statistics course partitioned in two ways. The first partition involves one group of students taught with a traditional face-to-face classroom approach and the other through a completely online instructional approach. The second partition of the subjects categorized the academic aptitude of the students into groups of higher and lower academically performing based on their assignment grades during the course. Controls that were placed on the study to reduce the possibility of confounding variables were: the same instructor taught both groups covering the same subject information, using the same assessment methods and delivered over the same period of time. The results of this study indicate that online teaching delivery is as effective as a traditional face-to-face approach in terms of producing comparable student test performance but only if the student is academically higher performing. For academically lower performing students, the online delivery method produced significantly poorer student test results compared to those lower performing students taught in a traditional face-to-face environment.
NASA Technical Reports Server (NTRS)
Abbey, Craig K.; Eckstein, Miguel P.
2002-01-01
We consider estimation and statistical hypothesis testing on classification images obtained from the two-alternative forced-choice experimental paradigm. We begin with a probabilistic model of task performance for simple forced-choice detection and discrimination tasks. Particular attention is paid to general linear filter models because these models lead to a direct interpretation of the classification image as an estimate of the filter weights. We then describe an estimation procedure for obtaining classification images from observer data. A number of statistical tests are presented for testing various hypotheses from classification images based on some more compact set of features derived from them. As an example of how the methods we describe can be used, we present a case study investigating detection of a Gaussian bump profile.
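Under the general linear filter model described above, the classification image reduces to a difference of response-conditioned noise averages. A hypothetical simulation (a four-pixel "stimulus" and a hidden linear template, not the paper's experimental setup) sketches the estimator:

```python
import random

def classification_image(noise_fields, responses):
    """Estimate a linear observer's template as the mean noise field on
    trials answered '1' minus the mean noise field on trials answered '0'."""
    dim = len(noise_fields[0])
    sums = {0: [0.0] * dim, 1: [0.0] * dim}
    counts = {0: 0, 1: 0}
    for field, r in zip(noise_fields, responses):
        counts[r] += 1
        for i, v in enumerate(field):
            sums[r][i] += v
    return [sums[1][i] / counts[1] - sums[0][i] / counts[0] for i in range(dim)]

# Simulated observer whose hidden template weights pixel 0 positively and
# pixel 1 negatively; the estimate should recover those signs.
rng = random.Random(0)
fields = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(4000)]
resp = [1 if f[0] - f[1] > 0 else 0 for f in fields]
img = classification_image(fields, resp)
print([round(v, 2) for v in img])
```

Pixels the observer ignores average out toward zero, which is what makes hypothesis tests on features of the classification image meaningful.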
Hong, Hye Jeong; Kim, Jin Sung; Seo, Wan Seok; Koo, Bon Hoon; Bai, Dai Seg; Jeong, Jin Young
2010-01-01
Objective We investigated executive functions (EFs), as evaluated by the Wisconsin Card Sorting Test (WCST) and other EF measures, between lower grades (LG) and higher grades (HG) in elementary-school-age attention deficit hyperactivity disorder (ADHD) children. Methods We classified a sample of 112 ADHD children into 4 groups (of 28 each) based on age (LG vs. HG) and WCST performance [lower vs. higher performance on the WCST, defined by the number of completed categories (CC)]. Participants in each group were matched according to age, gender, ADHD subtype, and intelligence. We used the Wechsler Intelligence Scale for Children, 3rd edition, to test intelligence and the Computerized Neurocognitive Function Test-IV, which included the WCST, to test EF. Results Comparisons of EF scores in LG ADHD children showed statistically significant differences in performing digit spans backward, some verbal learning scores, including all memory scores, and Stroop test scores. However, comparisons of EF scores in HG ADHD children did not show any statistically significant differences. Correlation analyses of the CC and EF variables and stepwise multiple regression analysis in LG ADHD children showed that a combination of the backward forms of the Digit Span test and Visual Span test in lower-performance ADHD participants significantly predicted the number of CC (R2=0.273, p<0.001). Conclusion This study suggests that the design of any battery of neuropsychological tests for measuring EF in ADHD children should first consider age before interpreting developmental variations and neuropsychological test results. Researchers should consider the dynamics of relationships within EF, as measured by neuropsychological tests. PMID:20927306
Terrestrial photovoltaic cell process testing
NASA Technical Reports Server (NTRS)
Burger, D. R.
1985-01-01
The paper examines critical test parameters, criteria for selecting appropriate tests, and the use of statistical controls and test patterns to enhance PV-cell process test results. The coverage of critical test parameters is evaluated by examining available test methods and then screening these methods by considering the ability to measure those critical parameters which are most affected by the generic process, the cost of the test equipment and test performance, and the feasibility for process testing.
Quiet Eye Training Facilitates Competitive Putting Performance in Elite Golfers
Vine, Samuel J.; Moore, Lee J.; Wilson, Mark R.
2011-01-01
The aim of this study was to examine the effectiveness of a brief quiet eye (QE) training intervention aimed at optimizing visuomotor control and putting performance of elite golfers under pressure, and in real competition. Twenty-two elite golfers (mean handicap 2.7) recorded putting statistics over 10 rounds of competitive golf before attending training individually. Having been randomly assigned to either a QE training or Control group, participants were fitted with an Applied Science Laboratories Mobile Eye tracker and performed 20 baseline (pre-test) putts from 10 ft. Training consisted of video feedback of their gaze behavior while they completed 20 putts; however the QE-trained group received additional instructions related to maintaining a longer QE period. Participants then recorded their putting statistics over a further 10 competitive rounds and re-visited the laboratory for retention and pressure tests of their visuomotor control and putting performance. Overall, the results were supportive of the efficacy of the QE training intervention. QE duration predicted 43% of the variance in putting performance, underscoring its critical role in the visuomotor control of putting. The QE-trained group maintained their optimal QE under pressure conditions, whereas the Control group experienced reductions in QE when anxious, with subsequent effects on performance. Although their performance was similar in the pre-test, the QE-trained group holed more putts and left the ball closer to the hole on missed putts than their Control group counterparts in the pressure test. Importantly, these advantages transferred to the golf course, where QE-trained golfers made 1.9 fewer putts per round, compared to pre-training, whereas the Control group showed no change in their putting statistics. These results reveal that QE training, incorporated into a pre-shot routine, is an effective intervention to help golfers maintain control when anxious. PMID:21713182
Color stability and degree of cure of direct composite restoratives after accelerated aging.
Sarafianou, Aspasia; Iosifidou, Soultana; Papadopoulos, Triantafillos; Eliades, George
2007-01-01
This study evaluated the color changes and the amount of remaining C=C bonds (%RDB) in three dental composites after hydrothermal- and photoaging. The materials tested were Estelite sigma, Filtek Supreme and Tetric Ceram. Specimens were fabricated from each material and subjected to L*a*b* colorimetry and FTIR spectroscopy before and after aging. Statistical evaluation of the ΔL*, Δa*, Δb*, ΔE and %ΔRDB data was performed by one-way ANOVA and Tukey's test. The %RDB data before and after aging were statistically analyzed using two-way ANOVA and the Student-Newman-Keuls test. In all cases an α = 0.05 significance level was used. No statistically significant differences were found in ΔL*, Δa*, ΔE and %ΔRDB among the materials tested. Tetric Ceram demonstrated a significant difference in Δb*. All the materials showed visually perceptible (ΔE > 1) but clinically acceptable (ΔE < 3.3) values. Within each material group, statistically significant differences in %RDB were noticed before and after aging (p < 0.05). Filtek Supreme presented the lowest %RDB before aging, with Tetric Ceram presenting the lowest %RDB after aging (p < 0.05). The %ΔRDB mean values were statistically significantly different among all the groups tested. No correlation was found between ΔE and %ΔRDB.
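The ΔE values reported above are Euclidean distances in L*a*b* space (the CIE76 formula); a one-line sketch with made-up coordinates:

```python
import math

def delta_e(lab_1, lab_2):
    """CIE76 color difference between two (L*, a*, b*) measurements."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(lab_1, lab_2)))

# Hypothetical before/after pair: a shift of 3 units in a* and 4 in b*
# gives delta E = 5, i.e. visually perceptible (> 1) and beyond the
# clinical acceptability bound (> 3.3).
print(delta_e((65.0, 2.0, 10.0), (65.0, 5.0, 14.0)))
```

The per-axis differences ΔL*, Δa* and Δb* are the individual terms under the square root, which is why they can differ significantly between materials while the combined ΔE does not.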
2000-08-01
Many aviators develop ametropias during their careers, with refractive error having comparable effects on luminance performance. Visual acuity differed statistically (0.04 logMAR, p = 0.01), but not clinically (<1/2 line), from the non-aviator group. Contrast sensitivity on the SLCT decreased for the aviator group by a statistically significant margin (0.11 ± 0.1 logCS, t = 4.0, p < 0.001), yet there is significant overlap between the groups.
Khng, Kiat Hui
2017-11-01
A pre-test/post-test, intervention-versus-control experimental design was used to examine the effects, mechanisms and moderators of deep breathing on state anxiety and test performance in 122 Primary 5 students. Taking deep breaths before a timed math test significantly reduced self-reported feelings of anxiety and improved test performance. There was a statistical trend towards greater effectiveness in reducing state anxiety for boys compared to girls, and in enhancing test performance for students with higher autonomic reactivity in test-like situations. The latter moderation was significant when comparing high-versus-low autonomic reactivity groups. Mediation analyses suggest that deep breathing reduces state anxiety in test-like situations, creating a better state-of-mind by enhancing the regulation of adaptive-maladaptive thoughts during the test, allowing for better performance. The quick and simple technique can be easily learnt and effectively applied by most children to immediately alleviate some of the adverse effects of test anxiety on psychological well-being and academic performance.
ERIC Educational Resources Information Center
Hess, Richard Wayne
Stability of performance on a criterion referenced reading test was examined for 413 students in grades one through six. The test, which measures 367 behavioral reading objectives, was administered twice to each student, with an interval of at least three weeks between the first and second administrations. Three statistical indices of permanence…
ERIC Educational Resources Information Center
De Ball, Suzanne; Sullivan, Kathleen; Horine, Julie; Duncan, William K.; Replogle, William
2002-01-01
Compared University of Mississippi dental student scores on the Dental Admission Test (DAT) and Part I of the National Board Dental Examinations (NBDE) and found that DAT reading comprehension was a statistically significant predictor of all four subtests of the NBDE. Also found that DAT biology and organic chemistry scores were predictors of NBDE…
Ho, Lindsey A; Lange, Ethan M
2010-12-01
Genome-wide association (GWA) studies are a powerful approach for identifying novel genetic risk factors associated with human disease. A GWA study typically requires the inclusion of thousands of samples to have sufficient statistical power to detect single nucleotide polymorphisms that are associated with only modest increases in risk of disease, given the heavy burden of the multiple-test correction that is necessary to maintain valid statistical tests. Low statistical power and the high financial cost of performing a GWA study remain prohibitive for many scientific investigators eager to perform such a study using their own samples. A number of remedies have been suggested to increase statistical power and decrease cost, including the utilization of free publicly available genotype data and multi-stage genotyping designs. Herein, we compare the statistical power and relative costs of alternative association study designs that use cases and screened controls to study designs that are based only on, or additionally include, free public control genotype data. We describe a novel replication-based two-stage study design, which uses free public control genotype data in the first stage and follow-up genotype data on case-matched controls in the second stage, that preserves many of the advantages inherent when using only an epidemiologically matched set of controls. Specifically, we show that our proposed two-stage design can substantially increase statistical power and decrease the cost of performing a GWA study while controlling the type I error rate, which can be inflated when using public controls due to differences in ancestry and batch genotype effects.
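The power cost of the multiple-test correction can be made concrete with a normal-approximation sketch; the non-centrality value and SNP count below are illustrative, not figures from the study:

```python
from statistics import NormalDist

def approx_power(ncp, n_tests, alpha=0.05):
    """Approximate power of a two-sided z-test whose statistic has
    non-centrality `ncp`, under a Bonferroni correction for `n_tests`
    tests (the negligible lower rejection tail is ignored)."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / (2 * n_tests))
    return 1 - nd.cdf(z_crit - ncp)

# The same true signal that gives roughly 85% power as a single test is
# nearly undetectable once a 500,000-SNP genome-wide correction is paid,
# which is why sample sizes in the thousands are needed.
print(approx_power(3.0, 1))
print(approx_power(3.0, 500_000))
```

Multi-stage designs attack exactly this trade-off: a cheap first stage filters the test burden down before the expensive, well-matched second stage.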
Effect of non-normality on test statistics for one-way independent groups designs.
Cribbie, Robert A; Fiksenbaum, Lisa; Keselman, H J; Wilcox, Rand R
2012-02-01
The data obtained from one-way independent groups designs is typically non-normal in form and rarely equally variable across treatment populations (i.e., population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e., the analysis of variance F test) typically provides invalid results (e.g., too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied either to the usual least squares estimators of central tendency and variability, or the Welch test with robust estimators (i.e., trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal. © 2011 The British Psychological Society.
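The Welch test with robust estimators mentioned above, Yuen's procedure, pairs trimmed means with Winsorized variances. A sketch for the two-group case (the one-way multi-group procedures examined in the study extend this statistic):

```python
import math
import statistics

def trimmed_stats(x, prop=0.2):
    """Trimmed mean and Yuen's squared-standard-error term for one group,
    based on the Winsorized sample variance."""
    x = sorted(x)
    n = len(x)
    g = int(prop * n)  # number trimmed from each tail
    trimmed = x[g:n - g]
    # Winsorizing replaces trimmed values with the nearest retained value.
    winsorized = [x[g]] * g + trimmed + [x[n - g - 1]] * g
    h = n - 2 * g  # effective sample size after trimming
    sw2 = statistics.variance(winsorized)
    return statistics.fmean(trimmed), (n - 1) * sw2 / (h * (h - 1))

def yuen_t(a, b, prop=0.2):
    """Yuen's robust analogue of the Welch test: compares trimmed means
    using Winsorized variances, for non-normal, heteroscedastic groups."""
    mean_a, d_a = trimmed_stats(a, prop)
    mean_b, d_b = trimmed_stats(b, prop)
    return (mean_a - mean_b) / math.sqrt(d_a + d_b)

# Replacing the largest observation with an extreme outlier leaves the
# statistic untouched, because the outlier falls in the trimmed tail.
print(yuen_t([1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
             [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]))
```

That insensitivity to tail contamination is the robustness property behind the Type I error and power advantages reported for the trimmed-mean tests.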
Mathes, Robert W; Lall, Ramona; Levin-Rector, Alison; Sell, Jessica; Paladini, Marc; Konty, Kevin J; Olson, Don; Weiss, Don
2017-01-01
The New York City Department of Health and Mental Hygiene has operated an emergency department syndromic surveillance system since 2001, using temporal and spatial scan statistics run on a daily basis for cluster detection. Since the system was originally implemented, a number of new methods have been proposed for use in cluster detection. We evaluated six temporal and four spatial/spatio-temporal detection methods using syndromic surveillance data spiked with simulated injections. The algorithms were compared on several metrics, including sensitivity, specificity, positive predictive value, coherence, and timeliness. We also evaluated each method’s implementation, programming time, run time, and the ease of use. Among the temporal methods, at a set specificity of 95%, a Holt-Winters exponential smoother performed the best, detecting 19% of the simulated injects across all shapes and sizes, followed by an autoregressive moving average model (16%), a generalized linear model (15%), a modified version of the Early Aberration Reporting System’s C2 algorithm (13%), a temporal scan statistic (11%), and a cumulative sum control chart (<2%). Of the spatial/spatio-temporal methods we tested, a spatial scan statistic detected 3% of all injects, a Bayes regression found 2%, and a generalized linear mixed model and a space-time permutation scan statistic detected none at a specificity of 95%. Positive predictive value was low (<7%) for all methods. Overall, the detection methods we tested did not perform well in identifying the temporal and spatial clusters of cases in the inject dataset. The spatial scan statistic, our current method for spatial cluster detection, performed slightly better than the other tested methods across different inject magnitudes and types. Furthermore, we found the scan statistics, as applied in the SaTScan software package, to be the easiest to program and implement for daily data analysis. PMID:28886112
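A bare-bones stand-in for the best-performing temporal detector can illustrate the forecast-and-flag logic; real Holt-Winters smoothing additionally models trend and seasonality, and the counts and threshold here are simulated, not surveillance data:

```python
def smoothing_alarms(series, alpha=0.3, threshold=10.0):
    """Flag time points whose count exceeds the one-step-ahead simple
    exponential smoothing forecast by more than `threshold`."""
    level = series[0]  # current forecast for the next observation
    alarms = []
    for t, y in enumerate(series[1:], start=1):
        if y - level > threshold:
            alarms.append(t)
        level = alpha * y + (1 - alpha) * level  # update after checking
    return alarms

# A simulated inject on day 10 of an otherwise flat syndromic series:
counts = [20] * 10 + [45] + [20] * 5
print(smoothing_alarms(counts))
```

Checking the observation against the forecast before updating the level keeps a large inject from immediately inflating its own baseline, the same one-step-ahead structure the evaluated detectors share.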
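To make the best-performing temporal method above concrete, here is a minimal sketch of a Holt-type (double exponential smoothing, trend but no seasonal term) anomaly detector for daily syndromic counts. This is not the surveillance system's actual implementation; the smoothing constants, the threshold multiplier `k`, and the warm-up length are arbitrary illustrative choices, and all names are ours.

```python
def holt_forecasts(y, alpha=0.4, beta=0.1):
    # One-step-ahead forecasts from Holt's linear (double exponential)
    # smoothing; out[t] is the forecast of y[t + 1].
    level, trend = y[0], y[1] - y[0]
    out = []
    for t in range(1, len(y)):
        out.append(level + trend)
        prev = level
        level = alpha * y[t] + (1 - alpha) * (level + trend)
        trend = beta * (level - prev) + (1 - beta) * trend
    return out

def flag_anomalies(y, k=3.0, warmup=14):
    # Flag day t when its forecast residual exceeds the mean plus k
    # standard deviations of all earlier residuals (one-sided).
    f = holt_forecasts(y)
    resid = [y[t + 1] - f[t] for t in range(len(f))]
    flags = []
    for t in range(warmup, len(resid)):
        hist = resid[:t]
        mu = sum(hist) / len(hist)
        sd = (sum((r - mu) ** 2 for r in hist) / (len(hist) - 1)) ** 0.5
        flags.append(resid[t] > mu + k * max(sd, 1e-9))
    return flags
```

The one-sided comparison reflects the outbreak-detection setting: only unexpectedly high counts are of interest, not unexpectedly low ones.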
Sense of rhythm does not differentiate professional hurdlers from non-athletes.
Skowronek, Tomasz; Słomka, Kajetan; Juras, Grzegorz; Szade, Bartlomiej
2013-08-01
The importance of rhythm and specific endurance capabilities were examined in the technical skill and performance of hurdle runners. Additionally, interaction effects among rhythm, anaerobic fitness, and body constitution were analyzed. Seven 18-year-old members of the Polish Junior National Team in 110 m hurdles and 8 age-matched controls who were non-athletes participated. Movement coordination tests (rhythm and differentiation tests) and an anaerobic fitness test were performed. There were no statistically significant differences between the athletes and the control group on the coordination or rhythm test variables. No support was found for the hypothesis that a hurdler's timing ability influences performance.
Engler-Hamm, Daniel; Cheung, Wai S; Yen, Alec; Stark, Paul C; Griffin, Terrence
2011-03-01
The aim of this single-masked, randomized controlled clinical trial is to compare hard and soft tissue changes after ridge preservation performed with (control, RPc) and without (test, RPe) primary soft tissue closure in a split-mouth design. Eleven patients completed this 6-month trial. Extraction and ridge preservation were performed using a composite bone graft of inorganic bovine-derived hydroxyapatite matrix and cell binding peptide P-15 (ABM/P-15), demineralized freeze-dried bone allograft, and a copolymer bioabsorbable membrane. Primary wound closure was achieved on the control sites (RPc), whereas test sites (RPe) left the membrane exposed. Pocket probing depth on adjacent teeth, repositioning of the mucogingival junction, bone width, bone fill, and postoperative discomfort were assessed. Bone cores were obtained for histological examination. Intragroup analyses for both groups demonstrated statistically significant mean reductions in probing depth (RPc: 0.42 mm, P = 0.012; RPe: 0.25 mm, P = 0.012) and bone width (RPc: 3 mm, P = 0.002; RPe: 3.42 mm, P <0.001). However, intergroup analysis did not find these parameters to be statistically different at 6 months. The test group showed statistically significant mean change in bone fill (7.21 mm; P <0.001). Compared to the control group, the test group showed statistically significant lower mean postoperative discomfort (RPc 4 versus RPe 2; P = 0.002). Histomorphometric analysis showed presence of 0% to 40% of ABM/P-15 and 5% to 20% of new bone formation in both groups. Comparison of clinical variables between the two groups at 6 months revealed that the mucogingival junction was statistically significantly more coronally displaced in the control group than in the test group, with a mean of 3.83 mm versus 1.21 mm (P = 0.002). Ridge preservation without flap advancement preserves more keratinized tissue and has less postoperative discomfort and swelling. Although ridge preservation is performed with either method, ≈27% to 30% of bone width is lost.
Chan, Kwun Chuen Gary; Qin, Jing
2015-10-01
Existing linear rank statistics cannot be applied to cross-sectional survival data without follow-up, since all subjects are essentially censored. However, partial survival information is available from backward recurrence times, which are frequently collected in health surveys without prospective follow-up. Under length-biased sampling, a class of linear rank statistics is proposed based only on backward recurrence times, without any prospective follow-up. When follow-up data are available, the proposed rank statistic and a conventional rank statistic that utilizes follow-up information from the same sample are shown to be asymptotically independent. We discuss four ways to combine these two statistics when follow-up is present. Simulations show that all combined statistics have substantially improved power compared with conventional rank statistics, and a Mantel-Haenszel test performed the best among the proposed statistics. The method is applied to a cross-sectional health survey without follow-up and a study of Alzheimer's disease with prospective follow-up.
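The abstract does not spell out the four combination methods; as a generic sketch of the key step, two asymptotically independent standardized statistics can be pooled with a Stouffer-type weighted sum, which is again standard normal under the null. Function names and defaults are ours, not the paper's.

```python
import math

def combine_independent_z(z1, z2, w1=1.0, w2=1.0):
    # Stouffer-type weighted combination of two asymptotically
    # independent standardized (z-scale) test statistics; the result
    # is again standard normal under the null hypothesis.
    return (w1 * z1 + w2 * z2) / math.sqrt(w1 * w1 + w2 * w2)

def two_sided_p(z):
    # Two-sided normal p-value, via the complementary error function.
    return math.erfc(abs(z) / math.sqrt(2.0))
```

Combining two borderline statistics this way yields a more extreme pooled statistic, which is the source of the power gain the simulations report.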
DSN telemetry system performance using a maximum likelihood convolutional decoder
NASA Technical Reports Server (NTRS)
Benjauthrit, B.; Kemp, R. P.
1977-01-01
Results are described of telemetry system performance testing using DSN equipment and a Maximum Likelihood Convolutional Decoder (MCD) for code rates 1/2 and 1/3, constraint length 7, and special test software. The test results confirm the superiority of the rate-1/3 code over the rate-1/2 code. The overall system performance losses determined at the output of the Symbol Synchronizer Assembly are less than 0.5 dB for both code rates. Comparison of the performance is also made with existing mathematical models. Error statistics of the decoded data are examined. The MCD operational threshold is found to be about 1.96 dB.
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
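The NDMAS internals are not described in the abstract; in its simplest Shewhart form, the control-charting technique flags readings that fall outside the mean ± k standard deviations of an in-control baseline period. A minimal sketch, with names and the 3-sigma default as illustrative assumptions:

```python
def control_limits(baseline, k=3.0):
    # Lower/upper control limits: mean +/- k sample standard
    # deviations of an in-control baseline period.
    n = len(baseline)
    mu = sum(baseline) / n
    sd = (sum((x - mu) ** 2 for x in baseline) / (n - 1)) ** 0.5
    return mu - k * sd, mu + k * sd

def out_of_control(readings, baseline, k=3.0):
    # Indices of readings outside the control limits, e.g. candidate
    # thermocouple failures or changed experimental conditions.
    lo, hi = control_limits(baseline, k)
    return [i for i, x in enumerate(readings) if x < lo or x > hi]
```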
Linking the Smarter Balanced Assessments to NWEA MAP Assessments
ERIC Educational Resources Information Center
Northwest Evaluation Association, 2015
2015-01-01
Concordance tables have been used for decades to relate scores on different tests measuring similar but distinct constructs. These tables, typically derived from statistical linking procedures, provide a direct link between scores on different tests and serve various purposes. Aside from describing how a score on one test relates to performance on…
Performance of DIMTEST-and NOHARM-Based Statistics for Testing Unidimensionality
ERIC Educational Resources Information Center
Finch, Holmes; Habing, Brian
2007-01-01
This Monte Carlo study compares the ability of the parametric bootstrap version of DIMTEST with three goodness-of-fit tests calculated from a fitted NOHARM model to detect violations of the assumption of unidimensionality in testing data. The effectiveness of the procedures was evaluated for different numbers of items, numbers of examinees,…
40 CFR 610.10 - Program purpose.
Code of Federal Regulations, 2013 CFR
2013-07-01
... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...
Pharmacy students' test-taking motivation-effort on a low-stakes standardized test.
Waskiewicz, Rhonda A
2011-04-11
To measure third-year pharmacy students' level of motivation while completing the Pharmacy Curriculum Outcomes Assessment (PCOA) administered as a low-stakes test, to better understand use of the PCOA as a measure of student content knowledge. Student motivation was manipulated through an incentive (i.e., a personal letter from the dean) and a process of statistical motivation filtering. Data were analyzed to determine any differences between the experimental and control groups in PCOA test performance, motivation to perform well, and test performance after filtering for low motivation-effort. Incentivizing students diminished the need for filtering PCOA scores for low effort. Where filtering was used, performance scores improved, providing a more realistic measure of aggregate student performance. To ensure that PCOA scores are an accurate reflection of student knowledge, incentivizing and/or filtering for low motivation-effort among pharmacy students should be considered fundamental best practice when the PCOA is administered as a low-stakes test.
Validation of a Dumbbell Body Sway Test in Olympic Air Pistol Shooting
Mon, Daniel; Zakynthinaki, Maria S.; Cordente, Carlos A.; Monroy Antón, Antonio; López Jiménez, David
2014-01-01
We present and validate a test able to provide reliable body sway measurements in air pistol shooting without the use of a gun. Forty-six senior male pistol shooters who had competed in Spanish air pistol championships took part in the study. Body sway data from two static bipodal balance tests were compared: during the first test, shooting was simulated by use of a dumbbell, while during the second test the shooter's own pistol was used. Both tests were performed the day before the competition, during the official training time and at the training stands, to simulate competition conditions. The participants' performance was determined as the total score of 60 shots at competition. Apart from the commonly used variables that refer to movements of the shooter's centre of pressure (COP), such as COP displacements on the X and Y axes, maximum and average COP velocities, and total COP area, the present analysis also included variables that provide information regarding the axes of the COP ellipse (length and angle with respect to the X axis). A strong statistically significant correlation between the two tests was found (with an intraclass correlation varying between 0.59 and 0.92). A statistically significant inverse linear correlation was also found between performance and COP movements. The study concludes that dumbbell tests are perfectly valid for measuring body sway by simulating pistol shooting. PMID:24756067
Praskova, E; Voslarova, E; Siroka, Z; Plhalova, L; Macova, S; Marsalek, P; Pistekova, V; Svobodova, Z
2011-01-01
The aim of the study was to compare the acute toxicity of diclofenac to juvenile and embryonic stages of the zebrafish (Danio rerio). Acute toxicity tests were performed on the aquarium fish Danio rerio, which is one of the model organisms most commonly used in toxicity testing. The tests were performed using a semi-static method according to OECD guideline No. 203 (Fish, acute toxicity test). Embryo toxicity tests were performed in zebrafish embryos (Danio rerio) in compliance with OECD No. 212 methodology (Fish, short-term toxicity test on embryo and sac-fry stages). The results were subjected to a probit analysis using the EKO-TOX 5.2 programme to determine 96hLC50 and 144hLC50 (median lethal concentration, 50% mortality after a 96 h or 144 h interval, respectively) values of diclofenac. The statistical significance of the difference between LC50 values in juvenile and embryonic stages of Danio rerio was tested using the Mann-Whitney non-parametric test implemented in the Unistat 5.1 programme. The LC50 mean value of diclofenac was 166.6 +/- 9.8 mg/L in juvenile Danio rerio, and 6.11 +/- 2.48 mg/L in embryonic stages of Danio rerio. The study demonstrated a statistically higher sensitivity to diclofenac (P < 0.05) in embryonic stages compared to the juvenile fish.
A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.
Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S
2016-01-01
Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings is validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first-stage minP approach that involves individual SNP-level testing followed by post hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis; GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary-level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.
Reproducibility-optimized test statistic for ranking genes in microarray studies.
Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero
2008-01-01
A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on an Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.
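The reproducibility idea can be sketched minimally as follows: score a candidate ranking statistic by how well its top-k gene lists agree across random half-splits of the samples, then prefer the statistic with the highest agreement. This illustrates the principle only, not the authors' actual optimization procedure; all function names are ours.

```python
import random

def welch_t(a, b):
    # Welch two-sample t-statistic (epsilon guards zero variance).
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / ((va / na + vb / nb) ** 0.5 + 1e-12)

def rank_genes(stat, case, ctrl):
    # Gene indices sorted by decreasing absolute statistic.
    scores = [abs(stat(case[g], ctrl[g])) for g in range(len(case))]
    return sorted(range(len(case)), key=lambda g: -scores[g])

def split_half_overlap(stat, case, ctrl, k, splits=20, seed=0):
    # Average top-k overlap between gene rankings computed on two
    # random half-splits of the samples: a crude reproducibility score
    # for the candidate statistic. case/ctrl are genes-x-samples lists.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(splits):
        ca = list(range(len(case[0])))
        co = list(range(len(ctrl[0])))
        rng.shuffle(ca)
        rng.shuffle(co)
        h, j = len(ca) // 2, len(co) // 2
        r1 = rank_genes(stat,
                        [[row[i] for i in ca[:h]] for row in case],
                        [[row[i] for i in co[:j]] for row in ctrl])
        r2 = rank_genes(stat,
                        [[row[i] for i in ca[h:]] for row in case],
                        [[row[i] for i in co[j:]] for row in ctrl])
        total += len(set(r1[:k]) & set(r2[:k])) / k
    return total / splits
```

A data-driven selection would compute this score for several candidate statistics and keep the one whose rankings reproduce best.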
Testing the Predictive Power of Coulomb Stress on Aftershock Sequences
NASA Astrophysics Data System (ADS)
Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.
2009-12-01
Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.
ERIC Educational Resources Information Center
Goldhaber, Dan; Gratz, Trevor; Theobald, Roddy
2016-01-01
We investigate the predictive validity of teacher credential test scores for student performance in secondary STEM classrooms in Washington state. After replicating earlier findings that teacher basic skills licensure test scores are a modest and statistically significant predictor of student math test score gains in elementary grades, we focus on…
ERIC Educational Resources Information Center
Ladyshewsky, Richard K.
2015-01-01
This research explores differences in multiple choice test (MCT) scores in a cohort of post-graduate students enrolled in a management and leadership course. A total of 250 students completed the MCT in either a supervised in-class paper and pencil test or an unsupervised online test. The only statistically significant difference between the nine…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Berman, D.W.; Allen, B.C.; Van Landingham, C.B.
1998-12-31
The decision rules commonly employed to determine the need for cleanup are evaluated both to identify conditions under which they lead to erroneous conclusions and to quantify the rate at which such errors occur. Their performance is also compared with that of other applicable decision rules. The authors based the evaluation of decision rules on simulations. Results are presented as power curves. These curves demonstrate that the degree of statistical control achieved is independent of the form of the null hypothesis. The loss of statistical control that occurs when a decision rule is applied to a data set that does not satisfy the rule's validity criteria is also clearly demonstrated. Some of the rules evaluated do not offer the formal statistical control that is an inherent design feature of other rules. Nevertheless, results indicate that such informal decision rules may provide superior overall control of error rates when their application is restricted to data exhibiting particular characteristics. The results reported here are limited to decision rules applied to uncensored and lognormally distributed data. To optimize decision rules, it is necessary to evaluate their behavior when applied to data exhibiting a range of characteristics that bracket those common to field data. The performance of decision rules applied to data sets exhibiting a broader range of characteristics is reported in the second paper of this study.
Spectral gene set enrichment (SGSE).
Frost, H Robert; Li, Zhigang; Moore, Jason H
2015-03-03
Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables with performance strongly dependent on the clustering algorithm and number of clusters. We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes the statistical association between gene sets and principal components (PCs) using our principal component gene set enrichment (PCGSE) method. The overall statistical association between each gene set and the spectral structure of the data is then computed by combining the PC-level p-values using the weighted Z-method with weights set to the PC variance scaled by Tracy-Widom test p-values. Using simulated data, we show that the SGSE algorithm can accurately recover spectral features from noisy data. To illustrate the utility of our method on real data, we demonstrate the superior performance of the SGSE method relative to standard cluster-based techniques for testing the association between MSigDB gene sets and the variance structure of microarray gene expression data. Unsupervised gene set testing can provide important information about the biological signal held in high-dimensional genomic data sets. Because it uses the association between gene sets and sample PCs to generate a measure of unsupervised enrichment, the SGSE method is independent of cluster or network creation algorithms and, most importantly, is able to utilize the statistical significance of PC eigenvalues to ignore elements of the data most likely to represent noise.
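The weighted Z-method step described above can be sketched as follows, with the per-PC weights supplied by the caller (in SGSE these would be PC variances scaled by Tracy-Widom p-values, which are omitted here). This is a hypothetical helper to show the combination rule, not the PCGSE/SGSE API.

```python
import math
from statistics import NormalDist

def weighted_z_combine(pvalues, weights):
    # Stouffer's weighted Z-method: combine one-sided p-values into a
    # single one-sided p-value using the given nonnegative weights.
    nd = NormalDist()
    zs = [nd.inv_cdf(1.0 - p) for p in pvalues]
    num = sum(w * z for w, z in zip(weights, zs))
    z = num / math.sqrt(sum(w * w for w in weights))
    return 1.0 - nd.cdf(z)
```

Downweighting PCs whose eigenvalues are consistent with noise (small weights) makes the combined p-value insensitive to those components, which is the mechanism the abstract describes.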
2011-01-01
Background Safety assessment of genetically modified organisms is currently often performed by comparative evaluation. However, natural variation of plant characteristics between commercial varieties is usually not considered explicitly in the statistical computations underlying the assessment. Results Statistical methods are described for the assessment of the difference between a genetically modified (GM) plant variety and a conventional non-GM counterpart, and for the assessment of the equivalence between the GM variety and a group of reference plant varieties which have a history of safe use. It is proposed to present the results of both difference and equivalence testing for all relevant plant characteristics simultaneously in one or a few graphs, as an aid for further interpretation in safety assessment. A procedure is suggested to derive equivalence limits from the observed results for the reference plant varieties using a specific implementation of the linear mixed model. Three different equivalence tests are defined to classify any result in one of four equivalence classes. The performance of the proposed methods is investigated by a simulation study, and the methods are illustrated on compositional data from a field study on maize grain. Conclusions A clear distinction of practical relevance is shown between difference and equivalence testing. The proposed tests are shown to have appropriate performance characteristics by simulation, and the proposed simultaneous graphical representation of results was found to be helpful for the interpretation of results from a practical field trial data set. PMID:21324199
Antunes, Amanda H; Alberton, Cristine L; Finatto, Paula; Pinto, Stephanie S; Cadore, Eduardo L; Zaffari, Paula; Kruel, Luiz F M
2015-01-01
Maximal tests conducted on land are not suitable for the prescription of aquatic exercises, which makes it difficult to optimize the intensity of water aerobics classes. The aim of the present study was to evaluate the maximal and anaerobic threshold cardiorespiratory responses to 6 water aerobics exercises. Volunteers performed 3 of the exercises in the sagittal plane and 3 in the frontal plane. Twelve active female volunteers (aged 24 ± 2 years) performed 6 maximal progressive test sessions. Throughout the exercise tests, we measured heart rate (HR) and oxygen consumption (VO2). We randomized all sessions with a minimum interval of 48 hr between each session. For statistical analysis, we used repeated-measures 1-way analysis of variance. Regarding the maximal responses, for the peak VO2, abductor hop and jumping jacks (JJ) showed significantly lower values than frontal kick and cross-country skiing (CCS; p < .001; partial η² = .509), while for the peak HR, JJ showed statistically significantly lower responses compared with stationary running and CCS (p < .001; partial η² = .401). At anaerobic threshold intensity expressed as the percentage of the maximum values, no statistically significant differences were found among exercises. Cardiorespiratory responses are directly associated with the muscle mass involved in the exercise. Thus, it is worth emphasizing the importance of performing a maximal test that is specific to the analyzed exercise so the prescription of the intensity can be safer and valid.
NASA Technical Reports Server (NTRS)
Edwards, B. F.; Waligora, J. M.; Horrigan, D. J., Jr.
1985-01-01
This analysis was done to determine whether various decompression response groups could be characterized by the pooled nitrogen (N2) washout profiles of the group members; pooling individual washout profiles provided a smooth, time-dependent function of means representative of the decompression response group. No statistically significant differences were detected. The statistical comparisons of the profiles were performed by means of univariate weighted t-tests at each 5-minute profile point, with significance levels of 5 and 10 percent. The estimated powers of the tests (i.e., probabilities) to detect the observed differences in the pooled profiles were on the order of 8 to 30 percent.
Reproducible detection of disease-associated markers from gene expression data.
Omae, Katsuhiro; Komori, Osamu; Eguchi, Shinto
2016-08-18
Detection of disease-associated markers plays a crucial role in gene screening for biological studies. Two-sample test statistics, such as the t-statistic, are widely used to rank genes based on gene expression data. However, the resulting gene ranking is often not reproducible among different data sets. Such irreproducibility may be caused by disease heterogeneity. When we divided data into two subsets, we found that the signs of the two t-statistics were often reversed. Focusing on this instability, we propose a sign-sum statistic that counts the signs of the t-statistics for all possible subsets. The proposed method excludes genes affected by heterogeneity, thereby improving the reproducibility of gene ranking. We compare the sign-sum statistic with the t-statistic by a theoretical evaluation of the upper confidence limit. Through simulations and applications to real data sets, we show that the sign-sum statistic exhibits superior performance and gives a more reproducible ranking than the t-statistic: on simulated data sets it excludes hetero-type genes well, and on real data sets it also performs well from the viewpoint of ranking reproducibility.
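A sketch of the idea, with random subsets as a Monte Carlo stand-in for the full enumeration of "all possible subsets": a gene whose t-statistic keeps a stable sign across subsets scores near ±nsub, while a heterogeneity-affected gene, whose sign flips from subset to subset, scores near zero. Names and defaults are ours, not the paper's.

```python
import random

def welch_t(a, b):
    # Welch two-sample t-statistic (epsilon guards zero variance).
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    return (ma - mb) / ((va / na + vb / nb) ** 0.5 + 1e-12)

def sign_sum(case, ctrl, nsub=200, frac=0.5, seed=0):
    # Sum of signs of the t-statistic over random subsets of each
    # group: a Monte Carlo proxy for enumerating all subsets.
    rng = random.Random(seed)
    k1 = max(3, int(len(case) * frac))
    k2 = max(3, int(len(ctrl) * frac))
    total = 0
    for _ in range(nsub):
        t = welch_t(rng.sample(case, k1), rng.sample(ctrl, k2))
        total += (t > 0) - (t < 0)
    return total
```

Ranking genes by |sign_sum| therefore favors consistently differential genes over those whose apparent effect comes from a patient subgroup.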
A flexibly shaped space-time scan statistic for disease outbreak detection and monitoring.
Takahashi, Kunihiko; Kulldorff, Martin; Tango, Toshiro; Yih, Katherine
2008-04-11
Early detection of disease outbreaks enables public health officials to implement disease control and prevention measures at the earliest possible time. A time periodic geographical disease surveillance system based on a cylindrical space-time scan statistic has been used extensively for disease surveillance along with the SaTScan software. In the purely spatial setting, many different methods have been proposed to detect spatial disease clusters. In particular, some spatial scan statistics are aimed at detecting irregularly shaped clusters which may not be detected by the circular spatial scan statistic. Based on the flexible purely spatial scan statistic, we propose a flexibly shaped space-time scan statistic for early detection of disease outbreaks. The performance of the proposed space-time scan statistic is compared with that of the cylindrical scan statistic using benchmark data. In order to compare their performances, we have developed a space-time power distribution by extending the purely spatial bivariate power distribution. Daily syndromic surveillance data in Massachusetts, USA, are used to illustrate the proposed test statistic. The flexible space-time scan statistic is well suited for detecting and monitoring disease outbreaks in irregularly shaped areas.
Efficient Blockwise Permutation Tests Preserving Exchangeability
Zhou, Chunxiao; Zwilling, Chris E.; Calhoun, Vince D.; Wang, Michelle Y.
2014-01-01
In this paper, we present a new blockwise permutation test approach based on the moments of the test statistic; the method is of importance to neuroimaging studies. In order to preserve the exchangeability condition required in permutation tests, we divide the entire set of data into exchangeability blocks. In addition, computationally efficient moments-based permutation tests are performed by approximating the permutation distribution of the test statistic with the Pearson distribution series. This involves the calculation of the first four moments of the permutation distribution within each block and then over the entire set of data. The accuracy and efficiency of the proposed method are demonstrated through a simulated experiment on magnetic resonance imaging (MRI) brain data, specifically a multi-site voxel-based morphometry analysis from structural MRI (sMRI). PMID:25289113
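A hedged sketch of the two ingredients described above: permuting only within exchangeability blocks, and summarizing the permutation distribution by its moments. The paper derives the first four moments analytically and fits a Pearson-family distribution to them; here, for illustration only, the moments are estimated empirically from a modest number of within-block permutations.

```python
import random
import statistics

def within_block_permute(values, blocks, rng):
    """Permute values only within each exchangeability block."""
    out = list(values)
    for b in set(blocks):
        idx = [i for i, lab in enumerate(blocks) if lab == b]
        shuffled = [out[i] for i in idx]
        rng.shuffle(shuffled)
        for i, v in zip(idx, shuffled):
            out[i] = v
    return out

def perm_moments(values, blocks, group, stat, n_perm=500, seed=1):
    """Empirical mean, sd, and skewness of the permutation distribution.

    The moments-based approach fits a Pearson-family distribution to such
    moments instead of enumerating the full permutation distribution,
    which is what makes it computationally efficient.
    """
    rng = random.Random(seed)
    draws = [stat(within_block_permute(values, blocks, rng), group)
             for _ in range(n_perm)]
    mu = statistics.fmean(draws)
    sd = statistics.pstdev(draws)
    skew = statistics.fmean([((d - mu) / sd) ** 3 for d in draws]) if sd else 0.0
    return mu, sd, skew
```

Restricting the shuffles to blocks is what preserves exchangeability when, for example, multi-site sMRI data are only exchangeable within a site.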
A nonparametric smoothing method for assessing GEE models with longitudinal binary data.
Lin, Kuo-Chin; Chen, Yi-Ju; Shyr, Yu
2008-09-30
Studies involving longitudinal binary responses are widely applied in health and biomedical research and frequently analyzed by the generalized estimating equations (GEE) method. This article proposes an alternative goodness-of-fit test based on the nonparametric smoothing approach for assessing the adequacy of GEE fitted models, which can be regarded as an extension of the goodness-of-fit test of le Cessie and van Houwelingen (Biometrics 1991; 47:1267-1282). The expectation and approximate variance of the proposed test statistic are derived. The asymptotic distribution of the proposed test statistic in terms of a scaled chi-squared distribution and the power performance of the proposed test are discussed by simulation studies. The testing procedure is demonstrated by two real data sets. Copyright (c) 2008 John Wiley & Sons, Ltd.
Retrieving Essential Material at the End of Lectures Improves Performance on Statistics Exams
ERIC Educational Resources Information Center
Lyle, Keith B.; Crawford, Nicole A.
2011-01-01
At the end of each lecture in a statistics for psychology course, students answered a small set of questions that required them to retrieve information from the same day's lecture. These exercises constituted retrieval practice for lecture material subsequently tested on four exams throughout the course. This technique is called the PUREMEM…
ERIC Educational Resources Information Center
Nevitt, Jonathan; Hancock, Gregory R.
2001-01-01
Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
Critical Thinking Skills of U.S. Air Force Senior and Intermediate Developmental Education Students
2016-02-16
SAASS), and Air War College (AWC). T-tests indicated no statistically significant difference in the CT skills of the sample of ACSC and AWC students...hypothesis that there was no statistically significant difference in the CT skills of IDE and SDE students. SAASS, as a more selective advanced studies...potential to develop CT skills, concluding, “students in the experimental group performed at a statistically significantly higher level than students in
Cognitive predictors of balance in Parkinson's disease.
Fernandes, Ângela; Mendes, Andreia; Rocha, Nuno; Tavares, João Manuel R S
2016-06-01
Postural instability is one of the most incapacitating symptoms of Parkinson's disease (PD) and appears to be related to cognitive deficits. This study aims to determine the cognitive factors that can predict deficits in static and dynamic balance in individuals with PD. Fifty-two individuals with PD were characterized using a sociodemographic questionnaire. The Trail Making Test, Rule Shift Cards Test, and Digit Span Test assessed the executive functions. The static balance was assessed using a plantar pressure platform, and dynamic balance was based on the Timed Up and Go Test. The results were statistically analysed using SPSS Statistics software through linear regression analysis. The results show that a statistically significant model based on cognitive outcomes was able to explain the variance of motor variables. Also, the explanatory value of the model tended to increase with the addition of individual and clinical variables, although the resulting model was not statistically significant. The model explained 25-29% of the variability of the Timed Up and Go Test, while for the anteroposterior displacement it was 23-34%, and for the mediolateral displacement it was 24-39%. From the findings, we conclude that cognitive performance, especially the executive functions, is a predictor of balance deficit in individuals with PD.
Accuracy Evaluation of the Unified P-Value from Combining Correlated P-Values
Alves, Gelio; Yu, Yi-Kuo
2014-01-01
Meta-analysis methods that combine p-values into a single unified p-value are frequently employed to improve confidence in hypothesis testing. An assumption made by most meta-analysis methods is that the p-values to be combined are independent, which may not always be true. To investigate the accuracy of the unified p-value from combining correlated p-values, we have evaluated a family of statistical methods that combine: independent, weighted independent, correlated, and weighted correlated p-values. Statistical accuracy evaluation by combining simulated correlated p-values showed that correlation among p-values can have a significant effect on the accuracy of the combined p-value obtained. Among the statistical methods evaluated, those that weight p-values compute more accurate combined p-values than those that do not. Also, statistical methods that utilize the correlation information have the best performance, producing significantly more accurate combined p-values. In our study we have demonstrated that statistical methods that combine p-values based on the assumption of independence can produce inaccurate p-values when combining correlated p-values, even when the p-values are only weakly correlated. Therefore, to prevent drawing false conclusions during hypothesis testing, our study advises caution when interpreting the p-value obtained from combining p-values of unknown correlation. However, when the correlation information is available, the weighting-capable statistical method, first introduced by Brown and recently modified by Hou, seems to perform the best amongst the methods investigated. PMID:24663491
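The Brown and Hou methods cited above adjust Fisher's chi-square combination for correlation; as a simpler, stdlib-only illustration of the same principle, the sketch below uses the analogous correlation-adjusted Stouffer combination. The exact form is an assumption for illustration, not the abstract's evaluated methods: positive correlation inflates the variance of the combined z-score, so ignoring it understates the combined p-value.

```python
from itertools import combinations
from statistics import NormalDist

_N = NormalDist()

def combine_pvalues_correlated(pvals, rho):
    """Correlation-adjusted Stouffer combination of one-sided p-values.

    rho[i][j] holds the assumed correlation between the z-scores of
    tests i and j. With rho = 0 everywhere this reduces to the classical
    Stouffer method; positive correlation yields a more conservative
    (larger) combined p-value.
    """
    z = [_N.inv_cdf(1.0 - p) for p in pvals]
    k = len(z)
    var = k + 2 * sum(rho[i][j] for i, j in combinations(range(k), 2))
    z_comb = sum(z) / var ** 0.5
    return 1.0 - _N.cdf(z_comb)
```

For four p-values of 0.01 with pairwise correlation 0.5, the combined p-value is roughly a thousand times larger than under the independence assumption, which is the inaccuracy the abstract warns about.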
Equivalence of the Color Trails Test and Trail Making Test in nonnative English-speakers.
Dugbartey, A T; Townes, B D; Mahurin, R K
2000-07-01
The Color Trails Test (CTT) has been described as a culture-fair test of visual attention, graphomotor sequencing, and effortful executive processing abilities relative to the Trail Making Test (TMT). In this study, the equivalence of the TMT and the CTT among a group of 64 bilingual Turkish university students was examined. No difference in performance on the CTT-1 and TMT Part A was found, suggesting functionally equivalent performance across both tasks. In contrast, the statistically significant differences in performance on CTT-2 and TMT Part B, as well as the interference indices for both tests, were interpreted as providing evidence for task nonequivalence of the CTT-2 and TMT Part B. Results have implications for both psychometric test development and clinical cultural neuropsychology.
Long, Brandon R.; Rinaldo, Steven G.; Gallagher, Kevin G.; ...
2016-11-09
Coin-cells are often the test format of choice for laboratories engaged in battery research and development, as they provide a convenient platform for rapid testing of new materials on a small scale. However, obtaining reliable, reproducible data in the coin-cell format is inherently difficult, particularly in the full-cell configuration. In addition, statistical evaluation to prove the consistency and reliability of such data is often neglected. Herein we report on several studies aimed at formalizing physical process parameters and coin-cell construction related to full cells. Statistical analysis and performance benchmarking approaches are advocated as a means to more confidently track changes in cell performance. Finally, we show that trends in the electrochemical data obtained from coin-cells can be reliable and informative when standardized approaches are implemented in a consistent manner.
The Influence of Cognitive Reserve on Recovery from Traumatic Brain Injury.
Donders, Jacobus; Stout, Jacob
2018-04-12
We sought to determine the degree to which cognitive reserve, as assessed by the Test of Premorbid Functioning in combination with demographic variables, could act as a buffer against the effect of traumatic brain injury (TBI) on cognitive test performance. We performed a retrospective analysis of a cohort of 121 persons with TBI who completed the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV) within 1-12 months after injury. Regression analyses indicated that cognitive reserve was a statistically significant predictor of all postinjury WAIS-IV factor index scores, after controlling for various premorbid and comorbid confounding variables. Only for Processing Speed did injury severity make an additional statistically significant contribution to the prediction model. Cognitive reserve has a protective effect with regard to the impact of TBI on cognitive test performance, but this effect is imperfect and does not completely negate the effect of injury severity.
Vadapalli, Sriharsha Babu; Atluri, Kaleswararao; Putcha, Madhu Sudhan; Kondreddi, Sirisha; Kumar, N. Suman; Tadi, Durga Prasad
2016-01-01
Objectives: This in vitro study was designed to compare polyvinyl-siloxane (PVS) monophase and polyether (PE) monophase materials under dry and moist conditions for properties such as surface detail reproduction, dimensional stability, and gypsum compatibility. Materials and Methods: Surface detail reproduction was evaluated using two criteria. Dimensional stability was evaluated according to American Dental Association (ADA) specification no. 19. Gypsum compatibility was assessed by two criteria. All the samples were evaluated, and the data obtained were analyzed by a two-way analysis of variance (ANOVA) and Pearson's Chi-square tests. Results: When surface detail reproduction was evaluated with a modification of ADA specification no. 19, the two groups under the two conditions showed no statistically significant difference. When evaluated macroscopically, the two groups showed a statistically significant difference. Results for dimensional stability showed that the deviation from standard differed significantly between the two groups, with the Aquasil group showing significantly more deviation than the Impregum group (P < 0.001). The two conditions also differed significantly, with moist conditions showing significantly more deviation than dry conditions (P < 0.001). The results of gypsum compatibility, evaluated with a modification of ADA specification no. 19 and by grading the casts for both groups under the two conditions, showed no statistically significant difference. Conclusion: Regarding dimensional stability, both Impregum and Aquasil performed better in dry conditions than in moist; Impregum performed better than Aquasil in both conditions. When tested for surface detail reproduction according to the ADA specification, under dry and moist conditions both performed almost equally. When tested by macroscopic evaluation, Impregum and Aquasil performed significantly better in dry conditions than in moist conditions.
In dry conditions, both materials performed almost equally. In moist conditions, Aquasil performed significantly better than Impregum. Regarding gypsum compatibility according to the ADA specification, in dry conditions both materials performed almost equally, and in moist conditions Aquasil performed better than Impregum. When tested by macroscopic evaluation, Impregum performed better than Aquasil in both conditions. PMID:27583217
Embedded performance validity testing in neuropsychological assessment: Potential clinical tools.
Rickards, Tyler A; Cranston, Christopher C; Touradji, Pegah; Bechtold, Kathleen T
2018-01-01
The article aims to suggest clinically useful tools for efficient use of embedded measures of performance validity in neuropsychological assessment. To accomplish this, we integrated available validity-related and statistical research from the literature, consensus statements, and survey-based data from practicing neuropsychologists. We provide recommendations for 1) cutoffs for embedded performance validity tests, including Reliable Digit Span, California Verbal Learning Test (Second Edition) Forced Choice Recognition, Rey-Osterrieth Complex Figure Test Combination Score, Wisconsin Card Sorting Test Failure to Maintain Set, and the Finger Tapping Test; 2) selecting the number of performance validity measures to administer in an assessment; and 3) hypothetical clinical decision-making models for use of performance validity testing in a neuropsychological assessment, collectively considering behavior, patient reporting, and data indicating invalid or noncredible performance. Performance validity testing helps inform the clinician about an individual's general approach to tasks: response to failure, task engagement and persistence, and compliance with task demands. These data-driven suggestions provide a resource for clinicians, are intended to instigate conversation within the field toward more uniform, testable decisions, and may guide future research in this area.
Katki, Hormuzd A; Schiffman, Mark
2018-05-01
Our work involves assessing whether new biomarkers might be useful for cervical-cancer screening across populations with different disease prevalences and biomarker distributions. When comparing across populations, we show that standard diagnostic accuracy statistics (predictive values, risk-differences, Youden's index and Area Under the Curve (AUC)) can easily be misinterpreted. We introduce an intuitively simple statistic for a 2 × 2 table, Mean Risk Stratification (MRS): the average change in risk (pre-test vs. post-test) revealed for tested individuals. High MRS implies better risk separation achieved by testing. MRS has 3 key advantages for comparing test performance across populations with different disease prevalences and biomarker distributions. First, MRS demonstrates that conventional predictive values and the risk-difference do not measure risk-stratification because they do not account for test-positivity rates. Second, Youden's index and AUC measure only multiplicative relative gains in risk-stratification: AUC = 0.6 achieves only 20% of maximum risk-stratification (AUC = 0.9 achieves 80%). Third, large relative gains in risk-stratification might not imply large absolute gains if disease is rare, demonstrating a "high-bar" to justify population-based screening for rare diseases such as cancer. We illustrate MRS by our experience comparing the performance of cervical-cancer screening tests in China vs. the USA. The test with the worst AUC = 0.72 in China (visual inspection with acetic acid) provides twice the risk-stratification (i.e. MRS) of the test with best AUC = 0.83 in the USA (human papillomavirus and Pap cotesting) because China has three times more cervical precancer/cancer. MRS could be routinely calculated to better understand the clinical/public-health implications of standard diagnostic accuracy statistics. Published by Elsevier Inc.
SWATH Mass Spectrometry Performance Using Extended Peptide MS/MS Assay Libraries.
Wu, Jemma X; Song, Xiaomin; Pascovici, Dana; Zaw, Thiri; Care, Natasha; Krisp, Christoph; Molloy, Mark P
2016-07-01
The use of data-independent acquisition methods such as SWATH for mass spectrometry based proteomics is usually performed with peptide MS/MS assay libraries which enable identification and quantitation of peptide peak areas. Reference assay libraries can be generated locally through information dependent acquisition, or obtained from community data repositories for commonly studied organisms. However, there have been no studies performed to systematically evaluate how locally generated or repository-based assay libraries affect SWATH performance for proteomic studies. To undertake this analysis, we developed a software workflow, SwathXtend, which generates extended peptide assay libraries by integration with a local seed library and delivers statistical analysis of SWATH-quantitative comparisons. We designed test samples using peptides from a yeast extract spiked into peptides from human K562 cell lysates at three different ratios to simulate protein abundance change comparisons. SWATH-MS performance was assessed using local and external assay libraries of varying complexities and proteome compositions. These experiments demonstrated that local seed libraries integrated with external assay libraries achieve better performance than local assay libraries alone, in terms of the number of identified peptides and proteins and the specificity to detect differentially abundant proteins. Our findings show that the performance of extended assay libraries is influenced by the MS/MS feature similarity of the seed and external libraries, while statistical analysis using multiple testing corrections increases the statistical rigor needed when searching against large extended assay libraries. © 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
Küçük, Fadime; Kara, Bilge; Poyraz, Esra Çoşkuner; İdiman, Egemen
2016-01-01
[Purpose] The aim of this study was to determine the effects of clinical Pilates in multiple sclerosis patients. [Subjects and Methods] Twenty multiple sclerosis patients were enrolled in this study. The participants were divided into two groups as the clinical Pilates and control groups. Cognition (Multiple Sclerosis Functional Composite), balance (Berg Balance Scale), physical performance (timed performance tests, Timed up and go test), tiredness (Modified Fatigue Impact scale), depression (Beck Depression Inventory), and quality of life (Multiple Sclerosis International Quality of Life Questionnaire) were measured before and after treatment in all participants. [Results] There were statistically significant differences in balance, timed performance, tiredness and Multiple Sclerosis Functional Composite tests between before and after treatment in the clinical Pilates group. We also found significant differences in timed performance tests, the Timed up and go test and the Multiple Sclerosis Functional Composite between before and after treatment in the control group. According to the difference analyses, there were significant differences in Multiple Sclerosis Functional Composite and Multiple Sclerosis International Quality of Life Questionnaire scores between the two groups in favor of the clinical Pilates group. There were statistically significant clinical differences in favor of the clinical Pilates group in comparison of measurements between the groups. Clinical Pilates improved cognitive functions and quality of life compared with traditional exercise. [Conclusion] In Multiple Sclerosis treatment, clinical Pilates should be used as a holistic approach by physical therapists. PMID:27134355
Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
2016-01-01
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
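The two-step structure of COMBI can be sketched as a screen-then-test procedure. This is a hedged toy version only: a perceptron stands in for the paper's SVM, a simple 1-df mean-difference test stands in for the paper's association tests, and the Bonferroni correction is applied over the screened subset rather than over all SNPs, which is the source of the method's power gain.

```python
import math
import random

def chi2_sf_df1(x):
    """Survival function of a 1-df chi-square: P(X > x)."""
    return math.erfc(math.sqrt(x / 2.0))

def allelic_test_p(cases, ctrls):
    """Two-sided p-value for a difference in mean minor-allele count."""
    n1, n2 = len(cases), len(ctrls)
    m1, m2 = sum(cases) / n1, sum(ctrls) / n2
    ss = sum((v - m1) ** 2 for v in cases) + sum((v - m2) ** 2 for v in ctrls)
    vp = ss / (n1 + n2 - 2)
    if vp == 0.0:
        return 0.0  # complete separation
    z2 = (m1 - m2) ** 2 / (vp * (1 / n1 + 1 / n2))
    return chi2_sf_df1(z2)

def combi_sketch(genotypes, phenotype, k=5, alpha=0.05, epochs=50, seed=0):
    """Screen SNPs with a linear classifier, then test only the survivors.

    genotypes: list of samples, each a list of 0/1/2 minor-allele counts.
    phenotype: list of 0/1 case-control labels.
    Returns (snp_index, p_value) pairs significant at alpha with
    Bonferroni correction over the k screened SNPs only.
    """
    rng = random.Random(seed)
    n, m = len(genotypes), len(genotypes[0])
    # Step 1: linear screening (perceptron as a stand-in for the SVM).
    w, b = [0.0] * m, 0.0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):
            pred = 1 if sum(wj * g for wj, g in zip(w, genotypes[i])) + b > 0 else 0
            err = phenotype[i] - pred
            if err:
                for j in range(m):
                    w[j] += err * genotypes[i][j]
                b += err
    top = sorted(range(m), key=lambda j: -abs(w[j]))[:k]
    # Step 2: hypothesis tests on the screened SNPs only.
    results = []
    for j in top:
        cases = [g[j] for g, y in zip(genotypes, phenotype) if y == 1]
        ctrls = [g[j] for g, y in zip(genotypes, phenotype) if y == 0]
        p = allelic_test_p(cases, ctrls)
        if p < alpha / k:
            results.append((j, p))
    return results
```

Because the classifier exploits correlation structure across SNPs before any testing happens, the second step faces a far smaller multiple-testing burden than raw genome-wide p-value thresholding.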
Statistical and Machine Learning forecasting methods: Concerns and ways forward
Makridakis, Spyros; Assimakopoulos, Vassilios
2018-01-01
Machine Learning (ML) methods have been proposed in the academic literature as alternatives to statistical ones for time series forecasting. Yet, scant evidence is available about their relative performance in terms of accuracy and computational requirements. The purpose of this paper is to evaluate such performance across multiple forecasting horizons using a large subset of 1045 monthly time series used in the M3 Competition. After comparing the post-sample accuracy of popular ML methods with that of eight traditional statistical ones, we found that the former are dominated across both accuracy measures used and for all forecasting horizons examined. Moreover, we observed that their computational requirements are considerably greater than those of statistical methods. The paper discusses the results, explains why the accuracy of ML models is below that of statistical ones and proposes some possible ways forward. The empirical results found in our research stress the need for objective and unbiased ways to test the performance of forecasting methods that can be achieved through sizable and open competitions allowing meaningful comparisons and definite conclusions. PMID:29584784
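The headline accuracy measure of the M3 Competition is the symmetric MAPE; as a hedged illustration of the kind of post-sample comparison described above (the toy series and the naive/drift benchmarks are invented, not the paper's data), sMAPE in one common variant can be computed as:

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent."""
    return 100.0 * sum(2 * abs(f - a) / (abs(a) + abs(f))
                       for a, f in zip(actual, forecast)) / len(actual)

# Two simple statistical benchmarks on a toy monthly series:
history = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
actual = [104, 118]                                  # held-out post-sample values
naive = [history[-1]] * len(actual)                  # random-walk forecast
drift = [history[-1] + (h + 1) * (history[-1] - history[0]) / (len(history) - 1)
         for h in range(len(actual))]
err_naive = smape(actual, naive)
err_drift = smape(actual, drift)
```

Simple benchmarks like these are exactly the kind of traditional statistical methods the paper found hard for ML models to beat on post-sample accuracy, which is why open competitions with held-out horizons matter.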
Paetkau, D; Waits, L P; Clarkson, P L; Craighead, L; Strobeck, C
1997-12-01
A large microsatellite data set from three species of bear (Ursidae) was used to empirically test the performance of six genetic distance measures in resolving relationships at a variety of scales ranging from adjacent areas in a continuous distribution to species that diverged several million years ago. At the finest scale, while some distance measures performed extremely well, statistics developed specifically to accommodate the mutational processes of microsatellites performed relatively poorly, presumably because of the relatively higher variance of these statistics. At the other extreme, no statistic was able to resolve the close sister relationship of polar bears and brown bears from more distantly related pairs of species. This failure is most likely due to constraints on allele distributions at microsatellite loci. At intermediate scales, both within continuous distributions and in comparisons to insular populations of late Pleistocene origin, it was not possible to define the point where linearity was lost for each of the statistics, except that it is clearly lost after relatively short periods of independent evolution. All of the statistics were affected by the amount of genetic diversity within the populations being compared, significantly complicating the interpretation of genetic distance data.
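For illustration, one classical measure of the kind compared in such studies is Nei's (1972) standard genetic distance, computable directly from allele frequencies. The two-population, two-locus data below are hypothetical, not the bear dataset:

```python
import numpy as np

def nei_standard_distance(freq_x, freq_y):
    """Nei's (1972) standard genetic distance from per-locus allele
    frequencies. freq_x, freq_y: lists of 1-D arrays, one per locus,
    each summing to 1 over alleles."""
    jxy = np.mean([np.sum(x * y) for x, y in zip(freq_x, freq_y)])
    jx = np.mean([np.sum(x * x) for x in freq_x])
    jy = np.mean([np.sum(y * y) for y in freq_y])
    return -np.log(jxy / np.sqrt(jx * jy))

# Hypothetical allele frequencies at two loci for two populations.
pop1 = [np.array([0.7, 0.2, 0.1]), np.array([0.5, 0.5])]
pop2 = [np.array([0.4, 0.4, 0.2]), np.array([0.6, 0.4])]

print(f"D(pop1, pop2) = {nei_standard_distance(pop1, pop2):.4f}")
print(f"D(pop1, pop1) = {nei_standard_distance(pop1, pop1):.4f}")
```

Note how the within-population homozygosities Jx and Jy enter the denominator: this is the route by which within-population diversity affects the distance, the complication the abstract points out.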
Correcting Too Much or Too Little? The Performance of Three Chi-Square Corrections.
Foldnes, Njål; Olsson, Ulf Henning
2015-01-01
This simulation study investigates the performance of three test statistics, T1, T2, and T3, used to evaluate structural equation model fit under non-normal data conditions. T1 is the well-known mean-adjusted statistic of Satorra and Bentler. T2 is a mean-and-variance adjusted statistic of Satterthwaite type in which the degrees of freedom are manipulated. T3 is a recently proposed version of T2 that does not manipulate the degrees of freedom. Discrepancies between these statistics and their nominal chi-square distribution, in terms of Type I and Type II errors, are investigated. All statistics are shown to be sensitive to increasing kurtosis in the data, with Type I error rates often far from the nominal level. Under excess kurtosis, true models are generally over-rejected by T1 and under-rejected by T2 and T3, which perform similarly in all conditions. Under misspecification there is a loss of power with increasing kurtosis, especially for T2 and T3. The coefficient of variation of the nonzero eigenvalues of a certain matrix is shown to be a reliable indicator of the adequacy of these statistics.
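The logic behind such corrections can be sketched numerically. Under the null, statistics of this family are asymptotically distributed as a weighted sum of independent chi-square(1) variables, with weights equal to the nonzero eigenvalues mentioned above. The simulation below uses hypothetical eigenvalues and simplified forms of the mean-scaled (T1-style) and Satterthwaite-type (T2-style) adjustments; it is an illustration of the mechanism, not the paper's exact statistics:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)

# Hypothetical nonzero eigenvalues of the relevant weight matrix; under the
# null the fit statistic T is asymptotically sum(lam_i * chi2_1).
lam = np.array([1.0, 1.3, 1.8, 2.5])
d = len(lam)

reps = 200_000
T = (lam * rng.chisquare(1, size=(reps, d))).sum(axis=1)

# Mean-scaled (Satorra-Bentler type): rescale T so its mean matches chi2_d.
T1 = T * d / lam.sum()

# Mean-and-variance adjusted (Satterthwaite type): refer T/a to chi2_b,
# matching the first two moments of the mixture.
a = (lam**2).sum() / lam.sum()
b = lam.sum()**2 / (lam**2).sum()

alpha = 0.05
rej_naive = np.mean(T > chi2.ppf(1 - alpha, d))
rej_T1 = np.mean(T1 > chi2.ppf(1 - alpha, d))
rej_T2 = np.mean(T / a > chi2.ppf(1 - alpha, b))

cv = lam.std() / lam.mean()  # the adequacy indicator studied in the paper
print(f"CV of eigenvalues: {cv:.2f}")
print(f"rejection rates  naive: {rej_naive:.3f}  T1: {rej_T1:.3f}  T2: {rej_T2:.3f}")
```

The more heterogeneous the eigenvalues (larger CV), the worse the chi-square approximation for the uncorrected and mean-scaled statistics, which is why the coefficient of variation serves as an adequacy indicator.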
Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha
2017-01-01
The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ 2 distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Privacy-preserving Kruskal-Wallis test.
Guo, Suxin; Zhong, Sheng; Zhang, Aidong
2013-10-01
Statistical tests are powerful tools for data analysis. The Kruskal-Wallis test is a non-parametric statistical test that evaluates whether two or more samples are drawn from the same distribution. It is commonly used in various areas. Sometimes, however, its use is impeded by privacy concerns raised in fields such as biomedical research and clinical data analysis because of the confidential information contained in the data. In this work, we give a privacy-preserving solution for the Kruskal-Wallis test which enables two or more parties to jointly perform the test on the union of their data without compromising their data privacy. To the best of our knowledge, this is the first work that solves the privacy issues in the use of the Kruskal-Wallis test on distributed data. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
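Without the privacy layer, the underlying test itself is a one-liner in standard libraries. A plain (non-private) example with three made-up samples:

```python
from scipy.stats import kruskal

# Three independent samples; the test asks whether they could come from
# the same underlying distribution (rank-based, no normality assumption).
group_a = [2.9, 3.0, 2.5, 2.6, 3.2]
group_b = [3.8, 2.7, 4.0, 2.4]
group_c = [2.8, 3.4, 3.7, 2.2, 2.0]

stat, p = kruskal(group_a, group_b, group_c)
print(f"H = {stat:.3f}, p = {p:.3f}")
```

The statistic H is referred to a chi-square distribution with (number of groups - 1) degrees of freedom; here the samples overlap heavily, so the null of a common distribution is not rejected.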
Structural texture similarity metrics for image analysis and retrieval.
Zujovic, Jana; Pappas, Thrasyvoulos N; Neuhoff, David L
2013-07-01
We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that, according to human judgment, are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform peak signal-to-noise ratio (PSNR), the structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.
Herbst, Daniel P
2014-09-01
Micropore filters are used during extracorporeal circulation to prevent gaseous and solid particles from entering the patient's systemic circulation. Although these devices improve patient safety, limitations in current designs have prompted the development of a new concept in micropore filtration. A prototype of the new design was made using 40-μm filter screens and compared against four commercially available filters for performance in pressure loss and gross air handling. Pre- and postfilter bubble counts for 5- and 10-mL bolus injections in an ex vivo test circuit were recorded using a Doppler ultrasound bubble counter. Statistical analysis of results for bubble volume reduction between test filters was performed with one-way repeated-measures analysis of variance using Bonferroni post hoc tests. Changes in filter performance with changes in microbubble load were also assessed with dependent t tests using the 5- and 10-mL bolus injections as the paired sample for each filter. Significance was set at p < .05. All filters in the test group were comparable in pressure loss performance, showing a range of 26-33 mmHg at a flow rate of 6 L/min. In gross air-handling studies, the prototype showed improved bubble volume reduction, reaching statistical significance with three of the four commercial filters. All test filters showed decreased performance in bubble volume reduction when the microbubble load was increased. Findings from this research support the underpinning theories of a sequential arterial-line filter design and suggest that improvements in microbubble filtration may be possible using this technique.
Li, Dongrui; Cheng, Zhigang; Chen, Gang; Liu, Fangyi; Wu, Wenbo; Yu, Jie; Gu, Ying; Liu, Fengyong; Ren, Chao; Liang, Ping
2018-04-03
To test the accuracy and efficacy of a multimodality imaging-compatible insertion robot with a respiratory motion calibration module designed for ablation of liver tumors in phantom and animal models, and to evaluate and compare the influence of intervention experience on robot-assisted and ultrasound-guided ablation procedures. Accuracy tests on a rigid body/phantom model with a respiratory movement simulation device, and microwave ablation tests on porcine liver tumor and rabbit liver cancer models, were performed either with the robot we designed or under traditional ultrasound guidance, by physicians with or without intervention experience. In the accuracy tests performed by physicians without intervention experience, the insertion accuracy and efficiency of the robot-assisted group were higher than those of the ultrasound-guided group, with statistically significant differences. In the microwave ablation tests performed by physicians without intervention experience, a higher complete ablation rate was achieved when the robot was applied. In the microwave ablation tests performed by physicians with intervention experience, there was no statistically significant difference in insertion number or total ablation time between the robot-assisted and ultrasound-guided groups. Evaluation by the NASA-TLX suggested that the robot-assisted insertion and microwave ablation procedures were more comfortable for physicians both with and without experience. The multimodality imaging-compatible insertion robot with a respiratory motion calibration module could increase insertion accuracy and ablation efficacy and minimize the influence of the physicians' experience, and the ablation procedure could be performed more comfortably and with less stress with the application of the robot.
Improved Statistics for Genome-Wide Interaction Analysis
Ueki, Masao; Cordell, Heather J.
2012-01-01
Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. 
We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670
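A minimal sketch of the general idea behind correlation-based interaction statistics of the kind discussed above, not Wu et al.'s exact formula: compare the SNP-SNP genotype correlation in cases against that in controls via Fisher's z transform. The data below are simulated under the null (no interaction), so the statistic should be unremarkable:

```python
import numpy as np
from scipy.stats import norm

def correlation_interaction_test(g1_cases, g2_cases, g1_controls, g2_controls):
    """Case/control interaction screen in the spirit of correlation-based
    epistasis statistics: compare the inter-locus genotype correlation in
    cases with that in controls via Fisher's z transform. Illustrative only;
    real methods add corrections for main effects and linkage disequilibrium."""
    def fisher_z(g1, g2):
        r = np.corrcoef(g1, g2)[0, 1]
        return 0.5 * np.log((1 + r) / (1 - r)), len(g1)

    z1, n1 = fisher_z(g1_cases, g2_cases)
    z0, n0 = fisher_z(g1_controls, g2_controls)
    se = np.sqrt(1.0 / (n1 - 3) + 1.0 / (n0 - 3))
    z = (z1 - z0) / se
    return z, 2 * norm.sf(abs(z))

rng = np.random.default_rng(2)
n = 1000
# Null scenario: the two SNPs are independent in both cases and controls.
z, p = correlation_interaction_test(
    rng.integers(0, 3, n), rng.integers(0, 3, n),
    rng.integers(0, 3, n), rng.integers(0, 3, n))
print(f"z = {z:.2f}, p = {p:.3f}")
```

The point criticized in the paper is visible in this construction: anything that shifts the inter-locus correlation differentially between cases and controls, including main effects combined with linkage disequilibrium, moves z, so a naive version of the statistic is not purely a test of interaction.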
Testing effects of consumer richness, evenness and body size on ecosystem functioning.
Reiss, Julia; Bailey, R A; Perkins, Daniel M; Pluchinotta, Angela; Woodward, Guy
2011-11-01
1. Numerous studies have revealed (usually positive) relationships between biodiversity and ecosystem functioning (B-EF), but the underpinning drivers are rarely addressed explicitly, hindering the development of a more predictive understanding. 2. We developed a suite of statistical models (where we combined existing models with novel ones) to test for richness and evenness effects on detrital processing in freshwater microcosms. Instead of using consumer species as biodiversity units, we used two size classes within three species (six types). This allowed us to test for diversity effects and also to focus on the role of body size and biomass. 3. Our statistical models tested for (i) whether performance in polyculture was more than the sum of its parts (non-additive effects), (ii) the effects of specific type combinations (assemblage identity effects) and (iii) whether types behaved differently when their absolute or relative abundances were altered (e.g. because type abundance in polyculture was lower compared with monoculture). The latter point meant we did not need additional density treatments. 4. Process rates were independent of richness and evenness and all types performed in an additive fashion. The performance of a type was mainly driven by the consumers' metabolic requirements (connected to body size). On an assemblage level, biomass explained a large proportion of detrital processing rates. 5. We conclude that B-EF studies would benefit from widening their statistical approaches. Further, they need to consider biomass of species assemblages and whether biomass is comprised of small or large individuals, because even if all species are present in the same biomass, small species (or individuals) will perform better. © 2011 The Authors. Journal of Animal Ecology © 2011 British Ecological Society.
Aalizadeh, Bahman; Mohammadzadeh, Hassan; Khazani, Ali; Dadras, Ali
2016-01-01
Background: Physical exercises can influence anthropometric and fitness components differently. The aim of the present study was to evaluate how a relatively long-term training program in 11-14-year-old male Iranian students affects their anthropometric and motor performance measures. Methods: Measurements were conducted on the anthropometric and fitness components of participants (n = 28) prior to and following the program. They trained for 20 weeks, 1.5 h/session with 10 min rest, in four trampoline training sessions per week. Motor performance of all participants was assessed using the standing long jump and vertical jump from the Eurofit Test Battery. Results: Repeated-measures analysis of variance (ANOVA) showed a statistically significant main effect of time for calf girth P = 0.001, fat% P = 0.01, vertical jump P = 0.001, and long jump P = 0.001, and a statistically significant main effect of group for fat% P = 0.001. Post hoc paired t-tests indicated statistically significant differences in the trampoline group between the two measurements for calf girth (t = -4.35, P = 0.001), fat% (t = 5.87, P = 0.001), vertical jump (t = -5.53, P = 0.001), and long jump (t = -10.00, P = 0.001). Conclusions: We can conclude that 20 weeks of trampoline training with four physical activity sessions/week in 11-14-year-old students has a significant effect on body fat% reduction and yields effective results in terms of anaerobic physical fitness. It is therefore suggested that different training approaches, such as trampoline exercises, can help students improve their level of health and motor performance. PMID:27512557
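The post hoc comparisons reported above are ordinary paired t-tests. A minimal example with hypothetical pre/post measurements, mirroring the paired design of the study:

```python
import numpy as np
from scipy.stats import ttest_rel

# Hypothetical pre/post vertical-jump heights (cm) for 10 students.
pre  = np.array([28.1, 30.4, 25.9, 27.3, 31.0, 26.5, 29.2, 27.8, 30.1, 26.0])
post = np.array([30.2, 31.5, 27.8, 28.9, 32.4, 28.1, 30.9, 29.5, 31.8, 27.4])

# Paired (dependent) t-test: each subject serves as their own control,
# so the test is performed on the within-subject differences.
t, p = ttest_rel(post, pre)
print(f"t = {t:.2f}, p = {p:.4f}")
```

Pairing removes between-subject variability from the error term, which is why pre/post designs like this one can detect training effects with small samples.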
Harnessing Multivariate Statistics for Ellipsoidal Data in Structural Geology
NASA Astrophysics Data System (ADS)
Roberts, N.; Davis, J. R.; Titus, S.; Tikoff, B.
2015-12-01
Most structural geology articles do not state significance levels, report confidence intervals, or perform regressions to find trends. This is, in part, because structural data tend to include directions, orientations, ellipsoids, and tensors, which are not treatable by elementary statistics. We describe a full procedural methodology for the statistical treatment of ellipsoidal data, using a reconstructed dataset of deformed ooids in Maryland from Cloos (1947) to illustrate the process. Normalized ellipsoids have five degrees of freedom and can be represented by a second-order tensor. This tensor can be mapped to a five-dimensional vector that belongs to a vector space and can be treated with standard multivariate statistics. Cloos made several claims about the distribution of deformation in the South Mountain fold, Maryland, and we reexamine two particular claims using hypothesis testing: 1) octahedral shear strain increases towards the axial plane of the fold; 2) finite strain orientation varies systematically along the trend of the axial trace as it bends with the Appalachian orogen. We then test the null hypothesis that the southern segment of South Mountain is the same as the northern segment; this test illustrates the application of ellipsoidal statistics, which combine both orientation and shape. We report confidence intervals for each test and graphically display our results with novel plots. This poster illustrates the importance of statistics in structural geology, especially when working with noisy or small datasets.
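The tensor-to-vector step can be sketched as follows. The basis below is one possible orthonormal choice for trace-free symmetric matrices and not necessarily the authors' exact convention:

```python
import numpy as np

def ellipsoid_to_vector(E):
    """Map a normalized ellipsoid tensor (symmetric positive-definite 3x3,
    here normalized to det = 1) to a 5-vector: take the matrix logarithm,
    which is symmetric and trace-free for a unit-volume ellipsoid, and
    expand it in an orthonormal basis of that 5-dimensional space."""
    w, V = np.linalg.eigh(E)
    L = V @ np.diag(np.log(w)) @ V.T        # matrix logarithm, trace ~ 0
    # Coordinates orthonormal under the Frobenius inner product:
    return np.array([
        (L[0, 0] - L[1, 1]) / np.sqrt(2.0),
        (L[0, 0] + L[1, 1] - 2.0 * L[2, 2]) / np.sqrt(6.0),
        np.sqrt(2.0) * L[0, 1],
        np.sqrt(2.0) * L[0, 2],
        np.sqrt(2.0) * L[1, 2],
    ])

# A unit-volume, axis-aligned ellipsoid with semi-axes a >= b >= c:
a, b, c = 2.0, 1.0, 0.5
E = np.diag([a * a, b * b, c * c])   # det(E) = 1

v = ellipsoid_to_vector(E)
print(v)
```

Because the map is an isometry (the Euclidean norm of v equals the Frobenius norm of log E), ordinary multivariate tools such as means, confidence regions, and hypothesis tests can be applied directly to these 5-vectors.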
Examination of Test and Item Statistics from Visual and Verbal Mathematics Questions
ERIC Educational Resources Information Center
Alpayar, Cagla; Gulleroglu, H. Deniz
2017-01-01
The aim of this research is to determine whether students' test performance and approaches to test questions change based on the type of mathematics questions (visual or verbal) administered to them. This research is based on a mixed-design model. The quantitative data are gathered from 297 seventh grade students, attending seven different middle…
NASA Technical Reports Server (NTRS)
Davis, Richard E.; Maddalon, Dal V.; Wagner, Richard D.; Fisher, David F.; Young, Ronald
1989-01-01
Summary evaluations of the performance of laminar-flow control (LFC) leading edge test articles on a NASA JetStar aircraft are presented. Statistics, presented for the test articles' performance in haze and cloud situations, as well as in clear air, show a significant effect of cloud particle concentrations on the extent of laminar flow. The cloud particle environment was monitored by two instruments, a cloud particle spectrometer (Knollenberg probe) and a charging patch. Both instruments are evaluated as diagnostic aids for avoiding laminar-flow detrimental particle concentrations in future LFC aircraft operations. The data base covers 19 flights in the simulated airline service phase of the NASA Leading-Edge Flight-Test (LEFT) Program.
Dexterity testing of chemical-defense gloves. Technical report
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robinette, K.M.; Ervin; Zehner, G.F.
1986-05-01
Chemical-defense gloves (12.5-mil Epichlorohydrin/Butyl, 14-mil Epichlorohydrin/Butyl, and 7-mil Butyl with Nomex overgloves) were subjected to four dexterity tests (O'Connor Finger Dexterity Test, Pennsylvania Bi-Manual Worksample-Assembly, Minnesota Rate of Manipulation Turning, and the Crawford Small Parts Dexterity Test). Results indicated that subjects' performances were most impaired by the 7-mil Butyl with Nomex overglove. Though differences between the other three gloved conditions were not always statistically significant, subjects performed slightly better while wearing the Epichlorohydrin/Butyl gloves, no matter which thickness, than they did while wearing the 15-mil butyl gloves. High negative correlation between anthropometry and gloved test scores suggested that poor glove fit may also have affected subjects' performances.
Further statistics in dentistry, Part 5: Diagnostic tests for oral conditions.
Petrie, A; Bulman, J S; Osborn, J F
2002-12-07
A diagnostic test is a simple test, sometimes based on a clinical measurement, which is used when the gold-standard test providing a definitive diagnosis of a given condition is too expensive, invasive or time-consuming to perform. The diagnostic test can be used to diagnose a dental condition in an individual patient or as a screening device in a population of apparently healthy individuals.
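The usual performance measures for such a test against the gold standard follow directly from the 2x2 table of test result versus true disease status; the counts below are hypothetical:

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Standard performance measures for a diagnostic test evaluated
    against a gold standard, from the 2x2 table counts:
    tp/fp = true/false positives, fn/tn = false/true negatives."""
    sensitivity = tp / (tp + fn)   # P(test positive | disease present)
    specificity = tn / (tn + fp)   # P(test negative | disease absent)
    ppv = tp / (tp + fp)           # P(disease present | test positive)
    npv = tn / (tn + fn)           # P(disease absent  | test negative)
    return sensitivity, specificity, ppv, npv

# Hypothetical screening of 200 patients against the gold standard:
sens, spec, ppv, npv = diagnostic_summary(tp=45, fp=10, fn=5, tn=140)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} PPV={ppv:.2f} NPV={npv:.2f}")
```

Sensitivity and specificity characterize the test itself, while the predictive values also depend on disease prevalence, which matters when the same test is reused as a screening device in an apparently healthy population.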
Communication skills in individuals with spastic diplegia.
Lamônica, Dionísia Aparecida Cusin; Paiva, Cora Sofia Takaya; Abramides, Dagma Venturini Marques; Biazon, Jamile Lozano
2015-01-01
To assess communication skills in children with spastic diplegia. The study included 20 subjects: 10 preschool children with spastic diplegia and 10 typically developing children matched for gender, mental age, and socioeconomic status. Assessment procedures were the following: interviews with parents, the Stanford-Binet test, the Gross Motor Function Classification System, Observation of Communicative Behavior, the Peabody Picture Vocabulary Test, the Denver Developmental Screening Test II, and the MacArthur Communicative Development Inventory. Statistical analysis was performed using the mean, median, minimum and maximum values, and using Student's t-test, the Mann-Whitney test, and the paired t-test. Individuals with spastic diplegia, when compared with peers of the same mental age, presented no significant difference in receptive and expressive vocabulary or in fine motor, adaptive, personal-social, and language skills. The most affected area in individuals with spastic cerebral palsy was gross motor skills. Participation in intervention procedures and the pairing of participants according to mental age may have brought the groups' performance closer together. There was no statistically significant difference between groups, indicating appropriate communication skills, although the experimental group did not behave homogeneously.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts and restricted availability of individual-level genotype-phenotype data across cohorts limit the conduct of multivariate tests. We introduce metaCCA, a computational framework for summary-statistics-based analysis of a single study or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows excellent agreement with the pooled individual-level analysis of the original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
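The core computation that metaCCA extends can be sketched from summary-level input alone: given a joint covariance (or correlation) matrix over genotypes and phenotypes, the canonical correlations are the singular values of a whitened cross-covariance block. The sketch below omits the covariance shrinkage the method adds for robustness, and the matrix is hypothetical:

```python
import numpy as np

def canonical_correlations(S, p):
    """Canonical correlations between the first p variables (genotypes)
    and the remaining ones (phenotypes), computed from a joint covariance
    or correlation matrix S alone -- the kind of summary-level input that
    a framework like metaCCA assembles from univariate GWAS output."""
    Sxx, Syy = S[:p, :p], S[p:, p:]
    Sxy = S[:p, p:]

    def inv_sqrt(A):
        # Inverse matrix square root via the eigendecomposition of a
        # symmetric positive-definite matrix.
        w, V = np.linalg.eigh(A)
        return V @ np.diag(w ** -0.5) @ V.T

    # Singular values of the whitened cross-covariance are the
    # canonical correlations, sorted in decreasing order.
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)

# Hypothetical correlation matrix: 2 genotype and 2 phenotype variables.
S = np.array([
    [1.0, 0.1, 0.4, 0.2],
    [0.1, 1.0, 0.3, 0.1],
    [0.4, 0.3, 1.0, 0.5],
    [0.2, 0.1, 0.5, 1.0],
])
r = canonical_correlations(S, p=2)
print(np.round(r, 3))
```

Because only the covariance blocks enter, no individual-level records are needed; the practical difficulty, which the shrinkage step addresses, is that blocks estimated from different summary sources need not form a well-conditioned joint matrix.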
Ng, M L; Warlow, R S; Chrishanthan, N; Ellis, C; Walls, R
2000-09-01
The aim of this study is to formulate criteria for the definition of allergic rhinitis. Other studies have sought to develop scoring systems to categorize the severity of allergic rhinitis symptoms, but these were never used for the formulation of diagnostic criteria; such scoring systems were arbitrarily chosen and not derived by any statistical analysis. To date, a study of this kind has not been performed. The hypothesis of this study is that it is possible to formulate criteria for the definition of allergic rhinitis. This is the first study to systematically examine and evaluate the relative importance of symptoms, signs and investigative tests in allergic rhinitis. We sought to statistically rank, from the most to the least important, the multiplicity of symptoms, signs and test results. Forty-seven allergic rhinitis and 23 normal subjects were evaluated with a detailed questionnaire and history, physical examination, serum total immunoglobulin E, skin prick tests and serum enzyme allergosorbent tests (EAST). Statistical ranking of variables indicated that rhinitis symptoms (nasal, ocular and oronasal) were the most commonly occurring, followed by a history of allergen provocation, then serum total IgE, positive skin prick tests and positive EASTs to house dust mite, perennial rye and bermuda/couch grass. Throat symptoms ranked even lower, while EASTs to cat epithelia, plantain and cockroach were the least important. Not all symptoms, signs and tests evaluated proved to be statistically significant when compared with the control group; this included symptoms and signs which had been considered historically to be traditionally associated with allergic rhinitis, e.g. sore throat and bleeding nose. In performing the statistical analyses, we were able to rank, from most to least important, the multiplicity of symptoms, signs and test results.
The most important symptoms and signs were identified for the first time, even though some of these were not included in our original selection criteria for defining the disease cohort, i.e. sniffing, postnasal drip, oedematous nasal mucosa, impaired sense of smell, mouth breathing, itchy nose and many of the specific provocation factors.
An explorative study of school performance and antipsychotic medication.
van der Schans, J; Vardar, S; Çiçek, R; Bos, H J; Hoekstra, P J; de Vries, T W; Hak, E
2016-09-21
Antipsychotic therapy can reduce severe symptoms of psychiatric disorders; however, data on school performance among children on such treatment are lacking. The objective was to explore school performance among children using antipsychotic drugs at the end of primary education. A cross-sectional study was conducted using the University of Groningen pharmacy database linked to academic achievement scores at the end of primary school (Dutch Cito-test) obtained from Statistics Netherlands. Mean Cito-test scores and standard deviations were obtained for children on antipsychotic therapy and for reference children, and statistically compared using analysis of covariance. In addition, differences in subgroups such as boys versus girls, ethnicity, household income, and late starters (start date within 12 months of the Cito-test) versus early starters (start date > 12 months before the Cito-test) were tested. In all, data from 7994 children could be linked to Cito-test scores. At the time of the Cito-test, 45 (0.6 %) were on treatment with antipsychotics. Children using antipsychotics scored on average 3.6 points lower than the reference peer group (534.5 ± 9.5). Scores differed across gender and levels of household income (p < 0.05). Scores of early starters were significantly higher than those of late starters (533.7 ± 1.7 vs. 524.1 ± 2.6). This first exploration showed that children on antipsychotic treatment have lower school performance than the reference peer group at the end of primary school. This was most noticeable for girls, but early starters were less affected than late starters. Due to the observational cross-sectional nature of this study, no causality can be inferred, but the results indicate that school performance should be closely monitored and that the causes of underperformance despite treatment warrant more research.
Evaluation program for secondary spacecraft cells
NASA Technical Reports Server (NTRS)
Christy, D. E.
1972-01-01
The life cycle test of secondary spacecraft electric cells is discussed. The purpose of the tests is to ensure that all cells put into the life cycle test meet the required specifications. The evaluation program gathers statistical information concerning cell performance characteristics and limitations. Weaknesses in cell design which are discovered during the tests are reported to research facilities in order to increase the service life of the cells.
NASA Astrophysics Data System (ADS)
Skorobogatiy, Maksim; Sadasivan, Jayesh; Guerboukha, Hichem
2018-05-01
In this paper, we first discuss the main types of noise in a typical pump-probe system, and then focus specifically on terahertz time domain spectroscopy (THz-TDS) setups. We then introduce four statistical models for the noisy pulses obtained in such systems, and detail rigorous mathematical algorithms to de-noise such traces, find the proper averages and characterise various types of experimental noise. Finally, we perform a comparative analysis of the performance, advantages and limitations of the algorithms by testing them on experimental data collected using a particular THz-TDS system available in our laboratories. We conclude that using advanced statistical models for trace averaging results in fitting errors that are significantly smaller than those obtained when only a simple statistical average is used.
Assessment of surface hardness of acrylic resins submitted to accelerated artificial aging.
Tornavoi, D C; Agnelli, J A M; Lepri, C P; Mazzetto, M O; Botelho, A L; Soares, R G; Dos Reis, A C
2012-06-01
The aim of this study was to assess the influence of accelerated artificial aging (AAA) on the surface hardness of acrylic resins. The following three commercial brands of acrylic resins were tested: Vipi Flash (autopolymerized resin), Vipi Wave (microwave heat-polymerized resin) and Vipi Cril (conventional heat-polymerized resin). To perform the tests, 21 test specimens (65x10x3 mm) were made, 7 for each resin. Three surface hardness readings were performed for each test specimen, before and after AAA, and the means were submitted to the following tests: Kolmogorov-Smirnov (P>0.05), Levene statistic, two-way ANOVA, and Tukey post hoc (P<0.05), with the SPSS Statistical Software 17.0. The analysis of the factors showed significant differences in the hardness values (P<0.05). Before aging, the autopolymerized acrylic resin Vipi Flash showed lower hardness values than the heat-polymerized resin Vipi Cril (P=0.001). After aging, the 3 materials showed similar performance when compared among themselves. Vipi Cril was the only material affected by AAA, showing lower hardness values after this procedure (P=0.003). It may be concluded that accelerated artificial aging influenced the surface hardness of the heat-polymerized acrylic resin Vipi Cril.
Non-Asbestos Insulation Testing Using a Plasma Torch
NASA Technical Reports Server (NTRS)
Morgan, R. E.; Prince, A. S.; Selvidge, S. A.; Phelps, J.; Martin, C. L.; Lawrence, T. W.
2000-01-01
Insulation obsolescence issues are a major concern for the Reusable Solid Rocket Motor (RSRM). As old sources of raw materials disappear, new sources must be found and qualified. No simple, inexpensive test presently exists for predicting the erosion performance of a candidate insulation in the full-scale motor. Large motor tests cost millions of dollars and therefore can only be used on a few very select candidates. There is a need for a simple, low cost method of screening insulation performance that can simulate some of the different erosion environments found in the RSRM. This paper describes a series of erosion tests on two different non-asbestos insulation formulations, a KEVLAR(registered) fiber-filled and a carbon fiber-filled insulation containing Ethylene-Propylene-Diene Monomer (EPDM) rubber as the binder. The test instrument was a plasma torch device. The two main variables investigated were heat flux and alumina particle impingement concentration. Statistical analysis revealed that the two different formulations had very different responses to the main variables. The results of this work indicate that there may be fundamental differences in how these insulation formulations perform in the motor operating environment. The plasma torch appears to offer a low-cost means of obtaining a fundamental understanding of insulation response to critical factors in a series of statistically designed experiments.
ERIC Educational Resources Information Center
Awang-Hashim, Rosa; O'Neil, Harold F., Jr.; Hocevar, Dennis
2002-01-01
The relations between the motivational constructs effort, self-efficacy and worry and statistics achievement were investigated in a sample of 360 undergraduates in Malaysia. Both trait (cross-situational) and state (task-specific) measures of each construct were used to test a mediational trait → state → performance (TSP) model. As hypothesized,…
ERIC Educational Resources Information Center
Gadway, Charles J.; Wilson, H.A.
This document provides statistical data on the 1974 and 1975 Mini-Assessment of Functional Literacy, which was designed to determine the extent of functional literacy among seventeen year olds in America. Also presented are data from comparable test items from the 1971 assessment. Three standards are presented, to allow different methods of…
Effect of Table Tennis Trainings on Biomotor Capacities in Boys
ERIC Educational Resources Information Center
Tas, Murat
2017-01-01
The aim of this study is to investigate whether table tennis training affects the biomotor capacities of boys. A total of 40 students aged 10-12, randomly assigned to a test group (n = 20) and a control group (n = 20), participated in the research. Statistical analysis of data was performed using Statistic Package for Social Science…
ERIC Educational Resources Information Center
Jeske, Debora; Roßnagell, Christian Stamov; Backhaus, Joy
2014-01-01
We examined the role of learner characteristics as predictors of four aspects of e-learning performance: knowledge test performance, learning confidence, learning efficiency, and navigational effectiveness. We used both self-reports and log file records to compute the relevant statistics. Regression analyses showed that both need for…
Correlations among Jamaican 12th-Graders' Five Variables and Performance in Genetics
ERIC Educational Resources Information Center
Bloomfield, Deen-Paul; Soyibo, Kola
2008-01-01
This study was aimed at finding out if the level of performance of selected Jamaican Grade 12 students on an achievement test on the concept of genetics was satisfactory; if there were statistically significant differences in their performance on the concept linked to their gender, self-esteem, cognitive abilities in biology, school-type and…
Design and Test of Pseudorandom Number Generator Using a Star Network of Lorenz Oscillators
NASA Astrophysics Data System (ADS)
Cho, Kenichiro; Miyano, Takaya
We have recently developed a chaos-based stream cipher based on augmented Lorenz equations as a star network of Lorenz subsystems. In our method, the augmented Lorenz equations are used as a pseudorandom number generator. In this study, we propose a new method based on the augmented Lorenz equations for generating binary pseudorandom numbers and evaluate its security using the statistical tests of SP800-22 published by the National Institute for Standards and Technology in comparison with the performances of other chaotic dynamical models used as binary pseudorandom number generators. We further propose a faster version of the proposed method and evaluate its security using the statistical tests of TestU01 published by L’Ecuyer and Simard.
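The general recipe, chaotic trajectory to bit stream to SP800-22 statistic, can be sketched as follows. A single classic Lorenz system with sign-based bit extraction stands in for the paper's augmented Lorenz star network, whose equations are not reproduced here, and only the frequency (monobit) test from SP800-22 is shown; step size, sampling interval and initial conditions are illustrative assumptions.

```python
import math

def lorenz_bits(n, x=1.0, y=1.0, z=1.0, dt=0.005,
                sigma=10.0, rho=28.0, beta=8.0 / 3.0, skip=20):
    """Crude bit stream from a Lorenz trajectory (illustrative only).

    One bit is taken from the sign of x every `skip` Euler steps.
    """
    bits = []
    while len(bits) < n:
        for _ in range(skip):
            dx = sigma * (y - x)
            dy = x * (rho - z) - y
            dz = x * y - beta * z
            x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        bits.append(1 if x > 0 else 0)
    return bits

def monobit_pvalue(bits):
    """SP800-22 frequency (monobit) test p-value via the complementary error function."""
    s = sum(2 * b - 1 for b in bits)          # map {0,1} -> {-1,+1} and sum
    s_obs = abs(s) / math.sqrt(len(bits))
    return math.erfc(s_obs / math.sqrt(2))
```

A raw sign sequence sampled this finely is strongly correlated and would fail the fuller SP800-22 battery; the point of the paper's construction is precisely to produce a stream that passes such suites.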
SIRU utilization. Volume 1: Theory, development and test evaluation
NASA Technical Reports Server (NTRS)
Musoff, H.
1974-01-01
The theory, development, and test evaluations of the Strapdown Inertial Reference Unit (SIRU) are discussed. The statistical failure detection and isolation, single position calibration, and self alignment techniques are emphasized. Circuit diagrams of the system components are provided. Mathematical models are developed to show the performance characteristics of the subsystems. Specific areas of the utilization program are identified as: (1) error source propagation characteristics and (2) local level navigation performance demonstrations.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman analysis and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. PMID:28747393
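The two complementary analyses can be sketched in a few lines. Note that the paper uses Deming regression for assay comparison, which also models error in x; the ordinary least-squares fit below is a deliberate simplification, and the function names are illustrative.

```python
import statistics

def bland_altman(x, y):
    """Bland-Altman bias and 95% limits of agreement for paired measurements.

    Constant error appears as a nonzero mean difference (bias); proportional
    error appears as a trend of the differences against the pairwise means.
    """
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

def ols_line(x, y):
    """Ordinary least-squares slope and intercept of y on x.

    A slope away from 1 indicates proportional error; an intercept away
    from 0 indicates constant error. (Deming regression, used in the
    paper for method comparison, additionally allows for error in x.)
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx
```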
Generalized functional linear models for gene-based case-control association studies.
Fan, Ruzong; Wang, Yifan; Mills, James L; Carter, Tonia C; Lobach, Iryna; Wilson, Alexander F; Bailey-Wilson, Joan E; Weeks, Daniel E; Xiong, Momiao
2014-11-01
By using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative since they generate lower type I errors than nominal levels, and global tests of the mixed effect models generate accurate type I errors. Furthermore, we found that the Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT-O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), the Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT-O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT-O on real neural tube defects and Hirschsprung's disease datasets. The Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT-O in the real data analysis. Our methods can be used in either gene-disease genome-wide/exome-wide association studies or candidate gene analyses. © 2014 WILEY PERIODICALS, INC.
Kocer, Naci; Mondel, Prabath Kumar; Yamac, Elif; Kavak, Ayse; Kizilkilic, Osman; Islak, Civan
2017-11-01
Flow diverters are increasingly used in the treatment of complex and giant intracranial aneurysms. However, they are associated with complications such as late aneurysmal rupture. Additionally, flow diverters can show a focal structural decrease in luminal diameter without any intimal hyperplasia, which resembles a "fish mouth" when viewed en face. In this pilot study, we tested the hypothesis of a possible association between flow diverter fish-mouthing and delayed-type hypersensitivity to its metal constituents. We retrospectively reviewed patient records from our center between May 2010 and November 2015. A total of nine patients had flow diverter fish-mouthing. A control group of 25 patients was selected. All study participants underwent a prospective patch test to detect hypersensitivity to flow diverter metal constituents. Analysis was performed using logistic regression and the Wilcoxon signed-rank test. Univariate and multivariate analyses were performed to test variables predicting flow diverter fish-mouthing. The association between flow diverter fish-mouthing and a positive patch test was not statistically significant. In multivariate analysis, history of allergy and maximum aneurysm size category were associated with flow diverter fish-mouthing. This was further confirmed by the Wilcoxon signed-rank test. The study showed a statistically significant association between flow diverter fish-mouthing and both a history of contact allergy and a small aneurysmal size. Further large-scale studies are needed to detect a statistically significant association between flow diverter fish-mouthing and patch test results. We recommend early and more frequent follow-up imaging in patients with contact allergy to detect flow diverter fish-mouthing and its subsequent evolution.
Equivalence Testing of Complex Particle Size Distribution Profiles Based on Earth Mover's Distance.
Hu, Meng; Jiang, Xiaohui; Absar, Mohammad; Choi, Stephanie; Kozak, Darby; Shen, Meiyu; Weng, Yu-Ting; Zhao, Liang; Lionberger, Robert
2018-04-12
Particle size distribution (PSD) is an important property of particulates in drug products. In the evaluation of generic drug products formulated as suspensions, emulsions, and liposomes, PSD comparisons between a test product and the branded product can provide useful information regarding in vitro and in vivo performance. Historically, the FDA has recommended the population bioequivalence (PBE) statistical approach to compare the PSD descriptors D50 and SPAN of test and reference products to support product equivalence. In this study, the earth mover's distance (EMD) is proposed as a new metric for comparing PSDs, particularly when the PSD profile exhibits a complex distribution (e.g., multiple peaks) that is not accurately described by the D50 and SPAN descriptors. EMD is a statistical metric that measures the discrepancy (distance) between size distribution profiles without a prior assumption of the distribution. PBE is then adopted to perform a statistical test to establish equivalence based on the calculated EMD distances. Simulations show that the proposed EMD-based approach is effective in comparing test and reference profiles for equivalence testing and is superior to commonly used distance measures, e.g., Euclidean and Kolmogorov-Smirnov distances. The proposed approach was demonstrated by evaluating the equivalence of cyclosporine ophthalmic emulsion PSDs that were manufactured under different conditions. Our results show that the proposed approach can effectively pass an equivalent product (e.g., reference product against itself) and reject an inequivalent product (e.g., reference product against a negative control), thus suggesting its usefulness in supporting bioequivalence determination of a test product to the reference product when both possess multimodal PSDs.
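For PSD profiles binned on a common ordered size grid, the one-dimensional earth mover's distance reduces to the area between the two cumulative distribution functions, which a short sketch can compute directly; the paper's full procedure then feeds such distances into the PBE test, which is not shown here, and the function name is illustrative.

```python
def emd_1d(p, q, bin_width=1.0):
    """1-D earth mover's distance between two histograms on a shared grid.

    Each histogram is normalized to a probability distribution; the EMD is
    then the sum of absolute CDF differences times the bin width, i.e. the
    minimal "mass times distance" needed to turn one profile into the other.
    """
    sp, sq = float(sum(p)), float(sum(q))
    cp = cq = 0.0
    d = 0.0
    for a, b in zip(p, q):
        cp += a / sp
        cq += b / sq
        d += abs(cp - cq) * bin_width
    return d
```

Because the metric is distribution-free, it handles multimodal profiles that a single D50/SPAN pair cannot summarize: shifting a peak by two bins costs twice as much as shifting it by one.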
An automated system for chromosome analysis. Volume 1: Goals, system design, and performance
NASA Technical Reports Server (NTRS)
Castleman, K. R.; Melnyk, J. H.
1975-01-01
The design, construction, and testing of a complete system to produce karyotypes and chromosome measurement data from human blood samples, and a basis for statistical analysis of quantitative chromosome measurement data, are described. The prototype was assembled, tested, and evaluated on clinical material and thoroughly documented.
Using VITA Service Learning Experiences to Teach Hypothesis Testing and P-Value Analysis
ERIC Educational Resources Information Center
Drougas, Anne; Harrington, Steve
2011-01-01
This paper describes a hypothesis testing project designed to capture student interest and stimulate classroom interaction and communication. Using an online survey instrument, the authors collected student demographic information and data regarding university service learning experiences. Introductory statistics students performed a series of…
[Evaluation of using statistical methods in selected national medical journals].
Sych, Z
1996-01-01
This paper evaluates the frequency with which statistical methods were applied in papers published in six selected national medical journals in the years 1988-1992. The following journals were chosen: Klinika Oczna, Medycyna Pracy, Pediatria Polska, Polski Tygodnik Lekarski, Roczniki Państwowego Zakładu Higieny, and Zdrowie Publiczne. From the respective volumes of Pol. Tyg. Lek., a number of papers was randomly selected to match the average number in the remaining journals. Papers in which no statistical analysis was implemented were excluded, for both national and international publications; review papers, case reports, reviews of books and handbooks, monographs, reports from scientific congresses, and papers on historical topics were likewise excluded. The number of papers was determined for each volume. Next, the mode of sample selection in each study was analyzed, distinguishing two categories: random and targeted selection. Attention was also paid to the presence of a control sample in individual papers, and to the characterization of the sample, classified into three categories: complete, partial, and lacking. The results of the analysis are presented in tables and figures (Tab. 1, 3). The rate of use of statistical methods was determined across the relevant volumes of the six journals for 1988-1992, along with the number of papers in which no statistical methods were used and the frequency with which individual statistical methods were applied.
Particular attention was given to fundamental methods of descriptive statistics (measures of position, measures of dispersion) and to the most important methods of mathematical statistics: parametric tests of significance, analysis of variance (in single and dual classification), non-parametric tests of significance, and correlation and regression. Papers using multiple correlation, multiple regression, or more complex methods for studying relationships among two or more variables were counted with those using correlation and regression; other methods included statistical methods used in epidemiology (coefficients of incidence and morbidity, standardization of coefficients, survival tables), factor analysis by the Jacobi-Hotelling method, taxonomic methods, and others. On the basis of the performed studies, it was established that statistical methods were used in 61.1-66.0% of the analyzed papers in the six national medical journals for 1988-1992 (Tab. 3), a frequency broadly similar to that reported for English-language medical journals. No significant differences were found across the years 1988-1992 in the frequency of the statistical methods applied (Tab. 4) or in the frequency of random samples (Tab. 3). The statistical methods used most often in the papers analyzed for 1988-1992 were measures of position (44.2-55.6%), measures of dispersion (32.5-38.5%), and parametric tests of significance (26.3-33.1% of the papers analyzed) (Tab. 4). To increase the frequency and reliability of the statistical methods used, the teaching of biostatistics should be expanded in medical studies and in postgraduate training for physicians and scientific-didactic staff.
Definition of osteoarthritis on MRI: results of a Delphi exercise.
Hunter, D J; Arden, N; Conaghan, P G; Eckstein, F; Gold, G; Grainger, A; Guermazi, A; Harvey, W; Jones, G; Hellio Le Graverand, M P; Laredo, J D; Lo, G; Losina, E; Mosher, T J; Roemer, F; Zhang, W
2011-08-01
Despite a growing body of Magnetic Resonance Imaging (MRI) literature in osteoarthritis (OA), there is little uniformity in its diagnostic application. We envisage, in the first instance, the definition requiring further validation and testing in the research setting before considering implementation/feasibility testing in the clinical setting. The objective of our research was to develop an MRI definition of structural OA. We undertook a multistage process consisting of a number of different steps. The intent was to develop testable definitions of OA (knee, hip and/or hand) on MRI. This was an evidence-driven approach, with the results of a systematic review provided to the group prior to a Delphi exercise. Each participant of the steering group was allowed to submit independently up to five propositions related to key aspects of MRI diagnosis of knee OA. The steering group then participated in a Delphi exercise to reach consensus on which propositions to recommend for a definition of structural OA on MRI. For each round of voting, ≥60% of votes led to inclusion and ≤20% of votes led to exclusion of a proposition. After the propositions were developed, one of the definitions was tested for its validity against radiographic OA in an extant database. For the systematic review, we identified 25 studies that met all of our inclusion criteria and contained relevant diagnostic measure and performance data. At the completion of the Delphi voting exercise, 11 propositions were accepted for the definition of structural OA on MRI. We assessed the diagnostic performance of the tibiofemoral MRI definition against a radiographic reference standard. The diagnostic performance for individual features was: osteophyte C statistic=0.61, cartilage loss C statistic=0.73, bone marrow lesions C statistic=0.72, and meniscus tear in any region C statistic=0.78. The overall composite model for these four features had a C statistic=0.59.
We detected good specificity (1.0) but less than optimal sensitivity (0.46), likely due to detection of disease earlier on MRI. We have developed an MRI definition of knee OA that requires further formal testing with regard to its diagnostic performance (especially in datasets of persons with early disease) before it is more widely used. Our current analysis suggests that further testing should focus on comparisons other than the radiograph, which may capture later-stage disease and thus nullify the potential for detecting early disease that MRI may afford. The propositions are not intended to detract from, nor to discourage, the use of traditional means of diagnosing OA. Copyright © 2011 Osteoarthritis Research Society International. All rights reserved.
Using Relative Statistics and Approximate Disease Prevalence to Compare Screening Tests.
Samuelson, Frank; Abbey, Craig
2016-11-01
Schatzkin et al. and other authors demonstrated that the ratios of some conditional statistics, such as the true positive fraction, are equal to the ratios of unconditional statistics, such as disease detection rates. Therefore, we can calculate these ratios between two screening tests on the same population even if patients with negative tests are not followed with a reference procedure and the true- and false-negative rates are unknown. We demonstrate that this same property applies to an expected utility metric. We also demonstrate that, while simple estimates of relative specificities and relative areas under ROC curves (AUC) do depend on the unknown negative rates, we can write these ratios in terms of disease prevalence, and the dependence of these ratios on a posited prevalence is often weak, particularly if that prevalence is small or the performance of the two screening tests is similar. Therefore, we can estimate relative specificity or relative AUC with little loss of accuracy if we use an approximate value of disease prevalence.
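The relative-specificity calculation from per-screen rates and a posited prevalence can be sketched as follows; the variable names and the example rates in the usage note are illustrative, not taken from the paper.

```python
def relative_specificity(pos_rate_a, det_rate_a, pos_rate_b, det_rate_b, prevalence):
    """Ratio of the specificities of two screening tests on one population.

    Inputs are per-screen rates: pos_rate = fraction of screens called
    positive, det_rate = fraction with verified disease detected. The
    false-positive fraction of each test is then (pos_rate - det_rate)
    divided by the non-diseased fraction (1 - prevalence). Without
    follow-up of negatives the prevalence is unknown, so a value must be
    posited, but the ratio is insensitive to it when prevalence is small.
    """
    spec_a = 1.0 - (pos_rate_a - det_rate_a) / (1.0 - prevalence)
    spec_b = 1.0 - (pos_rate_b - det_rate_b) / (1.0 - prevalence)
    return spec_a / spec_b
```

For example, with recall rates of 10% and 12% and a common detection rate of 0.5%, varying the posited prevalence from 0.5% to 2% moves the specificity ratio by only a few parts in ten thousand, which is the weak dependence the abstract describes.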
On use of the multistage dose-response model for assessing laboratory animal carcinogenicity
Nitcheva, Daniella; Piegorsch, Walter W.; West, R. Webster
2007-01-01
We explore how well a statistical multistage model describes dose-response patterns in laboratory animal carcinogenicity experiments from a large database of quantal response data. The data are collected from the U.S. EPA’s publicly available IRIS data warehouse and examined statistically to determine how often higher-order terms in the multistage predictor yield significant improvements in explanatory power over lower-order terms. Our results suggest that the addition of a second-order parameter to the model improves the fit only about 20% of the time, while adding even higher-order terms apparently does not contribute to the fit at all, at least with the study designs captured in the IRIS database. Also included is an examination of statistical tests for assessing the significance of higher-order terms in a multistage dose-response model. It is noted that bootstrap testing methodology appears to offer greater stability for performing the hypothesis tests than the more common, but possibly unstable, “Wald” test. PMID:17490794
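The multistage model and the binomial log-likelihood underlying such nested comparisons can be sketched as follows. The significance of a higher-order coefficient is then judged by comparing log-likelihoods of fits with and without it; the paper finds a bootstrap reference distribution more stable than the Wald test, in part because the constraint that each coefficient be non-negative puts the null on the parameter boundary. The constrained fitting step itself is omitted, and the function names are illustrative.

```python
import math

def multistage_p(d, q):
    """Multistage dose-response: P(d) = 1 - exp(-(q0 + q1*d + ... + qk*d^k)),
    with every coefficient qi >= 0."""
    s = sum(qi * d ** i for i, qi in enumerate(q))
    return 1.0 - math.exp(-s)

def binom_loglik(doses, affected, totals, q):
    """Binomial log-likelihood of quantal (affected/total) data under the
    multistage model with coefficient vector q."""
    ll = 0.0
    for d, x, n in zip(doses, affected, totals):
        p = min(max(multistage_p(d, q), 1e-12), 1.0 - 1e-12)  # guard log(0)
        ll += x * math.log(p) + (n - x) * math.log(1.0 - p)
    return ll
```

Doubling the log-likelihood gap between, say, a first-order fit q = (q0, q1) and a second-order fit q = (q0, q1, q2) gives the statistic whose null distribution the bootstrap approximates.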
Meier, Frederick A; Souers, Rhona J; Howanitz, Peter J; Tworek, Joseph A; Perrotta, Peter L; Nakhleh, Raouf E; Karcher, Donald S; Bashleben, Christine; Darcy, Teresa P; Schifman, Ron B; Jones, Bruce A
2015-06-01
Many production systems employ standardized statistical monitors that measure defect rates and cycle times as indices of performance quality. Clinical laboratory testing, a system that produces test results, is amenable to such monitoring. The objective was to demonstrate patterns in clinical laboratory testing defect rates and cycle times using 7 College of American Pathologists Q-Tracks program monitors. Subscribers measured monthly rates of outpatient order-entry errors, identification band defects, and specimen rejections; median troponin order-to-report cycle times and rates of STAT test receipt-to-report turnaround time outliers; and critical values reporting event defects and corrected reports. From these submissions, Q-Tracks program staff produced quarterly and annual reports. These charted each subscriber's performance relative to other participating laboratories, as well as aggregate and subgroup performance over time, dividing participants into best performers, median performers, and performers with the most room to improve. Each monitor's pattern of change presents the percentile distribution of subscribers' performance in relation to monitoring duration and number of participating subscribers. Changes over time in defect frequencies and cycle duration quantify the effects of monitor participation on performance. All monitors showed significant decreases in defect rates as the 7 monitors ran variously for 6, 6, 7, 11, 12, 13, and 13 years. The most striking decreases occurred among performers who initially had the most room to improve and among subscribers who participated the longest. All 7 monitors registered significant improvement, with participation effects ranging from 0.85% to 5.1% improvement per quarter of participation.
Using statistical quality measures, collecting data monthly, and receiving reports quarterly and yearly, subscribers to a comparative monitoring program documented significant decreases in defect rates and shortening of a cycle time for 6 to 13 years in all 7 ongoing clinical laboratory quality monitors.
Pathway analysis with next-generation sequencing data.
Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao
2015-04-01
Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type I error rates. The power of the SFPCA-based statistic and of 22 additional existing statistics is also evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for identifying pathway associations than other existing methods.
de Sá, Joceline Cássia Ferezini; Marini, Gabriela; Gelaleti, Rafael Bottaro; da Silva, João Batista; de Azevedo, George Gantas; Rudge, Marilza Vieira Cunha
2013-11-01
To evaluate the methodological and statistical design evolution of publications in the Brazilian Journal of Gynecology and Obstetrics (RBGO) since resolution 196/96, a review of 133 articles published in 1999 (65) and 2009 (68) was performed by two independent reviewers with training in clinical epidemiology and scientific research methodology. We included all original clinical articles, case reports, and case series, and excluded editorials, letters to the editor, systematic reviews, experimental studies, opinion articles, and abstracts of theses and dissertations. Characteristics related to the methodological quality of the studies were analyzed in each article using a checklist that evaluated two criteria: methodological aspects and statistical procedures. We used descriptive statistics and the χ2 test for comparison of the two years. There was a difference between 1999 and 2009 regarding study design and statistical procedures, with greater accuracy in the procedures and the use of more robust tests in 2009. In RBGO, we observed an evolution in the methods of published articles and a more in-depth use of statistical analyses, with more sophisticated tests such as regression and multilevel analyses, which are essential techniques for the understanding and planning of health interventions, leading to fewer interpretation errors.
Mechanical Impact Testing: A Statistical Measurement
NASA Technical Reports Server (NTRS)
Engel, Carl D.; Herald, Stephen D.; Davis, S. Eddie
2005-01-01
In the decades since the 1950s, when NASA first developed mechanical impact testing of materials, researchers have continued efforts to gain a better understanding of the chemical, mechanical, and thermodynamic nature of the phenomenon. The impact mechanism is a real combustion ignition mechanism that must be understood in the design of an oxygen system. The use of data from this test method has been questioned due to the lack of a clear method for applying the data and due to the variability found between tests, material batches, and facilities. This effort examines a large database that has accumulated over a number of years and characterizes its overall nature. Moreover, testing was performed to determine the statistical nature of the test procedure and to help establish sample-size guidelines for material characterization. The current method of determining a pass/fail criterion based on light emission, sound report, or material charring is also questioned.
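The sample-size question raised above can be framed with elementary binomial logic: if a material's true per-impact ignition probability is p0, then observing n consecutive non-ignitions rejects an ignition rate of p0 or higher at level α once (1 - p0)^n ≤ α. A small illustrative sketch of that calculation (not NASA's actual pass/fail criterion):

```python
# Binomial sample-size logic for a zero-failure pass criterion (illustrative).
import math

def zero_failure_sample_size(p0, alpha=0.05):
    """Smallest n such that n impacts with no ignition reject
    H0: ignition probability >= p0 at level alpha, using
    P(no ignitions in n trials | p0) = (1 - p0)**n <= alpha."""
    return math.ceil(math.log(alpha) / math.log(1.0 - p0))

# To demonstrate the ignition rate is below 10% with 95% confidence:
n_needed = zero_failure_sample_size(0.10)  # 29 consecutive clean impacts
```

The steep growth of n_needed as p0 shrinks is one reason small fixed-size batch tests give unstable pass/fail outcomes.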
Information Input and Performance in Small Decision Making Groups.
ERIC Educational Resources Information Center
Ryland, Edwin Holman
It was hypothesized that increases in the amount and specificity of information furnished to a discussion group would facilitate group decision making and improve other aspects of group and individual performance. Procedures in testing these assumptions included varying the amounts of statistics, examples, testimony, and augmented information…
ERIC Educational Resources Information Center
Stoneberg, Bert D.
2016-01-01
Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests (ISAT). ISAT results have been reported almost exclusively as "percent proficient" statistics (i.e., the percentage of Idaho students who performed at the "A" level…
ERIC Educational Resources Information Center
Stoneberg, Bert D.
2018-01-01
Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests. ISAT results have been reported almost exclusively as "percent proficient or above" statistics (i.e., the percentage of Idaho students who performed at the "A" level). This…
ERIC Educational Resources Information Center
Lix, Lisa M.; And Others
1996-01-01
Meta-analytic techniques were used to summarize the statistical robustness literature on Type I error properties of alternatives to the one-way analysis of variance "F" test. The James (1951) and Welch (1951) tests performed best under violations of the variance homogeneity assumption, although their use is not always appropriate. (SLD)
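Welch's (1951) test, one of the two best performers cited above, replaces the pooled error term of the classical F test with reliability weights w_i = n_i/s_i², which is what makes it robust to variance heterogeneity. A sketch of the statistic follows; scipy ships the classical test as scipy.stats.f_oneway but not Welch's version, so it is written out here, and the simulated groups are illustrative.

```python
# Hedged sketch of Welch's (1951) heteroscedasticity-robust one-way test.
import numpy as np
from scipy.stats import f as f_dist

def welch_anova(*groups):
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    means = np.array([np.mean(g) for g in groups])
    w = n / np.array([np.var(g, ddof=1) for g in groups])   # w_i = n_i / s_i^2
    grand = np.sum(w * means) / np.sum(w)                   # weighted grand mean
    num = np.sum(w * (means - grand) ** 2) / (k - 1)
    tmp = np.sum((1.0 - w / np.sum(w)) ** 2 / (n - 1.0))
    den = 1.0 + 2.0 * (k - 2.0) / (k**2 - 1.0) * tmp
    stat = num / den
    df2 = (k**2 - 1.0) / (3.0 * tmp)                        # approximate denominator df
    return stat, f_dist.sf(stat, k - 1, df2)

rng = np.random.default_rng(1)
# Equal means, strongly unequal variances: the setting where the classical
# F test violates its nominal Type I error rate but Welch's test holds it.
a = rng.normal(0, 1, 20); b = rng.normal(0, 5, 10); c = rng.normal(0, 10, 10)
stat, p = welch_anova(a, b, c)
```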
Alles, Susan; Peng, Linda X; Mozola, Mark A
2009-01-01
A modification to Performance-Tested Method 010403, GeneQuence Listeria Test (DNAH method), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30 degrees C, and environmental samples for 24-48 h at 30 degrees C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there were statistically significant differences in method performance between the DNAH method and reference culture procedures for only 2 foods (pasteurized crab meat and lettuce) at the 27 h enrichment time point and for only a single food (pasteurized crab meat) in one trial at the 30 h enrichment time point. Independent laboratory testing with 3 foods showed statistical equivalence between the methods for all foods, and results support the findings of the internal trials. Overall, considering both internal and independent laboratory trials, sensitivity of the DNAH method relative to the reference culture procedures was 90.5%. Results of testing 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the DNAH method was more productive than the reference U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS) culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. monocytogenes confirmed the effectiveness of the DNAH method at the 24 h time point. Overall, sensitivity of the DNAH method at 24 h relative to that of the USDA-FSIS method was 152%. 
The DNAH method exhibited extremely high specificity, with only 1% false-positive reactions overall.
Somnier, F E; Ostergaard, M S; Boysen, G; Bruhn, P; Mikkelsen, B O
1990-01-01
In order to examine if the nootropic drug, aniracetam, was capable of improving cognitive performance, 44 subjects suffering from chronic psychosyndrome after long-term exposure to organic solvents were included in a randomized, double-blind, placebo-controlled, cross-over study. The treatment periods were 3 months with aniracetam 1 g daily and 3 months with placebo. Neuropsychological tests as well as a physical and neurological examination were performed at entry into the study and after each treatment period, together with an evaluation of the subjects' overall condition. Neither the doctors' nor the subjects' own assessment of the overall condition indicated that the trial medication had had any effect. No significant changes in neuropsychological symptoms were observed. A statistically significant difference in favour of aniracetam was found in only 1 of the 19 neuropsychological test measures, namely a test for constructional ability. However, in another test on visuo-spatial function, a statistically significant result was found in favour of placebo. Thus, aniracetam was found to be ineffective in the treatment of subjects suffering from chronic psychosyndrome after long-term exposure to organic solvents.
Statistics for X-chromosome associations.
Özbek, Umut; Lin, Hui-Min; Lin, Yan; Weeks, Daniel E; Chen, Wei; Shaffer, John R; Purcell, Shaun M; Feingold, Eleanor
2018-06-13
In a genome-wide association study (GWAS), association between genotype and phenotype at autosomal loci is generally tested by regression models. However, X-chromosome data are often excluded from published analyses of autosomes because of the difference between males and females in number of X chromosomes. Failure to analyze X-chromosome data at all is obviously less than ideal, and can lead to missed discoveries. Even when X-chromosome data are included, they are often analyzed with suboptimal statistics. Several mathematically sensible statistics for X-chromosome association have been proposed. The optimality of these statistics, however, is based on very specific simple genetic models. In addition, while previous simulation studies of these statistics have been informative, they have focused on single-marker tests and have not considered the types of error that occur even under the null hypothesis when the entire X chromosome is scanned. In this study, we comprehensively tested several X-chromosome association statistics using simulation studies that include the entire chromosome. We also considered a wide range of trait models for sex differences and phenotypic effects of X inactivation. We found that models that do not incorporate a sex effect can have large type I error in some cases. We also found that many of the best statistics perform well even when there are modest deviations, such as trait variance differences between the sexes or small sex differences in allele frequencies, from assumptions. © 2018 WILEY PERIODICALS, INC.
Pharmacy Students' Test-Taking Motivation-Effort on a Low-Stakes Standardized Test
2011-01-01
Objective: To measure third-year pharmacy students' level of motivation while completing the Pharmacy Curriculum Outcomes Assessment (PCOA) administered as a low-stakes test, to better understand use of the PCOA as a measure of student content knowledge. Methods: Student motivation was manipulated through an incentive (ie, personal letter from the dean) and a process of statistical motivation filtering. Data were analyzed to determine any differences between the experimental and control groups in PCOA test performance, motivation to perform well, and test performance after filtering for low motivation-effort. Results: Incentivizing students diminished the need for filtering PCOA scores for low effort. Where filtering was used, performance scores improved, providing a more realistic measure of aggregate student performance. Conclusions: To ensure that PCOA scores are an accurate reflection of student knowledge, incentivizing and/or filtering for low motivation-effort among pharmacy students should be considered fundamental best practice when the PCOA is administered as a low-stakes test. PMID:21655395
Melody and pitch processing in five musical savants with congenital blindness.
Pring, Linda; Woolf, Katherine; Tadic, Valerie
2008-01-01
We examined absolute-pitch (AP) and short-term musical memory abilities of five musical savants with congenital blindness, seven musicians, and seven non-musicians with good vision and normal intelligence in two experiments. In the first, short-term memory for musical phrases was tested and the savants and musicians performed statistically indistinguishably, both significantly outperforming the non-musicians and remembering more material from the C major scale sequences than random trials. In the second experiment, participants learnt associations between four pitches and four objects using a non-verbal paradigm. This experiment approximates to testing AP ability. Low statistical power meant the savants were not statistically better than the musicians, although only the savants scored statistically higher than the non-musicians. The results are evidence for a musical module, separate from general intelligence; they also support the anecdotal reporting of AP in musical savants, which is thought to be necessary for the development of musical-savant skill.
An astronomer's guide to period searching
NASA Astrophysics Data System (ADS)
Schwarzenberg-Czerny, A.
2003-03-01
We concentrate on the analysis of unevenly sampled time series, interrupted by periodic gaps, as often encountered in astronomy. While some of our conclusions may appear surprising, all are based on the classical statistical principles of Fisher and his successors. Except for the discussion of resolution issues, it is best for the reader to temporarily forget about Fourier transforms and to concentrate on the problem of fitting a time series with a model curve. According to their statistical content, we divide the issues into several sections, consisting of: (ii) statistical and numerical aspects of model fitting; (iii) evaluation of fitted models as hypothesis testing; (iv) the role of orthogonal models in signal detection; (v) conditions for equivalence of periodograms; and (vi) rating sensitivity by test power. An experienced observer working with individual objects would benefit little from a formalized statistical approach. However, we demonstrate the usefulness of this approach in evaluating the performance of periodograms and in the quantitative design of large variability surveys.
NASA Astrophysics Data System (ADS)
Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.
2005-05-01
A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination: we find reliability-based reweighting but not statistically optimal cue combination.
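The statistically optimal (minimum-variance unbiased) rule these models build on weights each unbiased cue estimate by its reliability 1/σ²; the combined estimate then has variance 1/Σ(1/σ_i²), never worse than the best single cue. A textbook sketch with hypothetical visual and haptic slant estimates:

```python
# Minimum-variance unbiased cue combination (textbook sketch, not the authors' code).
import numpy as np

def combine_cues(estimates, variances):
    r = 1.0 / np.asarray(variances, dtype=float)   # reliabilities 1/sigma^2
    weights = r / r.sum()                          # normalized reliability weights
    combined = float(np.dot(weights, estimates))   # weighted-average percept
    combined_var = 1.0 / r.sum()                   # variance of the combination
    return combined, combined_var, weights

# Hypothetical cues: visual slant 30 deg (variance 4), haptic slant 36 deg (variance 12).
est, var, w = combine_cues([30.0, 36.0], [4.0, 12.0])  # est = 31.5, var = 3.0
```

Note that the combined variance (3.0) is below the better cue's variance (4.0), which is the signature prediction such experiments test against human performance.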
Helicopter Acoustic Flight Test with Altitude Variation and Maneuvers
NASA Technical Reports Server (NTRS)
Watts, Michael E.; Greenwood, Eric; Sim, Ben; Stephenson, James; Smith, Charles D.
2016-01-01
A cooperative flight test campaign between NASA and the U.S. Army was performed from September 2014 to February 2015. The purposes of the testing were to: investigate the effects of altitude variation on noise generation, investigate the effects of gross weight variation on noise generation, establish the statistical variability in acoustic flight testing of helicopters, and characterize the effects of transient maneuvers on radiated noise for a medium-lift utility helicopter. This test was performed at three test sites (0, 4000, and 7000 feet above mean sea level) with two aircraft (AS350 SD1 and EH-60L) tested at each site. This report provides an overview of the test, documents the data acquired and describes the formats of the stored data.
Tree-space statistics and approximations for large-scale analysis of anatomical trees.
Feragen, Aasa; Owen, Megan; Petersen, Jens; Wille, Mathilde M W; Thomsen, Laura H; Dirksen, Asger; de Bruijne, Marleen
2013-01-01
Statistical analysis of anatomical trees is hard to perform due to differences in the topological structure of the trees. In this paper we define statistical properties of leaf-labeled anatomical trees with geometric edge attributes by considering the anatomical trees as points in the geometric space of leaf-labeled trees. This tree-space is a geodesic metric space where any two trees are connected by a unique shortest path, which corresponds to a tree deformation. However, tree-space is not a manifold, and the usual strategy of performing statistical analysis in a tangent space and projecting onto tree-space is not available. Using tree-space and its shortest paths, a variety of statistical properties, such as mean, principal component, hypothesis testing and linear discriminant analysis can be defined. For some of these properties it is still an open problem how to compute them; others (like the mean) can be computed, but efficient alternatives are helpful in speeding up algorithms that use means iteratively, like hypothesis testing. In this paper, we take advantage of a very large dataset (N = 8016) to obtain computable approximations, under the assumption that the data trees parametrize the relevant parts of tree-space well. Using the developed approximate statistics, we illustrate how the structure and geometry of airway trees vary across a population and show that airway trees with Chronic Obstructive Pulmonary Disease come from a different distribution in tree-space than healthy ones. Software is available from http://image.diku.dk/aasa/software.php.
Statistical testing of association between menstruation and migraine.
Barra, Mathias; Dahl, Fredrik A; Vetvik, Kjersti G
2015-02-01
To repair and refine a previously proposed method for statistical analysis of the association between migraine and menstruation. Menstrually related migraine (MRM) affects about 20% of female migraineurs in the general population. The pathophysiological link from menstruation to migraine is hypothesized to run through fluctuations in female reproductive hormones, but the exact mechanisms remain unknown. Therefore, the main diagnostic criterion today is concurrency of migraine attacks with menstruation. Methods that exclude spurious associations are needed, so that further research into these mechanisms can be performed on a population with a true association. The statistical method is based on a simple two-parameter null model of MRM (which allows for simulation modeling) and Fisher's exact test (with mid-p correction) applied to standard 2 × 2 contingency tables derived from the patients' headache diaries. Our method is a corrected version of a previously published flawed framework. To the best of our knowledge, no other published methods for establishing a menstruation-migraine association by statistical means exist today. The probabilistic methodology shows good performance when subjected to receiver operating characteristic curve analysis. Quick-reference cutoff values for the clinical setting were tabulated for assessing association given a patient's headache history. In this paper, we correct a proposed method for establishing the association between menstruation and migraine by statistical methods. We conclude that the proposed standard of 3-cycle observations prior to setting an MRM diagnosis should be extended with at least one perimenstrual window to obtain sufficient information for statistical processing. © 2014 American Headache Society.
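The core computation, Fisher's exact test with the mid-p correction on a 2 × 2 diary table, can be sketched as follows; the counts are hypothetical, and the authors' null model and cutoff tabulation are not reproduced here.

```python
# One-sided Fisher mid-p test for a 2x2 diary table (migraine day vs not,
# perimenstrual day vs not). The mid-p correction counts only half the
# probability of the observed table, reducing the exact test's conservatism.
from scipy.stats import hypergeom

def fisher_midp_greater(table):
    (a, b), (c, d) = table                 # rows: in window / outside; cols: migraine / no
    M, n, N = a + b + c + d, a + c, a + b  # total days, migraine days, window days
    rv = hypergeom(M, n, N)
    exact_p = rv.sf(a - 1)                 # classical one-sided exact p: P(X >= a)
    mid_p = rv.sf(a) + 0.5 * rv.pmf(a)     # mid-p: P(X > a) + 0.5 * P(X = a)
    return float(exact_p), float(mid_p)

# Hypothetical 90-day diary: 12 of 15 perimenstrual days with migraine,
# versus 18 of 75 other days.
exact_p, mid_p = fisher_midp_greater([[12, 3], [18, 57]])
```

The mid-p value is always smaller than the exact p by half the observed table's probability, which is what restores the test's average Type I error rate toward the nominal level.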
Estimating the Proportion of True Null Hypotheses Using the Pattern of Observed p-values
Tong, Tiejun; Feng, Zeny; Hilton, Julia S.; Zhao, Hongyu
2013-01-01
Estimating the proportion of true null hypotheses, π0, has attracted much attention in the recent statistical literature. Besides its apparent relevance for a set of specific scientific hypotheses, an accurate estimate of this parameter is key for many multiple testing procedures. Most existing methods for estimating π0 in the literature are motivated from the independence assumption of test statistics, which is often not true in reality. Simulations indicate that most existing estimators in the presence of the dependence among test statistics can be poor, mainly due to the increase of variation in these estimators. In this paper, we propose several data-driven methods for estimating π0 by incorporating the distribution pattern of the observed p-values as a practical approach to address potential dependence among test statistics. Specifically, we use a linear fit to give a data-driven estimate for the proportion of true-null p-values in (λ, 1] over the whole range [0, 1] instead of using the expected proportion at 1 − λ. We find that the proposed estimators may substantially decrease the variance of the estimated true null proportion and thus improve the overall performance. PMID:24078762
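The family of estimators discussed here can be illustrated with a sketch: the classical Storey-type estimate is π0(λ) = #{p > λ}/(m(1 − λ)) for a single λ, and fitting a line to these tail proportions over a grid of λ values and extrapolating to λ = 1 smooths it, in the spirit of (though not identical to) the authors' proposal. Data below are simulated.

```python
# Hedged sketch of a linear-fit pi0 estimator (a smoothed Storey-type variant;
# details differ from the paper's data-driven estimators).
import numpy as np

def pi0_linear_fit(pvals, grid=None):
    p = np.asarray(pvals)
    if grid is None:
        grid = np.arange(0.05, 0.96, 0.05)
    # Storey-type estimate at each lambda: #{p > lambda} / (m * (1 - lambda)).
    est = np.array([np.mean(p > lam) / (1.0 - lam) for lam in grid])
    # Fit pi0(lambda) = a + b*lambda and extrapolate to lambda = 1,
    # where only true-null p-values should remain.
    b, a = np.polyfit(grid, est, 1)
    return float(np.clip(a + b * 1.0, 0.0, 1.0))

# Simulated mixture: 80% uniform (true null) p-values, 20% shifted toward zero.
rng = np.random.default_rng(0)
pvals = np.concatenate([rng.uniform(size=800), rng.beta(0.2, 5.0, size=200)])
pi0_hat = pi0_linear_fit(pvals)   # should land near the true pi0 of 0.8
```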
Metrology Standards for Quantitative Imaging Biomarkers
Obuchowski, Nancy A.; Kessler, Larry G.; Raunig, David L.; Gatsonis, Constantine; Huang, Erich P.; Kondratovich, Marina; McShane, Lisa M.; Reeves, Anthony P.; Barboriak, Daniel P.; Guimaraes, Alexander R.; Wahl, Richard L.
2015-01-01
Although investigators in the imaging community have been active in developing and evaluating quantitative imaging biomarkers (QIBs), the development and implementation of QIBs have been hampered by the inconsistent or incorrect use of terminology or methods for technical performance and statistical concepts. Technical performance is an assessment of how a test performs in reference objects or subjects under controlled conditions. In this article, some of the relevant statistical concepts are reviewed, methods that can be used for evaluating and comparing QIBs are described, and some of the technical performance issues related to imaging biomarkers are discussed. More consistent and correct use of terminology and study design principles will improve clinical research, advance regulatory science, and foster better care for patients who undergo imaging studies. © RSNA, 2015 PMID:26267831
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gruendell, B.D.; Barrows, E.S.; Borde, A.B.
1997-01-01
The objective of the bioassay reevaluation of the Hackensack River Federal Project was to reperform toxicity testing on proposed dredged material with current ammonia reduction protocols. Hackensack River was one of four waterways sampled and evaluated for dredging and disposal in April 1993. Sediment samples were re-collected from the Hackensack River Project area in August 1995. Tests and analyses were conducted according to the manual developed by the USACE and the U.S. Environmental Protection Agency (EPA), Evaluation of Dredged Material Proposed for Ocean Disposal (Testing Manual), commonly referred to as the "Green Book," and the regional manual developed by the USACE-NYD and EPA Region II, Guidance for Performing Tests on Dredged Material to be Disposed of in Ocean Waters. The reevaluation of proposed dredged material from the Hackensack River project area consisted of benthic acute toxicity tests. Thirty-three individual sediment core samples were collected from the Hackensack River project area. Three composite sediments, representing each reach of the area proposed for dredging, were used in benthic acute toxicity testing. Benthic acute toxicity tests were performed with the amphipod Ampelisca abdita and the mysid Mysidopsis bahia. The amphipod and mysid benthic toxicity test procedures followed EPA guidance for reduction of total ammonia concentrations in test systems prior to test initiation. Statistically significant acute toxicity was found in all three Hackensack River composites in the static renewal tests with A. abdita, but not in the static tests with M. bahia. Statistically significant acute toxicity and a greater than 20% increase in mortality over the reference sediment was found in the static renewal tests with A. abdita. Statistically significant mortality 10% over reference sediment was observed in the M. bahia static tests. 5 refs., 2 figs., 2 tabs.
Evaluation of dredged material proposed for ocean disposal from Arthur Kill Project Area, New York
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gruendell, B.D.; Barrows, E.S.; Borde, A.B.
1997-01-01
The objective of the bioassay reevaluation of the Arthur Kill Federal Project was to reperform toxicity testing on proposed dredged material following current ammonia reduction protocols. Arthur Kill was one of four waterways sampled and evaluated for dredging and disposal in April 1993. Sediment samples were recollected from the Arthur Kill Project areas in August 1995. Tests and analyses were conducted according to the manual developed by the USACE and the U.S. Environmental Protection Agency (EPA), Evaluation of Dredged Material Proposed for Ocean Disposal (Testing Manual), commonly referred to as the "Green Book," and the regional manual developed by the USACE-NYD and EPA Region II, Guidance for Performing Tests on Dredged Material to be Disposed of in Ocean Waters. The reevaluation of proposed dredged material from the Arthur Kill project areas consisted of benthic acute toxicity tests. Thirty-three individual sediment core samples were collected from the Arthur Kill project area. Three composite sediments, representing each reach of the area proposed for dredging, were used in benthic acute toxicity testing. Benthic acute toxicity tests were performed with the amphipod Ampelisca abdita and the mysid Mysidopsis bahia. The amphipod and mysid benthic toxicity test procedures followed EPA guidance for reduction of total ammonia concentrations in test systems prior to test initiation. Statistically significant acute toxicity was found in all Arthur Kill composites in the static renewal tests with A. abdita, but not in the static tests with M. bahia. Statistically significant acute toxicity and a greater than 20% increase in mortality over the reference sediment was found in the static renewal tests with A. abdita. M. bahia did not show statistically significant acute toxicity or a greater than 10% increase in mortality over reference sediment in static tests. 5 refs., 2 figs., 2 tabs.
Built-In-Test Equipment Requirements Workshop. Workshop Presentation
1981-08-01
quantitatively evaluated in test. (2) It is necessary to develop the statistical methods that should be used for predicting and confirming diagnostic...of different performance levels of BIT in peacetime and wartime applications, and the corresponding manpower and other support requirements should be...reports. The scope of the workshop involves the areas of requirements for built-in-test and diagnostics, and the methods of testing to ensure that the
Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob
2016-08-01
In many areas of science it is customary to perform many, potentially millions, of tests simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria, and the results might depend on the chosen criteria. Methods that summarize, or aggregate, test statistics or p-values without relying on a priori criteria are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the FWER is controlled in the strong sense. Validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures that rely on evaluating test statistics individually. Moreover, by being agnostic and not relying on predefined regions, it might be a practical alternative to conventionally used methods of aggregating p-values over regions. The method is implemented in Python and freely available online (through GitHub; see the Supplementary information).
A study of the relationship between depression symptom and physical performance in elderly women.
Lee, Yang Chool
2015-12-01
Depression is a general public health problem; there is an association between regular exercise or vigorous physical activity and depression. Physical activity has positive physical, mental, and emotional effects. The purpose of this study was to examine the relationship between depression symptoms and physical performance in elderly women. A total of 173 elderly women aged 65 to 80 participated in this study. We evaluated the women using the 6-min walk, grip-strength, 30-sec arm curl, 30-sec chair stand, 8-foot up and go, back scratch, chair sit and reach, and unipedal stance tests, measured body mass index (BMI), and assessed depression symptoms using the Korean version of the Geriatric Depression Scale (GDS-K). The collected data were analyzed using descriptive statistics, correlation analysis, paired t-tests, and simple linear regression in IBM SPSS Statistics ver. 21.0. There were significant correlations between GDS-K scores and the 6-min walk, 30-sec chair stand, 30-sec arm curl, chair sit and reach, 8-foot up and go, and grip strength tests (P<0.05), but not BMI, back scratch, and unipedal stance. When the participants were divided into two groups (GDS-K score≥14 and GDS-K score<14), there was a difference between the two groups in performance on the 6-min walk, 30-sec chair stand, 30-sec arm curl, chair sit and reach, 8-foot up and go, and grip strength tests. Physical performance factors were strongly associated with depression symptoms, suggesting that physical performance improvements may play an important role in preventing depression.
Brain imaging and cognition in young narcoleptic patients.
Huang, Yu-Shu; Liu, Feng-Yuan; Lin, Chin-Yang; Hsiao, Ing-Tsung; Guilleminault, Christian
2016-08-01
The relationship between functional brain images and performances in narcoleptic patients and controls is a new field of investigation. We studied 71 young, type 1 narcoleptic patients and 20 sex- and age-matched control individuals using brain positron emission tomography (PET) images and neurocognitive testing. Clinical investigation was carried out using sleep-wake evaluation questionnaires; a sleep-wake study was conducted with actigraphy, polysomnography, multiple sleep latency test (MSLT), and blood tests (with human leukocyte antigen typing). The continuous performance test (CPT) and Wisconsin card sorting test (WCST) were administered on the same day as the PET study. PET data were analyzed using Statistical Parametric Mapping (version 8) software. Correlation of brain imaging and neurocognitive function was performed by Pearson's correlation. Statistical analyses (Student's t-test) were conducted with SPSS version-18. Seventy-one narcoleptic patients (mean age: 16.15 years, 41 boys (57.7%)) and 20 controls (mean age: 15.1 years, 12 boys (60%)) were studied. Results from the CPT and WCST showed significantly worse scores in narcoleptic patients than in controls (P < 0.05). Compared to controls, narcoleptic patients presented with hypometabolism in the right mid-frontal lobe and angular gyrus (P < 0.05) and significant hypermetabolism in the olfactory lobe, hippocampus, parahippocampus, amygdala, fusiform, left inferior parietal lobe, left superior temporal lobe, striatum, basal ganglia and thalamus, right hypothalamus, and pons (P < 0.05) in the PET study. Changes in brain metabolic activity in narcoleptic patients were positively correlated with results from the sleepiness scales and performance tests. Young, type 1 narcoleptic patients face a continuous cognitive handicap. Our imaging cognitive test protocol can be useful for investigating the effects of treatment trials in these patients. Copyright © 2016 Elsevier B.V. All rights reserved.
Association factor analysis between osteoporosis with cerebral artery disease: The STROBE study.
Jin, Eun-Sun; Jeong, Je Hoon; Lee, Bora; Im, Soo Bin
2017-03-01
The purpose of this study was to determine the clinical factors associating osteoporosis with cerebral artery disease in a Korean population. Two hundred nineteen postmenopausal women and men undergoing cerebral computed tomography angiography were enrolled in this cross-sectional study to evaluate cerebral artery disease. Cerebral artery disease was diagnosed if there was narrowing of 50% or more of the diameter in one or more cerebral arteries or presence of vascular calcification. History of osteoporotic fracture was assessed using medical records and radiographic data such as simple radiography, MRI, and bone scans. Bone mineral density was measured by dual-energy x-ray absorptiometry. We reviewed clinical characteristics in all patients and also retrospectively performed subgroup analyses for the total and extracranial/intracranial cerebral artery disease groups. Statistical analysis was performed by means of the chi-square test or Fisher's exact test for categorical variables and Student's t-test or Wilcoxon's rank sum test for continuous variables. Univariate and multivariate logistic regression analyses were also conducted to assess the factors associated with the prevalence of cerebral artery disease. A two-tailed p-value of less than 0.05 was considered statistically significant. All statistical analyses were performed using R (version 3.1.3; The R Foundation for Statistical Computing, Vienna, Austria) and SPSS (version 14.0; SPSS, Inc., Chicago, IL, USA). Of the 219 patients, 142 had cerebral artery disease. Vertebral fractures were observed in 29 (13.24%) patients. There was a significant difference in hip fracture according to the presence or absence of cerebral artery disease. In logistic regression analysis, osteoporotic hip fracture was significantly associated with extracranial cerebral artery disease after adjusting for multiple risk factors.
In females, osteoporotic hip fracture was associated with total calcified cerebral artery disease. Clinical factors such as age, hypertension, osteoporotic hip fracture, smoking history, and anti-osteoporosis drug use were associated with cerebral artery disease.
Statistical Tests of System Linearity Based on the Method of Surrogate Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hunter, N.; Paez, T.; Red-Horse, J.
When dealing with measured data from dynamic systems we often make the tacit assumption that the data are generated by linear dynamics. While some systematic tests for linearity and determinism are available - for example the coherence function, the probability density function, and the bispectrum - further tests that quantify the existence and the degree of nonlinearity are clearly needed. In this paper we demonstrate a statistical test for the nonlinearity exhibited by a dynamic system excited by Gaussian random noise. We perform the usual division of the input and response time series data into blocks as required by the Welch method of spectrum estimation and search for significant relationships between a given input frequency and response at harmonics of the selected input frequency. We argue that systematic tests based on the recently developed statistical method of surrogate data readily detect significant nonlinear relationships. The paper elucidates the method of surrogate data. Typical results are illustrated for a linear single degree-of-freedom system and for a system with polynomial stiffness nonlinearity.
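A minimal sketch of the surrogate-data idea, under assumptions of our own (phase-randomized surrogates and a time-reversal asymmetry statistic, rather than the authors' Welch-based harmonic search): surrogates preserve the power spectrum of the data, so a statistic that differs markedly between data and surrogates signals nonlinearity.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Surrogate series with the same power spectrum (hence the same linear
    autocorrelation) as x, but with randomized Fourier phases."""
    n = len(x)
    X = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, len(X))
    phases[0] = 0.0            # keep the DC bin real
    if n % 2 == 0:
        phases[-1] = 0.0       # keep the Nyquist bin real for even n
    return np.fft.irfft(np.abs(X) * np.exp(1j * phases), n)

def trev(x, lag=1):
    """Time-reversal asymmetry statistic: near zero for linear Gaussian
    dynamics, typically nonzero for time-irreversible nonlinear dynamics."""
    d = x[lag:] - x[:-lag]
    return np.mean(d ** 3) / np.mean(d ** 2) ** 1.5

def surrogate_pvalue(x, stat=trev, n_surr=99, seed=0):
    """Rank-based surrogate-data test: the p-value is the fraction of
    surrogates whose |statistic| is at least as extreme as the data's."""
    rng = np.random.default_rng(seed)
    t0 = abs(stat(x))
    ts = [abs(stat(phase_randomized_surrogate(x, rng))) for _ in range(n_surr)]
    return (1 + sum(t >= t0 for t in ts)) / (n_surr + 1)

# Linear AR(1) series driven by Gaussian noise: the null should usually
# not be rejected (a large p-value is typical, though not guaranteed).
rng = np.random.default_rng(42)
e = rng.standard_normal(512)
x = np.empty(512)
x[0] = e[0]
for i in range(1, 512):
    x[i] = 0.7 * x[i - 1] + e[i]
p_linear = surrogate_pvalue(x)
```

The key invariant, and the reason the test is fair, is that each surrogate has exactly the same spectral magnitudes as the original series.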
Predicting Slag Generation in Sub-Scale Test Motors Using a Neural Network
NASA Technical Reports Server (NTRS)
Wiesenberg, Brent
1999-01-01
Generation of slag (aluminum oxide) is an important issue for the Reusable Solid Rocket Motor (RSRM). Thiokol performed testing to quantify the relationship between raw material variations and slag generation in solid propellants by testing sub-scale motors cast with propellant containing various combinations of aluminum fuel and ammonium perchlorate (AP) oxidizer particle sizes. The test data were analyzed using statistical methods and an artificial neural network. This paper primarily addresses the neural network results with some comparisons to the statistical results. The neural network showed that the particle sizes of both the aluminum and unground AP have a measurable effect on slag generation. The neural network analysis showed that aluminum particle size is the dominant driver in slag generation, about 40% more influential than AP. The network predictions of the amount of slag produced during firing of sub-scale motors were 16% better than the predictions of a statistically derived empirical equation. Another neural network successfully characterized the slag generated during full-scale motor tests. The success is attributable to the ability of neural networks to characterize multiple complex factors including interactions that affect slag generation.
Testing non-inferiority of a new treatment in three-arm clinical trials with binary endpoints.
Tang, Nian-Sheng; Yu, Bin; Tang, Man-Lai
2014-12-18
A two-arm non-inferiority trial without a placebo is usually adopted to demonstrate that an experimental treatment is not worse than a reference treatment by a small pre-specified non-inferiority margin due to ethical concerns. Selection of the non-inferiority margin and establishment of assay sensitivity are two major issues in the design, analysis and interpretation for two-arm non-inferiority trials. Alternatively, a three-arm non-inferiority clinical trial including a placebo is usually conducted to assess the assay sensitivity and internal validity of a trial. Recently, some large-sample approaches have been developed to assess the non-inferiority of a new treatment based on the three-arm trial design. However, these methods behave badly with small sample sizes in the three arms. This manuscript aims to develop some reliable small-sample methods to test three-arm non-inferiority. Saddlepoint approximation, exact and approximate unconditional, and bootstrap-resampling methods are developed to calculate p-values of the Wald-type, score and likelihood ratio tests. Simulation studies are conducted to evaluate their performance in terms of type I error rate and power. Our empirical results show that the saddlepoint approximation method generally behaves better than the asymptotic method based on the Wald-type test statistic. For small sample sizes, approximate unconditional and bootstrap-resampling methods based on the score test statistic perform better in the sense that their corresponding type I error rates are generally closer to the prespecified nominal level than those of other test procedures. Both approximate unconditional and bootstrap-resampling test procedures based on the score test statistic are generally recommended for three-arm non-inferiority trials with binary outcomes.
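For concreteness, the asymptotic Wald-type baseline that the small-sample methods improve on can be sketched for the retention-of-effect formulation of three-arm non-inferiority. The retention fraction theta and the counts below are illustrative assumptions, and, as the abstract notes, score-based exact or bootstrap procedures are preferable when the arms are small.

```python
import math

def wald_retention_test(x_e, n_e, x_r, n_r, x_p, n_p, theta=0.8):
    """Wald-type test of H0: p_E - p_P <= theta * (p_R - p_P), i.e. the
    experimental arm retains less than a fraction theta of the reference
    effect over placebo. Returns (z, one-sided p-value); a small p-value
    supports declaring non-inferiority."""
    pe, pr, pp = x_e / n_e, x_r / n_r, x_p / n_p
    est = pe - theta * pr - (1 - theta) * pp
    var = (pe * (1 - pe) / n_e
           + theta ** 2 * pr * (1 - pr) / n_r
           + (1 - theta) ** 2 * pp * (1 - pp) / n_p)
    z = est / math.sqrt(var)
    p = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # upper-tail normal
    return z, p

# Illustrative counts: experimental 90/100, reference 85/100, placebo 50/100
z, p = wald_retention_test(90, 100, 85, 100, 50, 100, theta=0.8)
```

With these counts the experimental arm comfortably retains 80% of the reference effect; shrinking the arms to, say, 20 patients each is exactly the regime where this normal approximation becomes unreliable.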
Waites, Anthony B; Mannfolk, Peter; Shaw, Marnie E; Olsrud, Johan; Jackson, Graeme D
2007-02-01
Clinical functional magnetic resonance imaging (fMRI) occasionally fails to detect significant activation, often due to variability in task performance. The present study seeks to test whether a more flexible statistical analysis can better detect activation, by accounting for variance associated with variable compliance to the task over time. Experimental results and simulated data both confirm that even at 80% compliance to the task, such a flexible model outperforms standard statistical analysis when assessed using the extent of activation (experimental data), goodness of fit (experimental data), and area under the receiver operating characteristic curve (simulated data). Furthermore, retrospective examination of 14 clinical fMRI examinations reveals that in patients where the standard statistical approach yields activation, there is a measurable gain in model performance in adopting the flexible statistical model, with little or no penalty in lost sensitivity. This indicates that a flexible model should be considered, particularly for clinical patients who may have difficulty complying fully with the study task.
Evaluation of PCR Systems for Field Screening of Bacillus anthracis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ozanich, Richard M.; Colburn, Heather A.; Victry, Kristin D.
There is little published data on the performance of hand-portable polymerase chain reaction (PCR) instruments that could be used by first responders to determine if a suspicious powder contains a potential biothreat agent. We evaluated five commercially available hand-portable PCR instruments for detection of Bacillus anthracis (Ba). We designed a cost-effective, statistically based test plan that allows instruments to be evaluated at performance levels ranging from 0.85-0.95 lower confidence bound (LCB) on the probability of detection (POD) at confidence levels of 80-95%. We assessed specificity using purified genomic DNA from 13 Ba strains and 18 Bacillus near neighbors, interference with 22 common hoax powders encountered in the field, and PCR inhibition when Ba spores were spiked into these powders. Our results indicated that three of the five instruments achieved >0.95 LCB on the POD with 95% confidence at test concentrations of 2,000 genome equivalents/mL (comparable to 2,000 spores/mL), displaying more than sufficient sensitivity for screening suspicious powders. These instruments exhibited no false positive results or PCR inhibition with common hoax powders, and reliably detected Ba spores spiked into common hoax powders, though some issues with instrument controls were observed. Our testing approach enables efficient instrument performance testing to a statistically rigorous and cost-effective test plan to generate performance data that will allow users to make informed decisions regarding the purchase and use of biodetection equipment in the field.
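A sketch of the exact binomial machinery behind such a test plan: a Clopper-Pearson one-sided lower confidence bound on POD, plus the number of all-pass trials needed to demonstrate a target POD at a given confidence. The numbers below are illustrative only; the paper's actual design and instrument results are richer.

```python
import math

def binom_sf(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
               for k in range(x, n + 1))

def pod_lcb(x, n, conf=0.95):
    """Exact one-sided lower confidence bound on the probability of
    detection after x detections in n trials (Clopper-Pearson): the p
    solving P(X >= x | p) = 1 - conf, found by bisection since the
    survival function is increasing in p."""
    if x == 0:
        return 0.0
    alpha = 1 - conf
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_sf(x, n, mid) < alpha:
            lo = mid
        else:
            hi = mid
    return lo

def runs_needed(pod_target=0.95, conf=0.95):
    """Smallest n with zero misses whose LCB meets pod_target:
    solves pod_target**n <= 1 - conf."""
    return math.ceil(math.log(1 - conf) / math.log(pod_target))

n = runs_needed(0.95, 0.95)   # consecutive detections required
lcb = pod_lcb(n, n)           # the resulting bound just clears 0.95
```

With zero misses the bound has the closed form (1 - conf)**(1/n), which the bisection reproduces and the test below checks.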
ERIC Educational Resources Information Center
Kim, Seonghoon; Feldt, Leonard S.
2010-01-01
The primary purpose of this study is to investigate the mathematical characteristics of the test reliability coefficient rho[subscript XX'] as a function of item response theory (IRT) parameters and present the lower and upper bounds of the coefficient. Another purpose is to examine relative performances of the IRT reliability statistics and two…
A statistical framework for evaluating neural networks to predict recurrent events in breast cancer
NASA Astrophysics Data System (ADS)
Gorunescu, Florin; Gorunescu, Marina; El-Darzi, Elia; Gorunescu, Smaranda
2010-07-01
Breast cancer is the second leading cause of cancer deaths in women today. Sometimes, breast cancer can return after primary treatment. A medical diagnosis of recurrent cancer is often a more challenging task than the initial one. In this paper, we investigate the potential contribution of neural networks (NNs) to support health professionals in diagnosing such events. The NN algorithms are tested and applied to two different datasets. An extensive statistical analysis has been performed to verify our experiments. The results show that a simple network structure for both the multi-layer perceptron and radial basis function can produce equally good results, not all attributes are needed to train these algorithms and, finally, the classification performances of all algorithms are statistically robust. Moreover, we have shown that the best performing algorithm will strongly depend on the features of the datasets, and hence, there is not necessarily a single best classifier.
Knowledge dimensions in hypothesis test problems
NASA Astrophysics Data System (ADS)
Krishnan, Saras; Idris, Noraini
2012-05-01
The reform of statistics education over the past two decades has predominantly shifted the focus of statistical teaching and learning from procedural understanding to conceptual understanding. The emphasis of procedural understanding is on formulas and calculation procedures. Meanwhile, conceptual understanding emphasizes students knowing why they are using a particular formula or executing a specific procedure. In addition, the Revised Bloom's Taxonomy offers a two-dimensional framework to describe learning objectives, comprising the six revised cognition levels of the original Bloom's taxonomy and four knowledge dimensions. Depending on the level of complexity, the four knowledge dimensions essentially distinguish basic understanding from more connected understanding. This study identifies the factual, procedural, and conceptual knowledge dimensions in hypothesis test problems. Hypothesis testing, an important tool for making inferences about a population from sample information, is taught in many introductory statistics courses. However, researchers find that students in these courses still have difficulty understanding the underlying concepts of hypothesis testing. Past studies also show that even though students can perform the hypothesis testing procedure, they may not understand the rationale for executing these steps or know how to apply them in novel contexts. Besides knowing the procedural steps in conducting a hypothesis test, students must have fundamental statistical knowledge and a deep understanding of the underlying inferential concepts, such as the sampling distribution and the central limit theorem. By identifying the knowledge dimensions of hypothesis test problems, this study can inform the development of suitable instructional and assessment strategies to enhance students' learning of hypothesis testing as a valuable inferential tool.
Statistical assessment of the learning curves of health technologies.
Ramsay, C R; Grant, A M; Wallace, S A; Garthwaite, P H; Monk, A F; Russell, I T
2001-01-01
(1) To describe systematically studies that directly assessed the learning curve effect of health technologies. (2) Systematically to identify 'novel' statistical techniques applied to learning curve data in other fields, such as psychology and manufacturing. (3) To test these statistical techniques in data sets from studies of varying designs to assess health technologies in which learning curve effects are known to exist. METHODS - STUDY SELECTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): For a study to be included, it had to include a formal analysis of the learning curve of a health technology using a graphical, tabular or statistical technique. METHODS - STUDY SELECTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): For a study to be included, it had to include a formal assessment of a learning curve using a statistical technique that had not been identified in the previous search. METHODS - DATA SOURCES: Six clinical and 16 non-clinical biomedical databases were searched. A limited amount of handsearching and scanning of reference lists was also undertaken. METHODS - DATA EXTRACTION (HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW): A number of study characteristics were abstracted from the papers such as study design, study size, number of operators and the statistical method used. METHODS - DATA EXTRACTION (NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH): The new statistical techniques identified were categorised into four subgroups of increasing complexity: exploratory data analysis; simple series data analysis; complex data structure analysis, generic techniques. METHODS - TESTING OF STATISTICAL METHODS: Some of the statistical methods identified in the systematic searches for single (simple) operator series data and for multiple (complex) operator series data were illustrated and explored using three data sets. 
The first was a case series of 190 consecutive laparoscopic fundoplication procedures performed by a single surgeon; the second was a case series of consecutive laparoscopic cholecystectomy procedures performed by ten surgeons; the third was randomised trial data derived from the laparoscopic procedure arm of a multicentre trial of groin hernia repair, supplemented by data from non-randomised operations performed during the trial. RESULTS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: Of 4571 abstracts identified, 272 (6%) were later included in the study after review of the full paper. Some 51% of studies assessed a surgical minimal access technique and 95% were case series. The statistical method used most often (60%) was splitting the data into consecutive parts (such as halves or thirds), with only 14% attempting a more formal statistical analysis. The reporting of the studies was poor, with 31% giving no details of data collection methods. RESULTS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: Of 9431 abstracts assessed, 115 (1%) were deemed appropriate for further investigation and, of these, 18 were included in the study. All of the methods for complex data sets were identified in the non-clinical literature. These were discriminant analysis, two-stage estimation of learning rates, generalised estimating equations, multilevel models, latent curve models, time series models and stochastic parameter models. In addition, eight new shapes of learning curves were identified. RESULTS - TESTING OF STATISTICAL METHODS: No one particular shape of learning curve performed significantly better than another. The performance of 'operation time' as a proxy for learning differed between the three procedures. Multilevel modelling using the laparoscopic cholecystectomy data demonstrated and measured surgeon-specific and confounding effects. 
The inclusion of non-randomised cases, despite the possible limitations of the method, enhanced the interpretation of learning effects. CONCLUSIONS - HEALTH TECHNOLOGY ASSESSMENT LITERATURE REVIEW: The statistical methods used for assessing learning effects in health technology assessment have been crude and the reporting of studies poor. CONCLUSIONS - NON-HEALTH TECHNOLOGY ASSESSMENT LITERATURE SEARCH: A number of statistical methods for assessing learning effects were identified that had not hitherto been used in health technology assessment. There was a hierarchy of methods for the identification and measurement of learning, and the more sophisticated methods for both have had little if any use in health technology assessment. This demonstrated the value of considering fields outside clinical research when addressing methodological issues in health technology assessment. CONCLUSIONS - TESTING OF STATISTICAL METHODS: It has been demonstrated that the portfolio of techniques identified can enhance investigations of learning curve effects. (ABSTRACT TRUNCATED)
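Two of the simpler techniques discussed, splitting the series into consecutive parts and fitting an explicit learning-curve shape, can be sketched as follows. The power-law form and the synthetic operation times are illustrative assumptions, not data from the review.

```python
import math

def fit_power_learning_curve(times):
    """Exploratory fit of a classic power-law learning curve
    T_i = a * i**b (b < 0 means operation time falls with experience),
    by ordinary least squares on log(T) vs log(case number)."""
    xs = [math.log(i + 1) for i in range(len(times))]
    ys = [math.log(t) for t in times]
    n = len(times)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = math.exp(my - b * mx)
    return a, b

def split_halves(times):
    """The crude method most studies in the review used: compare mean
    operation time in the first vs the second half of the series."""
    h = len(times) // 2
    return sum(times[:h]) / h, sum(times[h:]) / (len(times) - h)

# Synthetic single-surgeon series: 100 operation times following
# T_i = 120 * i**-0.2 (hypothetical minutes, noise-free for clarity)
times = [120 * (i + 1) ** -0.2 for i in range(100)]
a, b = fit_power_learning_curve(times)
```

On noise-free data the fit recovers the generating parameters exactly; with real series, the review's point is that such single-operator fits should give way to multilevel models that separate surgeon-specific effects.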
Chabirand, Aude; Loiseau, Marianne; Renaudin, Isabelle; Poliakoff, Françoise
2017-01-01
A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of “Flavescence dorée” (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study where each laboratory had to analyze the same panel of samples consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. Firstly, there was the standard statistical approach consisting in analyzing samples which are known to be positive and samples which are known to be negative and reporting the proportion of false-positive and false-negative results to respectively calculate diagnostic specificity and sensitivity. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods based on the notions of accordance and concordance. Other new approaches were also implemented, based, on the one hand, on the probability of detection model, and, on the other hand, on Bayes’ theorem. These various statistical approaches are complementary and give consistent results. Their combination, and in particular, the introduction of new statistical approaches give overall information on the performance and limitations of the different methods, and are particularly useful for selecting the most appropriate detection scheme with regards to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6 respectively developed by Hren (2007), Pelletier (2009) and under patent oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. 
The statistical tools presented in this paper and their combination can be applied to many other studies concerning plant pathogens and other disciplines that use qualitative detection methods. PMID:28384335
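The standard approach described first (diagnostic sensitivity and specificity from known-positive and known-negative panels) and its Bayes'-theorem extension to post-test probabilities can be sketched in a few lines. The counts and the 10% prevalence below are hypothetical, not results from the FD study.

```python
def diagnostic_performance(tp, fn, tn, fp):
    """Diagnostic sensitivity and specificity from a validation panel of
    known-positive and known-negative samples."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return sens, spec

def predictive_values(sens, spec, prevalence):
    """Bayes' theorem: post-test probabilities given the pre-test
    prevalence of the pathogen -- the quantity that matters when choosing
    a detection scheme for a given epidemiological situation."""
    ppv = (sens * prevalence
           / (sens * prevalence + (1 - spec) * (1 - prevalence)))
    npv = (spec * (1 - prevalence)
           / (spec * (1 - prevalence) + (1 - sens) * prevalence))
    return ppv, npv

# Hypothetical panel: 100 known positives, 100 known negatives
sens, spec = diagnostic_performance(tp=95, fn=5, tn=98, fp=2)
ppv, npv = predictive_values(sens, spec, prevalence=0.10)
```

Even with 95% sensitivity and 98% specificity, at 10% prevalence roughly one positive call in six is false, which is why the abstract stresses matching the detection scheme to prevalence.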
Risk-based Methodology for Validation of Pharmaceutical Batch Processes.
Wiles, Frederick
2013-01-01
In January 2011, the U.S. Food and Drug Administration published new process validation guidance for pharmaceutical processes. The new guidance debunks the long-held industry notion that three consecutive validation batches or runs are all that are required to demonstrate that a process is operating in a validated state. Instead, the new guidance now emphasizes that the level of monitoring and testing performed during process performance qualification (PPQ) studies must be sufficient to demonstrate statistical confidence both within and between batches. In some cases, three qualification runs may not be enough. Nearly two years after the guidance was first published, little has been written defining a statistical methodology for determining the number of samples and qualification runs required to satisfy Stage 2 requirements of the new guidance. This article proposes using a combination of risk assessment, control charting, and capability statistics to define the monitoring and testing scheme required to show that a pharmaceutical batch process is operating in a validated state. In this methodology, an assessment of process risk is performed through application of a process failure mode, effects, and criticality analysis (PFMECA). The output of PFMECA is used to select appropriate levels of statistical confidence and coverage which, in turn, are used in capability calculations to determine when significant Stage 2 (PPQ) milestones have been met. The achievement of Stage 2 milestones signals the release of batches for commercial distribution and the reduction of monitoring and testing to commercial production levels. Individuals, moving range, and range/sigma charts are used in conjunction with capability statistics to demonstrate that the commercial process is operating in a state of statistical control. The new process validation guidance published by the U.S. 
Food and Drug Administration in January of 2011 indicates that the number of process validation batches or runs required to demonstrate that a pharmaceutical process is operating in a validated state should be based on sound statistical principles. The old rule of "three consecutive batches and you're done" is no longer sufficient. The guidance, however, does not provide any specific methodology for determining the number of runs required, and little has been published to augment this shortcoming. The paper titled "Risk-based Methodology for Validation of Pharmaceutical Batch Processes" describes a statistically sound methodology for determining when a statistically valid number of validation runs has been acquired based on risk assessment and calculation of process capability.
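A minimal sketch of the capability calculations such a methodology relies on, using an individuals/moving-range (I-MR) sigma estimate and the process performance index. The assay values and the acceptance threshold are hypothetical illustrations, not values from the article.

```python
import statistics

def moving_range_sigma(xs):
    """Short-term sigma estimated from the average moving range, as used
    with individuals / moving-range (I-MR) control charts
    (d2 = 1.128 for subgroups of size 2)."""
    mrs = [abs(b - a) for a, b in zip(xs, xs[1:])]
    return (sum(mrs) / len(mrs)) / 1.128

def ppk(xs, lsl, usl):
    """Process performance index from the overall standard deviation;
    Ppk >= 1.33 is a common (illustrative) acceptance threshold for
    concluding a batch process performs within specification."""
    mu = statistics.fmean(xs)
    s = statistics.stdev(xs)
    return min(usl - mu, mu - lsl) / (3 * s)

# Hypothetical assay results (% of label claim) pooled across PPQ batches,
# with specification limits of 98.0-102.0
assay = [99.8, 100.1, 99.9, 100.2, 100.0, 99.7, 100.3, 100.1, 99.9, 100.0]
index = ppk(assay, lsl=98.0, usl=102.0)
```

In the risk-based scheme, higher-criticality attributes from the PFMECA would demand tighter confidence/coverage, i.e. a higher capability hurdle or more PPQ runs before release.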
Engineering evaluation of SSME dynamic data from engine tests and SSV flights
NASA Technical Reports Server (NTRS)
1986-01-01
An engineering evaluation of dynamic data from SSME hot firing tests and SSV flights is summarized. The basic objective of the study is to provide analyses of vibration, strain and dynamic pressure measurements in support of MSFC performance and reliability improvement programs. A brief description of the SSME test program is given and a typical test evaluation cycle reviewed. Data banks generated to characterize SSME component dynamic characteristics are described and statistical analyses performed on these data base measurements are discussed. Analytical models applied to define the dynamic behavior of SSME components (such as turbopump bearing elements and the flight accelerometer safety cut-off system) are also summarized. Appendices are included to illustrate some typical tasks performed under this study.
Moshtagh-Khorasani, Majid; Akbarzadeh-T, Mohammad-R; Jahangiri, Nader; Khoobdel, Mehdi
2009-01-01
BACKGROUND: Aphasia diagnosis is particularly challenging due to the linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, large number of measurements with imprecision, natural diversity and subjectivity in test objects as well as in opinions of experts who diagnose the disease. METHODS: Fuzzy probability is proposed here as the basic framework for handling the uncertainties in medical diagnosis and particularly aphasia diagnosis. To efficiently construct this fuzzy probabilistic mapping, statistical analysis is performed that constructs input membership functions as well as determines an effective set of input features. RESULTS: Considering the high sensitivity of performance measures to different distribution of testing/training sets, a statistical t-test of significance is applied to compare fuzzy approach results with NN results as well as author's earlier work using fuzzy logic. The proposed fuzzy probability estimator approach clearly provides better diagnosis for both classes of data sets. Specifically, for the first and second type of fuzzy probability classifiers, i.e. spontaneous speech and comprehensive model, P-values are 2.24E-08 and 0.0059, respectively, strongly rejecting the null hypothesis. CONCLUSIONS: The technique is applied and compared on both comprehensive and spontaneous speech test data for diagnosis of four Aphasia types: Anomic, Broca, Global and Wernicke. Statistical analysis confirms that the proposed approach can significantly improve accuracy using fewer Aphasia features. PMID:21772867
Zhang, Yi; Chen, Lihan
2016-01-01
Recent studies of brain plasticity that pertain to time perception have shown that fast training of temporal discrimination in one modality, for example, the auditory modality, can improve performance of temporal discrimination in another modality, such as the visual modality. Here we examined whether the perception of visual Ternus motion could be recalibrated through fast crossmodal statistical binding of temporal information and stimulus properties. We conducted two experiments, composed of three sessions each: pre-test, learning, and post-test. In both the pre-test and the post-test, participants classified the Ternus display as either “element motion” or “group motion.” For the training session in Experiment 1, we constructed two types of temporal structures, in which two consecutively presented sound beeps were dominantly (80%) flanked by one leading and one lagging visual Ternus frame (VAAV) or dominantly had two visual Ternus frames inserted between them (AVVA). Participants were required to respond which interval (auditory vs. visual) was longer. In Experiment 2, we presented only a single auditory–visual pair, but with temporal configurations similar to those in Experiment 1, and asked participants to perform an audio–visual temporal order judgment. The results of these two experiments support that statistical binding of temporal information and stimulus properties can quickly and selectively recalibrate the sensitivity of perceiving visual motion, according to the protocols of the specific bindings. PMID:27065910
Maurício, Sílvia Fernandes; da Silva, Jacqueline Braga; Bering, Tatiana; Correia, Maria Isabel Toulson Davisson
2013-04-01
This study assessed the association between nutritional status and inflammation in patients with colorectal cancer and verified their association with complications during anticancer treatment. The agreement between the Subjective Global Assessment (SGA) and different nutritional assessment methods was also evaluated. A cross-sectional, prospective, and descriptive study was performed. Nutritional status was defined by the SGA and the severity of inflammation by the Glasgow Prognostic Score (GPS). Complications were classified using the Common Toxicity Criteria, version 3. Anthropometric measurements such as body mass index, triceps skinfold, midarm circumference, midarm muscle area, and adductor pollicis muscle thickness were taken, as were handgrip strength and phase angle. The chi-square test, Fisher exact test, Spearman correlation coefficient, independent t test, analysis of variance, Gabriel test, and κ index were used for the statistical analysis. P < 0.05 was considered statistically significant. Seventy patients with colorectal cancer (60.4 ± 14.3 y old) were included. Nutritional status according to the SGA was associated with the GPS (P < 0.05), but the SGA and GPS were not related to the presence of complications. When the different nutritional assessment methods were compared with the SGA, there were statistically significant differences. Malnutrition is highly prevalent in patients with colorectal cancer, and nutritional status was associated with the GPS. Copyright © 2013 Elsevier Inc. All rights reserved.
Hemodynamic monitoring of middle cerebral arteries during cognitive tasks performance.
Boban, Marina; Črnac, Petra; Junaković, Anamari; Malojčić, Branko
2014-11-01
The aim of this study was to obtain the temporal pattern and hemispheric dominance of blood flow velocity (BFV) changes and to assess the suitability of different cognitive tasks for monitoring BFV changes in the middle cerebral arteries (MCA). BFV was recorded simultaneously in both MCA during performance of the phonemic verbal fluency test (pVFT), Trail Making Tests A and B (TMTA and TMTB) and Stroop tests in 14 healthy, right-handed volunteers aged 20-26 years. A significant increase of BFV in both MCA was obtained during performance of all cognitive tasks. Statistically significant lateralization was found during performance of the Stroop test with incongruent stimuli, while the TMTB was found to have the best activation potential for the MCA. Our findings identify the TMTB as the most suitable cognitive test for monitoring of BFV in the MCA. © 2014 The Authors. Psychiatry and Clinical Neurosciences © 2014 Japanese Society of Psychiatry and Neurology.
Ryu, Ehri; Cheong, Jeewon
2017-01-01
In this article, we evaluated the performance of statistical methods in single-group and multi-group analysis approaches for testing group difference in indirect effects and for testing simple indirect effects in each group. We also investigated whether the performance of the methods in the single-group approach was affected when the assumption of equal variance was not satisfied. The assumption was critical for the performance of the two methods in the single-group analysis: the method using a product term for testing the group difference in a single path coefficient, and the Wald test for testing the group difference in the indirect effect. Bootstrap confidence intervals in the single-group approach and all methods in the multi-group approach were not affected by the violation of the assumption. We compared the performance of the methods and provided recommendations. PMID:28553248
LOGICAL REASONING ABILITY AND STUDENT PERFORMANCE IN GENERAL CHEMISTRY.
Bird, Lillian
2010-03-01
Logical reasoning skills of students enrolled in General Chemistry at the University of Puerto Rico in Río Piedras were measured using the Group Assessment of Logical Thinking (GALT) test. The results were used to determine the students' cognitive level (concrete, transitional, formal) as well as their level of performance by logical reasoning mode (mass/volume conservation, proportional reasoning, correlational reasoning, experimental variable control, probabilistic reasoning and combinatorial reasoning). This information was used to identify particular deficiencies and gender effects, and to determine which logical reasoning modes were the best predictors of student performance in the general chemistry course. Statistical tests to analyze the relation between (a) operational level and final grade in both semesters of the course; (b) GALT test results and performance in the ACS General Chemistry Examination; and (c) operational level and student approach (algorithmic or conceptual) towards a test question that may be answered correctly using either strategy, were also performed.
Lockwood, Alan H; Weissenborn, Karin; Bokemeyer, Martin; Tietge, U; Burchert, Wolfgang
2002-03-01
Many cirrhotics have abnormal neuropsychological test scores. To define the anatomical-physiological basis for encephalopathy in nonalcoholic cirrhotics, we performed resting-state fluorodeoxyglucose positron emission tomographic scans and administered a neuropsychological test battery to 18 patients and 10 controls. Statistical parametric mapping correlated changes in regional glucose metabolism with performance on the individual tests and a composite battery score. In patients without overt encephalopathy, poor performance correlated with reductions in metabolism in the anterior cingulate. In all patients, poor performance on the battery was positively correlated (p < 0.001) with glucose metabolism in bifrontal and biparietal regions of the cerebral cortex and negatively correlated with metabolism in hippocampal, lingual, and fusiform gyri and the posterior putamen. Similar patterns of abnormal metabolism were found when comparing the patients to 10 controls. Metabolic abnormalities in the anterior attention system and association cortices mediating executive and integrative function form the pathophysiological basis for mild hepatic encephalopathy.
Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie
2016-01-01
Quality of reporting for Randomized Clinical Trials (RCTs) in oncology has been analyzed in several systematic reviews, but in this setting there is a paucity of data on outcome definitions and on consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe these two reporting aspects for OBS and RCTs in oncology. From a list of 19 medical journals, three were retained for analysis after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. The primary outcome was to assess quality of reporting for the description of the primary outcome measure in RCTs and of the variables of interest in OBS. A logistic regression was performed to identify study covariates potentially associated with concordance of tests between the Methods and Results sections. 826 studies were included in the review, of which 698 were OBS. Variables were described in the Methods section for all OBS, and the primary endpoint was clearly detailed in the Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for the reported statistical test between the Methods and Results sections. In multivariable analysis, the variable "number of included patients in study" was associated with test consistency: the adjusted odds ratio (aOR) for the third group compared to the first was aOR Grp3 = 0.52 [0.31-0.89] (P value = 0.009). Variables in OBS and the primary endpoint in RCTs are reported and described with high frequency. However, consistency of statistical tests between the Methods and Results sections of OBS is not always observed. Therefore, we encourage authors and peer reviewers to verify consistency of statistical tests in oncology studies.
Acceptability of HIV/AIDS testing among pre-marital couples in Iran (2012)
Ayatollahi, Jamshid; Nasab Sarab, Mohammad Ali Bagheri; Sharifi, Mohammad Reza; Shahcheraghi, Seyed Hossein
2014-01-01
Background: Human immunodeficiency virus (HIV)/acquired immune deficiency syndrome (AIDS) is a lifestyle-related disease. This disease is transmitted through unprotected sex, contaminated needles, infected blood transfusion and from mother to child during pregnancy and delivery. Prevention of infection with HIV, mainly through safe sex and needle exchange programmes, is a solution to prevent the spread of the disease. Knowledge of HIV status helps to prevent and subsequently reduce the harm to the later generation. The purpose of this study was to assess the willingness rate of couples referred to the family regulation pre-marital counselling centre to perform an HIV test before marriage in Yazd. Patients and Methods: In this descriptive study, simple random sampling was done among people referred to the Akbari clinic. The couples were 1000 men and 1000 women referred to the premarital counselling centre for pre-marital HIV testing in Yazd in the year 2012. They were in situations of pregnancy, delivery, or nursing and lactation. The data were analyzed using Statistical Package for the Social Sciences (SPSS) software and the chi-square statistical test. Results: There was a significant statistical difference between the age groups regarding willingness for HIV testing before marriage (P < 0.001) and also regarding positive opinions about HIV testing in asymptomatic individuals (P < 0.001). This study also showed a significant statistical difference between the two gender groups regarding willingness to marry after an HIV positive test of their wives. Conclusion: The willingness rate of couples to undergo HIV testing before marriage was significant. Therefore, HIV testing before marriage as a routine test was suggested. PMID:25114363
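The chi-square test used above to compare willingness across age groups is a standard test of independence on a contingency table. A minimal sketch follows; the counts are hypothetical, not the study's data.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c contingency table."""
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    total = sum(rows)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = rows[i] * cols[j] / total  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    df = (len(rows) - 1) * (len(cols) - 1)
    return chi2, df

# Hypothetical 2x2 table: age group (younger/older) vs. willing to test (yes/no)
chi2, df = chi_square_independence([[30, 10], [20, 40]])
```

The statistic is then compared against the chi-square distribution with `df` degrees of freedom to obtain a P-value.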
Statistical alignment: computational properties, homology testing and goodness-of-fit.
Hein, J; Wiuf, C; Knudsen, B; Møller, M B; Wibling, G
2000-09-08
The model of insertions and deletions in biological sequences, first formulated by Thorne, Kishino, and Felsenstein in 1991 (the TKF91 model), provides a basis for performing alignment within a statistical framework. Here we investigate this model. Firstly, we show how to accelerate the statistical alignment algorithms by several orders of magnitude. The main innovations are to confine likelihood calculations to a band close to the similarity-based alignment, to get good initial guesses of the evolutionary parameters, and to apply an efficient numerical optimisation algorithm for finding the maximum likelihood estimate. In addition, the recursions originally presented by Thorne, Kishino and Felsenstein can be simplified. Two proteins, about 1500 amino acids long, can be analysed with this method in less than five seconds on a fast desktop computer, which makes this method practical for actual data analysis. Secondly, we propose a new homology test based on this model, where homology means that an ancestor to a sequence pair can be found finitely far back in time. This test has statistical advantages relative to the traditional shuffle test for proteins. Finally, we describe a goodness-of-fit test that allows testing of the proposed insertion-deletion (indel) process inherent to this model; we find that real sequences (here globins) probably experience indels longer than one, contrary to what is assumed by the model. Copyright 2000 Academic Press.
Daikoku, Tatsuya; Takahashi, Yuji; Futagami, Hiroko; Tarumoto, Nagayoshi; Yasuda, Hideki
2017-02-01
In real-world auditory environments, humans are exposed to overlapping auditory information, such as human voices and musical instruments, even during routine physical activities such as walking and cycling. The present study investigated how concurrent physical exercise affects incidental and intentional learning of overlapping auditory streams, and whether physical fitness modulates learning performance. Participants were divided into lower- and higher-fitness groups of 11 each, based on their VO2max values. They were presented with simultaneous auditory sequences, each with a distinct statistical regularity (i.e. statistical learning), while pedaling on a bike or while seated on the bike at rest. In Experiment 1, they were instructed to attend to one of the two sequences and to ignore the other. In Experiment 2, they were instructed to attend to both sequences. After exposure to the sequences, learning effects were evaluated by a familiarity test. In Experiment 1, performance of statistical learning of the ignored sequences during concurrent pedaling could be higher in participants with high than with low physical fitness, whereas for the attended sequence there was no significant difference in statistical learning performance between high and low physical fitness. Furthermore, there was no significant effect of physical fitness on learning while resting. In Experiment 2, participants with both high and low physical fitness could perform intentional statistical learning of the two simultaneous sequences in both the exercise and rest sessions. Improvement in physical fitness might facilitate incidental, but not intentional, statistical learning of simultaneous auditory sequences during concurrent physical exercise.
Eskildsen, Anita; Andersen, Lars Peter; Pedersen, Anders Degn; Vandborg, Sanne Kjær; Andersen, Johan Hviid
2015-01-01
Patients on sick leave due to work-related stress often complain about impaired concentration and memory. However, it is undetermined how widespread these impairments are, and which cognitive domains are most long-term stress sensitive. Previous studies show inconsistent results and are difficult to synthesize. The primary aim of this study was to examine whether patients with work-related stress complaints have cognitive impairments compared to a matched control group without stress. Our secondary aim was to examine whether the level of self-reported perceived stress is associated with neuropsychological test performance. We used a broad neuropsychological test battery to assess 59 outpatients with work-related stress complaints (without major depression) and 59 healthy controls. We matched the patients and controls pairwise by sex, age and educational level. Compared to controls, patients generally showed mildly reduced performance across all the measured domains of the neuropsychological test battery. However, only three comparisons reached statistical significance (p < 0.05). Effect sizes (Cohen's d) were generally small to medium. The most pronounced differences between patients and controls were seen on tests of prospective memory, speed and complex working memory. There were no statistical significant associations between self-reported perceived stress level and neuropsychological test performance. In conclusion, we recommend that cognitive functions should be considered when evaluating patients with work-related stress complaints, especially when given advice regarding return to work. Since this study had a cross-sectional design, it is still uncertain whether the impairments are permanent. Further study is required to establish causal links between work-related stress and cognitive deficits.
A Powerful Test for Comparing Multiple Regression Functions.
Maity, Arnab
2012-09-01
In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on empirical distributions of the errors in each population j = 1, … , J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test to other nonparametric regression setups, e.g., nonparametric logistic regression, where the loglikelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).
Cardot, J-M; Roudier, B; Schütz, H
2017-07-01
The f2 test is generally used for comparing dissolution profiles. In cases of high variability, the f2 test is not applicable, and the Multivariate Statistical Distance (MSD) test is frequently proposed as an alternative by the FDA and EMA. The guidelines provide only general recommendations. MSD tests can be performed either on raw data with or without time as a variable or on parameters of models. In addition, data can be limited, as in the case of the f2 test, to dissolutions of up to 85%, or all available data can be used. In the context of the present paper, the recommended calculation included all raw dissolution data up to the first point greater than 85% as variables, without the various times as parameters. The proposed MSD overcomes several drawbacks found in other methods.
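The f2 similarity factor referenced above has a standard closed form: 50·log10 of 100 divided by the root of one plus the mean squared difference between profiles. A minimal sketch, with hypothetical dissolution profiles:

```python
import math

def f2(reference, test):
    """f2 similarity factor for two dissolution profiles (% dissolved at matched time points)."""
    n = len(reference)
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / n  # mean squared difference
    return 50 * math.log10(100 / math.sqrt(1 + msd))

# Hypothetical % dissolved at matched time points; f2 >= 50 is conventionally "similar"
ref  = [40, 70, 95]
tst  = [35, 65, 90]
similar = f2(ref, tst)       # a constant 5% offset between profiles
identical = f2(ref, ref)     # coinciding profiles give exactly 100
```

Note that f2 ranges up to 100 for identical profiles and shrinks as the profiles diverge, which is why high point-to-point variability makes it unreliable.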
Alemu, Sisay Mulugeta; Habtewold, Tesfa Dejenie; Haile, Yohannes Gebreegziabhere
2017-01-01
Globally, 3 to 8% of reproductive-age women suffer from premenstrual dysphoric disorder (PMDD). Several mental and reproductive health-related factors cause low academic achievement during university education. However, limited data exist in Ethiopia. The aim of the study was to investigate mental and reproductive health correlates of academic performance. An institution-based cross-sectional study was conducted with 667 Debre Berhan University female students from April to June 2015. Academic performance was the outcome variable. Mental and reproductive health characteristics were explanatory variables. A two-way analysis of variance (ANOVA) test of association was applied to examine group differences in academic performance. Among the 529 students who participated, 49.3% reported mild premenstrual syndrome (PMS), 36.9% reported moderate/severe PMS, and 13.8% fulfilled PMDD diagnostic criteria. The ANOVA test of association revealed no significant difference in academic performance between students with different levels of PMS experience (F-statistic = 0.08, p-value = 0.93). Nevertheless, there was a significant difference in academic performance between students with different lengths of menses (F-statistic = 5.15, p-value = 0.006). There was no significant association between PMS experience and academic performance, but the length of menses was significantly associated with academic performance.
Effects of Presentation Mode and Computer Familiarity on Summarization of Extended Texts
ERIC Educational Resources Information Center
Yu, Guoxing
2010-01-01
Comparability studies on computer- and paper-based reading tests have focused on short texts and selected-response items via almost exclusively statistical modeling of test performance. The psychological effects of presentation mode and computer familiarity on individual students are under-researched. In this study, 157 students read extended…
Planes, Politics and Oral Proficiency: Testing International Air Traffic Controllers
ERIC Educational Resources Information Center
Moder, Carol Lynn; Halleck, Gene B.
2009-01-01
This study investigates the variation in oral proficiency demonstrated by 14 Air Traffic Controllers across two types of testing tasks: work-related radio telephony-based tasks and non-specific English tasks on aviation topics. Their performance was compared statistically in terms of level ratings on the International Civil Aviation Organization…
The Influence of Ability Grouping on Math Achievement in a Rural Middle School
ERIC Educational Resources Information Center
Pritchard, Robert R.
2012-01-01
The researcher examined the academic performance of low-tracked students (n = 156) using standardized math test scores to determine whether there is a statistically significant difference in achievement depending on academic environment, tracked or nontracked. An analysis of variance (ANOVA) was calculated, using a paired samples t-test for a…
Measurements in quantitative research: how to select and report on research instruments.
Hagan, Teresa L
2014-07-01
Measures exist to numerically represent degrees of attributes. Quantitative research is based on measurement and is conducted in a systematic, controlled manner. These measures enable researchers to perform statistical tests, analyze differences between groups, and determine the effectiveness of treatments. If something is not measurable, it cannot be tested.
Welding of AM350 and AM355 steel
NASA Technical Reports Server (NTRS)
Davis, R. J.; Wroth, R. S.
1967-01-01
A series of tests was conducted to establish optimum procedures for TIG welding and heat treating of AM350 and AM355 steel sheet in thicknesses ranging from 0.010 inch to 0.125 inch. Statistical analysis of the test data was performed to determine the anticipated minimum strength of the welded joints.
Use of the Analysis of the Volatile Faecal Metabolome in Screening for Colorectal Cancer
2015-01-01
Diagnosis of colorectal cancer requires an invasive and expensive colonoscopy, which is usually carried out after a positive screening test. Unfortunately, existing screening tests lack specificity and sensitivity, hence many unnecessary colonoscopies are performed. Here we report on a potential new screening test for colorectal cancer based on the analysis of volatile organic compounds (VOCs) in the headspace of faecal samples. Faecal samples were obtained from subjects who had a positive faecal occult blood test (FOBT). Subjects subsequently had colonoscopies performed to classify them into low risk (non-cancer) and high risk (colorectal cancer) groups. Volatile organic compounds were analysed by selected ion flow tube mass spectrometry (SIFT-MS) and then data were analysed using both univariate and multivariate statistical methods. Ions most likely from hydrogen sulphide, dimethyl sulphide and dimethyl disulphide are statistically significantly higher in samples from high risk rather than low risk subjects. Results using multivariate methods show that the test gives a correct classification of 75% with 78% specificity and 72% sensitivity on FOBT positive samples, offering a potentially effective alternative to FOBT. PMID:26086914
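The sensitivity and specificity figures quoted above come directly from the classifier's confusion counts. A minimal sketch; the counts below are hypothetical values chosen for illustration, not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical confusion counts for a screening classifier
# (18 cancers caught, 7 missed; 39 non-cancers cleared, 11 flagged)
sens, spec = sensitivity_specificity(tp=18, fn=7, tn=39, fp=11)
```

Sensitivity governs how many cancers the screen catches; specificity governs how many unnecessary colonoscopies it avoids.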
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sen, Satyabrata; Rao, Nageswara S; Wu, Qishi
There have been increasingly large deployments of radiation detection networks that require computationally fast algorithms to produce prompt results over ad-hoc sub-networks of mobile devices, such as smartphones. These algorithms are in sharp contrast to complex network algorithms that require all measurements to be sent to powerful central servers. In this work, at individual sensors, we employ Wald-statistic-based detection algorithms, which are computationally very fast and are implemented as one of three Z-tests and four chi-square tests. At the fusion center, we apply K-out-of-N fusion to combine the sensors' hard decisions. We characterize the performance of the detection methods by deriving analytical expressions for the distributions of the underlying test statistics, and by analyzing the fusion performance in terms of K, N, and the false-alarm rates of the individual detectors. We experimentally validate our methods using measurements from indoor and outdoor characterization tests of the Intelligence Radiation Sensors Systems (IRSS) program. In particular, utilizing the outdoor measurements, we construct two important real-life scenarios, boundary surveillance and portal monitoring, and present the results of our algorithms.
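For independent detectors with a common per-detector rate, the K-out-of-N fusion rule described above has a closed-form binomial tail probability. A minimal sketch; the rate and network size below are illustrative assumptions, not values from the study.

```python
from math import comb

def k_out_of_n(p, n, k):
    """Probability that at least k of n independent detectors fire,
    given per-detector firing probability p (false-alarm or detection rate)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical per-sensor false-alarm rate 0.1 over a network of 5 sensors
or_rule  = k_out_of_n(0.1, 5, 1)   # K=1 "OR" fusion: equals 1 - 0.9**5
and_rule = k_out_of_n(0.1, 5, 5)   # K=N "AND" fusion: equals 0.1**5
```

Raising K trades fewer system-level false alarms against a lower system-level detection probability, which is why K is tuned jointly with the individual detector thresholds.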
Siddiqi, Ariba; Arjunan, Sridhar P; Kumar, Dinesh K
2016-08-01
Age-associated changes in the surface electromyogram (sEMG) of the Tibialis Anterior (TA) muscle can be attributed to neuromuscular alterations that precede strength loss. We used our sEMG model of the Tibialis Anterior to interpret the age-related changes and compared them with the experimental sEMG. Eighteen young (20-30 years) and 18 older (60-85 years) participants performed isometric dorsiflexion at 6 different percentage levels of maximum voluntary contraction (MVC), and their sEMG from the TA muscle was recorded. Six different age-related changes in the neuromuscular system were simulated using the sEMG model at the same MVCs as the experiment. The maximal power of the spectrum and the Gaussianity and Linearity test statistics were computed from the simulated and experimental sEMG. A correlation analysis at α = 0.05 was performed between the simulated and experimental age-related changes in the sEMG features. The results show that loss of motor units was distinguished by the Gaussianity and Linearity test statistics, while the maximal power of the PSD distinguished between the muscular factors. The simulated condition of 40% loss of motor units with half the number of fast fibers best correlated with the age-related change observed in the higher-order statistical features of the experimental sEMG. The simulated aging condition found by this study corresponds with the moderate motor unit remodelling and negligible strength loss reported in the literature for cohorts aged 60-70 years.
Modeling the Test-Retest Statistics of a Localization Experiment in the Full Horizontal Plane.
Morsnowski, André; Maune, Steffen
2016-10-01
Two approaches to modeling the test-retest statistics of a localization experiment, based on a Gaussian distribution and on surrogate data, are introduced. Their efficiency is investigated using different measures describing directional hearing ability. A localization experiment in the full horizontal plane is a challenging task for hearing-impaired patients. In clinical routine, we use this experiment to evaluate the progress of our cochlear implant (CI) recipients. Listening and time effort limit the reproducibility. The localization experiment consists of a circle of 12 loudspeakers placed in an anechoic room, a "camera silens". In darkness, HSM sentences are presented at 65 dB pseudo-erratically from all 12 directions with five repetitions. This experiment is modeled by a set of Gaussian distributions with different standard deviations added to a perfect estimator, as well as by surrogate data. Five repetitions per direction are used to produce surrogate data distributions for the sensation directions. To investigate the statistics, we retrospectively use the data of 33 CI patients with 92 pairs of test-retest measurements from the same day. The first model does not take inversions into account (i.e., permutations of the direction from back to front and vice versa are not considered), although they are common for hearing-impaired persons, particularly in the rear hemisphere. The second model considers these inversions but does not work with all measures. The introduced models successfully describe the test-retest statistics of directional hearing. However, since their applications to the investigated measures perform differently, no general recommendation can be provided. The presented test-retest statistics enable pair test comparisons for localization experiments.
Ramadas, Gisela C V; Rocha, Ana Maria A C; Fernandes, Edite M G P
2015-01-01
This paper addresses the challenging task of computing multiple roots of a system of nonlinear equations. A repulsion algorithm that invokes the Nelder-Mead (N-M) local search method and uses a penalty-type merit function based on the error function, known as 'erf', is presented. In the N-M algorithm context, different strategies are proposed to enhance the quality of the solutions and improve the overall efficiency. The main goal of this paper is to use a two-level factorial design of experiments to analyze the statistical significance of the observed differences in selected performance criteria produced when testing different strategies in the N-M based repulsion algorithm.
Kuo, Yi-Liang; Huang, Kuo-Yuan; Chiang, Pei-Tzu; Lee, Pei-Yun; Tsai, Yi-Ju
2015-01-01
The aims of this study were to compare the steadiness index of spinal regions during single-leg standing in older adults with and without chronic low back pain (LBP) and to correlate measurements of the steadiness index with performance on clinical balance tests. Thirteen community-dwelling older adults (aged 55 years or above) with chronic LBP and 13 age- and gender-matched asymptomatic volunteers participated in this study. Data collection was conducted in a university research laboratory. Measurements were the steadiness index of spinal regions (trunk, thoracic spine, lumbar spine, and pelvis) during single-leg standing, including relative holding time (RHT) and relative standstill time (RST), and clinical balance tests (timed up and go test and 5-repetition sit to stand test). The LBP group had a statistically significantly smaller RHT than the control group, regardless of whether they stood on the painful or non-painful side. The RSTs on the painful-side leg in the LBP group were not statistically significantly different from the average RSTs of both legs in the control group; however, the RSTs on the non-painful-side leg in the LBP group were statistically significantly smaller than those in the control group for the trunk, thoracic spine, and lumbar spine. No statistically significant intra-group differences were found in the RHTs and RSTs between the painful- and non-painful-side legs in the LBP group. Measurements of clinical balance tests also showed statistically non-significant, weak-to-moderate correlations with the steadiness index. In conclusion, older adults with chronic LBP demonstrated decreased spinal steadiness not only in the symptomatic lumbar spine but also in the other spinal regions within the kinetic chain of the spine. When treating older adults with chronic LBP, clinicians may also need to examine their balance performance and spinal steadiness during balance-challenging tests. PMID:26024534
Robust Lee local statistic filter for removal of mixed multiplicative and impulse noise
NASA Astrophysics Data System (ADS)
Ponomarenko, Nikolay N.; Lukin, Vladimir V.; Egiazarian, Karen O.; Astola, Jaakko T.
2004-05-01
A robust version of the Lee local statistic filter, able to effectively suppress mixed multiplicative and impulse noise in images, is proposed. The performance of the proposed modification is studied for a set of test images, several values of multiplicative noise variance, Gaussian and Rayleigh probability density functions of speckle, and different characteristics of impulse noise. The advantages of the designed filter in comparison to the conventional Lee local statistic filter and some other filters able to cope with mixed multiplicative+impulse noise are demonstrated.
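The conventional Lee local statistic filter that the proposed method modifies can be sketched as follows; this is a minimal illustration of the local-statistics gain computation for multiplicative noise, with placeholder window size and noise variance, and it does not reproduce the paper's robust impulse-rejection modification:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def lee_filter(img: np.ndarray, win: int = 7, noise_var: float = 0.01) -> np.ndarray:
    """Conventional Lee local-statistic filter for multiplicative noise.

    noise_var is the relative variance of the multiplicative noise;
    win is the side of the square local window.
    """
    local_mean = uniform_filter(img, win)
    local_sq_mean = uniform_filter(img * img, win)
    local_var = np.maximum(local_sq_mean - local_mean**2, 0.0)
    # Signal variance implied by the multiplicative-noise model
    signal_var = np.maximum(
        (local_var - noise_var * local_mean**2) / (1.0 + noise_var), 0.0)
    gain = signal_var / np.maximum(local_var, 1e-12)
    # Weighted combination of local mean and observed pixel
    return local_mean + gain * (img - local_mean)
```

In flat regions the gain falls to zero and the output is the local mean (maximum smoothing); near edges the local variance dominates, the gain approaches one, and detail is preserved. The robust variant replaces the mean/variance estimates with impulse-resistant statistics.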
Study of statistical coding for digital TV
NASA Technical Reports Server (NTRS)
Gardenhire, L. W.
1972-01-01
The results are presented for a detailed study to determine a pseudo-optimum statistical code to be installed in a digital TV demonstration test set. Studies of source encoding were undertaken, using redundancy removal techniques in which the picture is reproduced within a preset tolerance. A method of source encoding, which preliminary studies show to be encouraging, is statistical encoding. A pseudo-optimum code was defined and the associated performance of the code was determined. The format was fixed at 525 lines per frame, 30 frames per second, as per commercial standards.
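Statistical encoding of the kind studied here assigns short codewords to frequent symbols. A minimal Huffman-code sketch illustrates the idea; this is an illustration of the general technique, not the study's pseudo-optimum code:

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code from symbol frequencies: frequent symbols
    receive short codewords, the basis of statistical encoding."""
    freq = Counter(symbols)
    # Heap entries: (weight, tiebreak index, {symbol: partial codeword})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codewords
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, n, merged))
        n += 1
    return heap[0][2]
```

For example, `huffman_codes("aaabbc")` assigns the frequent symbol `a` a shorter codeword than the rare symbol `c`, and the resulting code is prefix-free.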
ERIC Educational Resources Information Center
Seo, Dong Gi; Hao, Shiqi
2016-01-01
Differential item/test functioning (DIF/DTF) are routine procedures to detect item/test unfairness as an explanation for group performance difference. However, unequal sample sizes and small sample sizes have an impact on the statistical power of the DIF/DTF detection procedures. Furthermore, DIF/DTF cannot be used for two test forms without…
The choice of statistical methods for comparisons of dosimetric data in radiotherapy.
Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques
2014-09-18
Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and tests whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method computes the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with 1D and 3D tissue density corrections using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; non-parametric statistical tests were then performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p < 0.001). The density correction methods yielded lower doses compared to PBC, by -5 ± 4.4 (SD) on average for MB and -4.7 ± 5 (SD) for ETAR. 
Post-hoc Wilcoxon signed-rank test of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods. This paper illustrates and justifies the use of statistical tests and graphical representations for dosimetric comparisons in radiotherapy. The statistical analysis shows the significance of dose differences resulting from two or more techniques in radiotherapy.
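The test sequence described above (Friedman omnibus test, post-hoc Wilcoxon signed-rank test, Spearman rank correlation) can be sketched with SciPy; the dose values below are simulated for illustration and are not the study's 62-field data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical monitor-unit doses for 62 fields under three algorithms
# (PBC reference, MB, ETAR); simulated, not the paper's data.
pbc = rng.normal(100.0, 10.0, 62)
mb = pbc - rng.normal(5.0, 2.0, 62)    # density correction lowers dose
etar = pbc - rng.normal(4.7, 2.0, 62)

# Omnibus test across the three related samples
chi2, p_friedman = stats.friedmanchisquare(pbc, mb, etar)

# Post-hoc paired comparison against the reference method
w, p_wilcoxon = stats.wilcoxon(pbc, mb)

# Rank correlation between methods
rho, p_rho = stats.spearmanr(pbc, mb)
```

A significant Friedman result justifies the paired post-hoc comparisons; the rank correlation quantifies the agreement in ordering between the methods even when their dose levels differ.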
An Analysis of Rocket Propulsion Testing Costs
NASA Technical Reports Server (NTRS)
Ramirez-Pagan, Carmen P.; Rahman, Shamim A.
2009-01-01
The primary mission at NASA Stennis Space Center (SSC) is rocket propulsion testing. Such testing is generally performed within two arenas: (1) Production testing for certification and acceptance, and (2) Developmental testing for prototype or experimental purposes. The customer base consists of NASA programs, DOD programs, and commercial programs. Resources in place to perform on-site testing include both civil servants and contractor personnel, hardware and software including data acquisition and control, and 6 test stands with a total of 14 test positions/cells. For several business reasons there is a need to augment understanding of the test costs for all the various types of test campaigns. Historical propulsion test data was evaluated and analyzed in many different ways with the intent to find any correlation or statistics that could help produce more reliable and accurate cost estimates and projections. The analytical efforts included timeline trends, statistical curve fitting, average cost per test, cost per test second, test cost timeline, and test cost envelopes. Further, the analytical effort included examining the test cost from the perspective of thrust level and test article characteristics. Some of the analytical approaches did not produce evidence strong enough for further analysis. Other analytical approaches yielded promising results and are candidates for further development and focused study. The information was organized into three elements: a Project Profile, a Test Cost Timeline, and a Cost Envelope. The Project Profile is a snapshot of the project life cycle in timeline fashion, which includes various statistical analyses. The Test Cost Timeline shows the cumulative average test cost, for each project, at each month where there was test activity. The Test Cost Envelope shows a range of cost for a given number of tests. 
The supporting information upon which this study was performed came from diverse sources, and thus it was necessary to build several intermediate databases in order to understand, validate, and manipulate the data. These intermediate databases (validated historical accounts of schedule, test activity, and cost) are by themselves of great value and utility. For example, for the Project Profile, we were able to merge schedule, cost, and test activity. This kind of historical account conveys important information about the sequence of events, lead time, and opportunities for improvement in future propulsion test projects. The Product Requirement Document (PRD) file is a collection of data extracted from each project PRD (technical characteristics, test requirements, and projections of cost, schedule, and test activity). This information could help expedite the development of future PRDs (or equivalent documents) for similar projects, and could also, when compared to the actual results, help improve projections of cost and schedule. Also, this file can be sorted by the parameter of interest to perform a visual review of potential common themes or trends. The process of searching, collecting, and validating propulsion test data encountered numerous difficulties, which led to a set of recommendations for improvement to facilitate future data gathering and analysis.
Poudel, Sashi; Weir, Lori; Dowling, Dawn; Medich, David C
2016-08-01
A statistical pilot study was retrospectively performed to analyze potential changes in occupational radiation exposures to Interventional Radiology (IR) staff at Lawrence General Hospital after implementation of the i2 Active Radiation Dosimetry System (Unfors RaySafe Inc, 6045 Cochran Road, Cleveland, OH 44139-3302). In this study, the monthly OSL dosimetry records obtained during the eight-month period prior to i2 implementation were normalized to the number of procedures performed during each month and statistically compared to the normalized dosimetry records obtained for the eight-month period after i2 implementation. The resulting statistics included calculation of the mean and standard deviation of the dose equivalences per procedure and included appropriate hypothesis tests to assess for statistically valid differences between the pre- and post-i2 study periods. Hypothesis testing was performed on three groups of staff present during an IR procedure: the first group included all members of the IR staff, the second group consisted of the IR radiologists, and the third group consisted of the IR technician staff. After implementing the i2 active dosimetry system, participating members of the Lawrence General IR staff had a reduction in the average dose equivalence per procedure of 43.1% ± 16.7% (p = 0.04). Similarly, Lawrence General IR radiologists had a 65.8% ± 33.6% (p = 0.01) reduction while the technologists had a 45.0% ± 14.4% (p = 0.03) reduction.
Staging Liver Fibrosis with Statistical Observers
NASA Astrophysics Data System (ADS)
Brand, Jonathan Frieman
Chronic liver disease is a worldwide health problem, and hepatic fibrosis (HF) is one of the hallmarks of the disease. Pathology diagnosis of HF is based on textural change in the liver as a lobular collagen network develops within portal triads. The scale of the collagen lobules is characteristically on the order of 1 mm, which is close to the resolution limit of in vivo Gd-enhanced MRI. In this work the methods to collect training and testing images for a Hotelling observer are covered. An observer based on local texture analysis is trained and tested using wet-tissue phantoms. The technique is used to optimize the MRI sequence based on task performance. The final method developed is a two-stage model observer to classify fibrotic and healthy tissue in both phantoms and in vivo MRI images. The first-stage observer tests for the presence of local texture. Test statistics from the first observer are used to train the second-stage observer to globally sample the local observer results. A decision on the disease class is made for an entire MRI image slice using test statistics collected from the second observer. The techniques are tested on wet-tissue phantoms and in vivo clinical patient data.
Review of "Cross-Country Evidence on Teacher Performance Pay"
ERIC Educational Resources Information Center
von Davier, Matthias
2011-01-01
The primary claim of this Harvard Program on Education Policy and Governance report and the abridged Education Next version is that nations "that pay teachers on their performance score higher on PISA tests." After statistically controlling for several variables, the author concludes that nations with some form of merit pay system have,…
ERIC Educational Resources Information Center
Wainer, Howard
2000-01-01
Discusses three interlocking areas associated with effectively and accurately conveying information about school performance to the public: (1) graphical display; (2) nonrandomly gathered data; and (3) statistical adjustment. Illustrates these points with historical data, including test results from the National Assessment of Educational Progress…
Federal Register 2010, 2011, 2012, 2013, 2014
2010-08-06
... considerations affecting the design and conduct of repellent studies when human subjects are involved. Any... recommendations for the design and execution of studies to evaluate the performance of pesticide products intended... recommends appropriate study designs and methods for selecting subjects, statistical analysis, and reporting...
Training, Innovation and Business Performance: An Analysis of the Business Longitudinal Survey.
ERIC Educational Resources Information Center
Dockery, A. Michael
This paper uses the Australian Bureau of Statistics' Business Longitudinal Survey to explore relationships between training, innovation, and firm performance for Australian businesses with less than 200 employees. The longitudinal nature of the data is used to test various hypotheses about the nature of the link between training, business changes,…
Definition of simulated driving tests for the evaluation of drivers' reactions and responses.
Bartolozzi, Riccardo; Frendo, Francesco
2014-01-01
This article aims to identify the most significant measures in 2 perception-response (PR) tests performed at a driving simulator: a braking test and a lateral skid test, which were developed in this work. Forty-eight subjects (26 females and 22 males) with a mean age of 24.9 ± 3.0 years were enrolled for this study. They were asked to perform a drive on the driving simulator at the University of Pisa (Italy) following a specific test protocol, including 8-10 braking tests and 8-10 lateral skid tests. Driver input signals and vehicle model signals were recorded during the drives and analyzed to extract measures such as the reaction time, first response time, etc. Following a statistical procedure (based on analysis of variance [ANOVA] and post hoc tests), all test measures (3 for the braking test and 8 for the lateral skid test) were analyzed in terms of statistically significant differences among different drivers. The presented procedure allows evaluation of the capability of a given test to distinguish among different drivers. In the braking test, the reaction time showed a high dispersion among single drivers, leading to just 4.8 percent of statistically significant driver pairs (using the Games-Howell post hoc test), whereas the pedal transition time scored 31.9 percent. In the lateral skid test, 28.5 percent of the 2 × 2 comparisons showed significantly different reaction times, 19.5 percent had different response times, 35.2 percent had a different second peak of the steering wheel signal, and 33 percent showed different values of the integral of the steering wheel signal. For the braking test, which has been widely employed in similar forms in the literature, it was shown that the reaction time, compared to the pedal transition time, can have a higher dispersion due to the influence of external factors. 
For the lateral skid test, the following measures were identified as the most significant for application studies: the reaction time for the reaction phase, the second peak of the steering wheel angle for the first instinctive response, and the integral of the steering wheel angle for the complete response. The methodology used to analyze the test measures was founded on statistically based and objective evaluation criteria and could be applied to other tests. Even if obtained with a fixed-base simulator, the obtained results represent useful information for applications of the presented PR tests in experimental campaigns with driving simulators.
Hagau, Natalia; Gherman, Nadia; Cocis, Mihaela; Petrisor, Cristina
2016-01-01
Skin tests for neuromuscular blocking agents (NMBAs) are not currently recommended for the general population undergoing general anaesthesia. In a previous study we reported a high incidence of positive allergy tests for NMBAs in patients with a positive history of non-anaesthetic drug allergy, a larger prospective study being needed to confirm those preliminary results. The objective of this study was to compare the skin test results of patients with a positive history of antibiotic-induced immediate-type hypersensitivity reactions to those of controls without drug allergies. Ninety-eight patients with previous antibiotic hypersensitivity and 72 controls were prospectively included. Skin tests were performed for atracurium, pancuronium, rocuronium, and suxamethonium. We found 65 positive skin tests among the 392 tests performed in patients with a positive history of antibiotic hypersensitivity (16.58%) and 23 positive skin tests among the 288 performed in controls (7.98%), the two incidences showing a statistically significant difference (p = 0.0011). The relative risk of having a positive skin test for NMBAs for patients versus controls was 1.77 (1.15-2.76). For atracurium, skin tests were more often positive in patients with a positive history of antibiotic hypersensitivity versus controls (p = 0.02). For pancuronium, rocuronium and suxamethonium, statistical significance was not attained (p-values 0.08 for pancuronium, 0.23 for rocuronium, and 0.26 for suxamethonium). Patients with a positive history of antibiotic hypersensitivity seem to have a higher incidence of positive skin tests for NMBAs. They might represent a group at higher risk of developing intraoperative anaphylaxis compared to the general population. Copyright © 2015 The Authors. Production and hosting by Elsevier B.V. All rights reserved.
Jordan, Denis; Steiner, Marcel; Kochs, Eberhard F; Schneider, Gerhard
2010-12-01
Prediction probability (P(K)) and the area under the receiver operating characteristic curve (AUC) are statistical measures for assessing the performance of anesthetic depth indicators that more precisely quantify the correlation between observed anesthetic depth and corresponding values of a monitor or indicator. In contrast to many other statistical tests, they offer several advantages. First, P(K) and AUC are independent of scale units and of assumptions on underlying distributions. Second, the calculation can be performed without any knowledge of particular indicator threshold values, which makes the test more independent of specific test data. Third, recent approaches using resampling methods allow a reliable comparison of the P(K) or AUC of different indicators of anesthetic depth. Furthermore, both tests allow simple interpretation, whereby results between 0 and 1 reflect how well an indicator separates the observed levels of anesthesia. For these reasons, P(K) and AUC have become popular in medical decision making. P(K) is intended for polytomous patient states (i.e., >2 anesthetic levels) and can be considered a generalization of the AUC, which was basically introduced to assess a predictor of dichotomous classes (e.g., consciousness and unconsciousness in anesthesia). Dichotomous paradigms provide equal values of the P(K) and AUC test statistics. In the present investigation, we introduce a user-friendly computer program for computing P(K) and estimating reliable bootstrap confidence intervals. It is designed for multiple comparisons of the performance of depth of anesthesia indicators. Additionally, for dichotomous classes, the program plots the receiver operating characteristic graph, completing the information obtained from P(K) or AUC, respectively. In clinical investigations, both measures are applied for indicator assessment, where ambiguous usage and interpretation may be a consequence. 
Therefore, a summary of the concepts of P(K) and AUC, including a brief and easily understandable proof of their equality, is presented in the text. The exposition introduces readers to the algorithms of the provided computer program and is intended to make standardized performance tests of depth of anesthesia indicators available to medical researchers.
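For the dichotomous case, the AUC equals the Mann-Whitney probability that a positive case scores above a negative one, and a bootstrap confidence interval can be resampled around it. A minimal sketch of that idea (an illustration, not the authors' program; function names and parameters are assumed):

```python
import numpy as np

def auc(scores_neg, scores_pos) -> float:
    """AUC as the probability a positive case scores above a negative one
    (Mann-Whitney formulation; ties count half)."""
    s0, s1 = np.asarray(scores_neg, float), np.asarray(scores_pos, float)
    gt = (s1[:, None] > s0[None, :]).mean()
    eq = (s1[:, None] == s0[None, :]).mean()
    return gt + 0.5 * eq

def bootstrap_ci(scores_neg, scores_pos, n_boot=2000, seed=0):
    """95% percentile bootstrap confidence interval for the AUC."""
    rng = np.random.default_rng(seed)
    vals = [auc(rng.choice(scores_neg, len(scores_neg)),
                rng.choice(scores_pos, len(scores_pos)))
            for _ in range(n_boot)]
    return np.percentile(vals, [2.5, 97.5])
```

An AUC of 0.5 means the indicator is uninformative and 1.0 means perfect separation; non-overlapping bootstrap intervals support the comparison of two indicators.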
NASA Astrophysics Data System (ADS)
Deshayes, Yannick; Verdier, Frederic; Bechou, Laurent; Tregon, Bernard; Danto, Yves; Laffitte, Dominique; Goudard, Jean Luc
2004-09-01
High performance and high reliability are two of the most important goals driving the penetration of optical transmission into telecommunication systems ranging from 880 nm to 1550 nm. Lifetime prediction, defined as the time at which a parameter reaches its maximum acceptable shift, remains the main result in terms of reliability estimation for a technology. For optoelectronic emissive components, selection tests and life testing are specifically used for reliability evaluation according to Telcordia GR-468 CORE requirements. This approach is based on extrapolation of degradation laws, based on physics of failure and electrical or optical parameters, allowing both strong test time reduction and long-term reliability prediction. Unfortunately, in the case of a mature technology, there is a growing complexity in calculating average lifetimes and failure rates (FITs) using ageing tests, in particular due to extremely low failure rates. For present laser diode technologies, times to failure tend to be 10^6 hours for devices aged under typical conditions (Popt = 10 mW and T = 80°C). These ageing tests must be performed on more than 100 components aged during 10,000 hours, mixing different temperatures and drive current conditions, leading to acceleration factors above 300-400. These conditions are costly and time-consuming, and cannot give a complete distribution of times to failure. A new approach consists in using statistical computations to extrapolate the lifetime distribution and failure rates under operating conditions from physical parameters of experimental degradation laws. In this paper, Distributed Feedback single-mode laser diodes (DFB-LD) used in 1550 nm telecommunication networks working at a 2.5 Gbit/s transfer rate are studied. Electrical and optical parameters have been measured before and after ageing tests, performed at constant current, according to Telcordia GR-468 requirements. 
Cumulative failure rates and lifetime distributions are computed using statistical calculations and equations of drift mechanisms versus time fitted from experimental measurements.
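Lifetime extrapolation from accelerated ageing of this kind commonly relies on an Arrhenius acceleration factor between the stress and use temperatures. A minimal sketch, where the activation energy and times are placeholder values rather than the paper's fitted parameters:

```python
import numpy as np

K_B = 8.617e-5  # Boltzmann constant, eV/K

def acceleration_factor(t_use_c: float, t_stress_c: float,
                        ea_ev: float = 0.5) -> float:
    """Arrhenius acceleration factor between stress and use temperatures
    (Celsius inputs). ea_ev is a placeholder activation energy in eV."""
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return float(np.exp(ea_ev / K_B * (1.0 / t_use - 1.0 / t_stress)))

# Extrapolated use-condition lifetime from a hypothetical stress-test
# time-to-failure of 1e4 hours at 80°C, projected to 25°C operation:
ttf_use = 1.0e4 * acceleration_factor(25.0, 80.0, 0.5)  # hours
```

Combining such acceleration factors with a fitted time-to-failure distribution (e.g. lognormal) is what allows failure rates at operating conditions to be computed without waiting out the full 10^6-hour lifetimes.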
Koksal, Ayhan; Keskinkılıc, Cahit; Sozmen, Mehmet Vedat; Dirican, Ayten Ceyhan; Aysal, Fikret; Altunkaynak, Yavuz; Baybas, Sevim
2014-01-01
In this study, cognitive functions of 9 patients who developed parkinsonism due to chronic manganese intoxication from intravenous methcathinone solution were investigated using detailed neuropsychometric tests. Attention, verbal and nonverbal memory, visuospatial function, constructive ability, language, and executive (frontal) functions of 9 patients who were admitted to our clinic with manifestations of chronic manganese intoxication and 9 control subjects were assessed using neuropsychometric tests. Two years later, detailed repeat neuropsychometric tests were performed in the patient group. The results were evaluated using the χ(2) test, Fisher's exact probability test, Student's t test and the Mann-Whitney U test. While there was no statistically significant difference between the two groups in language functions, visuospatial functions and constructive ability, a statistically significant difference was noted between the groups regarding attention (p = 0.032), calculation (p = 0.004), recall and recognition domains of verbal memory, nonverbal memory (p = 0.021) and some domains of frontal functions (Stroop-5 and spontaneous recovery) (p = 0.022 and 0.012). Repeat neuropsychometric testing of the patients 2 years later showed no statistically significant change. It has been observed that the cognitive dysfunction seen in parkinsonism secondary to chronic manganese intoxication may be long-lasting and may not recover as observed for motor dysfunction. © 2014 S. Karger AG, Basel.
Kim, Jin Chul; Chon, Jinmann; Kim, Hee Sang; Lee, Jong Ha; Yoo, Seung Don; Kim, Dong Hwan; Lee, Seung Ah; Han, Yoo Jin; Lee, Hyun Seok; Lee, Bae Youl; Soh, Yun Soo; Won, Chang Won
2017-04-01
To evaluate the association between baseline characteristics, three physical performance tests and fall history in a sample of the elderly from Korean population. A total of 307 participants (mean age, 76.70±4.85 years) were categorized into one of two groups, i.e., fallers and non-fallers. Fifty-two participants who had reported falling unexpectedly at least once in the previous 12 months were assigned to the fallers group. Physical performance tests included Short Physical Performance Battery (SPPB), Berg Balance Scale (BBS), Timed Up and Go test. The differences between the two study groups were compared and we analyzed the correlations between fall histories and physical performance tests. SPPB demonstrated a significant association with fall history. Although the BBS total scores did not show statistical significance, two dynamic balance test items of BBS (B12 and B13) showed a significant association among fallers. This study suggests that SPPB and two dynamic balance test items of the BBS can be used in screening for risk of falls in an ambulatory elderly population.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R²), using R² as the primary metric of assay agreement. However, the use of R² alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
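The core Bland-Altman statistics referred to above, the mean bias and the 95% limits of agreement (bias ± 1.96 SD of the paired differences), can be sketched as follows (a minimal illustration, not the manuscript's adapted procedure):

```python
import numpy as np

def bland_altman(x, y):
    """Bland-Altman agreement statistics for two measurement methods.

    Returns the mean bias (mean of y - x) and the 95% limits of
    agreement, bias ± 1.96 * SD of the paired differences.
    """
    x, y = np.asarray(x, float), np.asarray(y, float)
    diff = y - x
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)
```

A nonzero bias indicates constant error between the methods; a trend of the differences against the pairwise means (plotted in the classic Bland-Altman diagram) indicates proportional error, which R² alone would miss.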
The influence of control group reproduction on the statistical ...
Because of various Congressional mandates to protect the environment from endocrine disrupting chemicals (EDCs), the United States Environmental Protection Agency (USEPA) initiated the Endocrine Disruptor Screening Program. In the context of this framework, the Office of Research and Development within the USEPA developed the Medaka Extended One Generation Reproduction Test (MEOGRT) to characterize the endocrine action of a suspected EDC. One important endpoint of the MEOGRT is the fecundity of breeding pairs of medaka. Power analyses were conducted to determine the number of replicates needed in proposed test designs and to determine the effects that varying reproductive parameters (e.g. mean fecundity, variance, and days with no egg production) will have on the statistical power of the test. A software tool, the MEOGRT Reproduction Power Analysis Tool, was developed to expedite these power analyses by both calculating estimates of the needed reproductive parameters (e.g. population mean and variance) and performing the power analysis under user-specified scenarios. The manuscript illustrates how the reproductive performance of the control medaka used in a MEOGRT influences statistical power, and therefore the successful implementation of the protocol. Example scenarios, based upon medaka reproduction data collected at MED, are discussed that bolster the recommendation that facilities planning to implement the MEOGRT should have a culture of medaka with hi
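A Monte Carlo power analysis of the general kind such a tool performs can be sketched as follows. The fecundity model here (lognormal daily egg counts with a fraction of zero-production days, compared by a two-sample t-test) and all parameter values are illustrative assumptions, not the MEOGRT tool's actual model:

```python
import numpy as np
from scipy import stats

def fecundity_power(ctrl_mean, effect, n_reps, cv=0.3, zero_days=0.1,
                    n_sim=1000, alpha=0.05, seed=0):
    """Monte Carlo power of a two-sample t-test on mean fecundity.

    effect is the fractional reduction in the treated group; cv is the
    coefficient of variation; zero_days is the fraction of days with no
    egg production. Illustrative model only.
    """
    rng = np.random.default_rng(seed)

    def sample(mu):
        sigma = np.sqrt(np.log(1.0 + cv**2))
        x = rng.lognormal(np.log(mu) - sigma**2 / 2.0, sigma, n_reps)
        x[rng.random(n_reps) < zero_days] = 0.0  # zero-production days
        return x

    hits = 0
    for _ in range(n_sim):
        _, p = stats.ttest_ind(sample(ctrl_mean),
                               sample(ctrl_mean * (1.0 - effect)))
        hits += p < alpha
    return hits / n_sim
```

Running this over grids of control mean, variance, and zero-day fraction shows directly how control reproductive performance drives the number of replicates needed, which is the point the manuscript makes.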
Does rational selection of training and test sets improve the outcome of QSAR modeling?
Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander
2012-10-22
Prior to using a quantitative structure activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models is comparable.
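Of the rational division methods named above, Kennard-Stone is the simplest to sketch: seed the training set with the two most distant samples, then greedily add the sample whose minimum distance to the current selection is largest, so the training set spans descriptor space. A minimal illustration with hypothetical 1-D descriptor values:

```python
def kennard_stone(points, n_train):
    """Kennard-Stone rational splitting: seed with the two most distant
    samples, then repeatedly add the sample whose minimum distance to
    the already-selected training set is largest."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b)) ** 0.5

    n = len(points)
    # seed with the most distant pair of samples
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda ij: dist(points[ij[0]], points[ij[1]]))
    train = [i0, j0]
    rest = [k for k in range(n) if k not in train]
    while len(train) < n_train:
        k = max(rest, key=lambda r: min(dist(points[r], points[s]) for s in train))
        train.append(k)
        rest.remove(k)
    return sorted(train), sorted(rest)

# Hypothetical 1-D descriptors; the outlier at 10.0 is selected early
train_idx, test_idx = kennard_stone([(0.0,), (1.0,), (2.0,), (10.0,)], n_train=3)
```

Because extreme samples always land in the training set, the resulting test set is interpolative, which is one explanation for the better test-set statistics (but comparable external predictivity) reported above.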
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
2007-01-01
Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253
Use of power analysis to develop detectable significance criteria for sea urchin toxicity tests
Carr, R.S.; Biedenbach, J.M.
1999-01-01
When sufficient data are available, the statistical power of a test can be determined using power analysis procedures. The term “detectable significance” has been coined to refer to this criterion based on power analysis and past performance of a test. This power analysis procedure has been performed with sea urchin (Arbacia punctulata) fertilization and embryological development data from sediment porewater toxicity tests. Data from 3100 and 2295 tests for the fertilization and embryological development tests, respectively, were used to calculate the criteria and regression equations describing the power curves. Using Dunnett's test, minimum significant differences (MSDs) (β = 0.05) of 15.5% and 19% for the fertilization test, and 16.4% and 20.6% for the embryological development test, were determined for α ≤ 0.05 and α ≤ 0.01, respectively. The use of this second criterion reduces type I (false positive) errors and helps to establish a critical level of difference based on the past performance of the test.
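The minimum-significant-difference idea can be illustrated with a normal-approximation sketch: given a significance level, target power, group size, and within-group variability, compute the smallest true difference the test can reliably detect. This is a generic simplification, not the Dunnett-based procedure used in the study; the parameter values are hypothetical.

```python
from math import sqrt, erf

def z_quantile(p):
    """Inverse standard normal CDF by bisection (adequate for a sketch)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 0.5 * (1.0 + erf(mid / sqrt(2.0))) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def minimum_detectable_difference(sd, n, alpha=0.05, power=0.95):
    """Normal-approximation minimum detectable difference for a
    one-sided two-group comparison: (z_alpha + z_beta) * sd * sqrt(2/n)."""
    return (z_quantile(1.0 - alpha) + z_quantile(power)) * sd * sqrt(2.0 / n)
```

The formula makes the trade-offs in the abstract explicit: higher variability raises the MSD, while more replicates shrink it in proportion to the square root of n.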
EFFECTS OF TWO TYPES OF TRUNK EXERCISES ON BALANCE AND ATHLETIC PERFORMANCE IN YOUTH SOCCER PLAYERS
Kaneoka, Koji; Okubo, Yu; Shiraki, Hitoshi
2014-01-01
Purpose/Background: Many athletes perform trunk stabilization exercises (SE) and conventional trunk exercises (CE) to enhance trunk stability and strength. However, evidence regarding the specific training effects of SE and CE is lacking and there have been no studies for youth athletes. Therefore, the purpose of this study was to investigate the training effects of SE and CE on balance and athletic performance in youth soccer players. Methods: Twenty‐seven male youth soccer players were assigned randomly to either an SE group (n = 13) or CE group (n = 14). Data from nineteen players who completed all training sessions were used for statistical analyses (SE, n = 10; CE, n = 9). Before and after the 12‐week intervention program, pre‐ and post‐testing comprised of a static balance test, Star Excursion Balance Test (SEBT), Cooper’s test, sprint, the Step 50, vertical jump, and rebound jump were performed. After pre‐testing, players performed the SE or CE program three times per week for 12 weeks. A two‐way repeated‐measures ANOVA was used to assess the changes over time, and differences between the groups. Within‐group changes from pre‐testing to post‐testing were determined using paired t‐tests. Statistical significance was inferred from p < 0.05. Results: There were significant group‐by‐time interactions for posterolateral (p = 0.022) and posteromedial (p < 0.001) directions of the SEBT. Paired t‐tests revealed significant improvements of the posterolateral and posteromedial directions in the SE group. Although other measurements did not find group‐by‐time interactions, within‐group changes were detected indicating significant improvements in the static balance test, Cooper’s test, and rebound jump in the only SE group (p < 0.05). Vertical jump and sprint were improved significantly in both groups (p < 0.05), but the Step 50 was not improved in either group (p > 0.05). 
Conclusions: Results suggested that the SE has specific training effects that enhance static and dynamic balance, Cooper’s test, and rebound jump. Level of Evidence: 3b PMID:24567855
Database Performance Monitoring for the Photovoltaic Systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klise, Katherine A.
The Database Performance Monitoring (DPM) software (copyright in process) is being developed at Sandia National Laboratories to perform quality control analysis on time series data. The software loads time-indexed databases (currently csv format), performs a series of quality control tests defined by the user, and creates reports that include summary statistics, tables, and graphics. DPM can be set up to run on an automated schedule defined by the user. For example, the software can be run once per day to analyze data collected on the previous day. HTML-formatted reports can be sent via email or hosted on a website. To compare the performance of several databases, summary statistics and graphics can be gathered in a dashboard view which links to detailed reporting information for each database. The software can be customized for specific applications.
Visual Sample Plan Version 7.0 User's Guide
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matzke, Brett D.; Newburn, Lisa LN; Hathaway, John E.
2014-03-01
This user's guide describes Visual Sample Plan (VSP) Version 7.0 and provides instructions for using the software. VSP selects the appropriate number and location of environmental samples to ensure that the results of statistical tests performed to provide input to risk decisions have the required confidence and performance. VSP Version 7.0 provides sample-size equations or algorithms needed by specific statistical tests appropriate for specific environmental sampling objectives. It also provides data quality assessment and statistical analysis functions to support evaluation of the data and determine whether the data support decisions regarding sites suspected of contamination. The easy-to-use program is highly visual and graphic. VSP runs on personal computers with Microsoft Windows operating systems (XP, Vista, Windows 7, and Windows 8). Designed primarily for project managers and users without expertise in statistics, VSP is applicable to two- and three-dimensional populations to be sampled (e.g., rooms and buildings, surface soil, a defined layer of subsurface soil, water bodies, and other similar applications) for studies of environmental quality. VSP is also applicable for designing sampling plans for assessing chem/rad/bio threat and hazard identification within rooms and buildings, and for designing geophysical surveys for unexploded ordnance (UXO) identification.
GPUs for statistical data analysis in HEP: a performance study of GooFit on GPUs vs. RooFit on CPUs
NASA Astrophysics Data System (ADS)
Pompili, Alexis; Di Florio, Adriano; CMS Collaboration
2016-10-01
In order to test the computing capabilities of GPUs with respect to traditional CPU cores, a high-statistics toy Monte Carlo technique has been implemented in both the ROOT/RooFit and GooFit frameworks with the purpose of estimating the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B+ → J/ψϕK+. GooFit is a data analysis open tool under development that interfaces ROOT/RooFit to the CUDA platform on nVidia GPUs. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides striking speed-up performances with respect to the RooFit application parallelised on multiple CPUs by means of the PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by the CUDA Multi Process Service and a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks Theorem may or may not apply because its regularity conditions are not satisfied.
NASA Astrophysics Data System (ADS)
Di Florio, Adriano
2017-10-01
In order to test the computing capabilities of GPUs with respect to traditional CPU cores a high-statistics toy Monte Carlo technique has been implemented both in ROOT/RooFit and GooFit frameworks with the purpose to estimate the statistical significance of the structure observed by CMS close to the kinematical boundary of the J/ψϕ invariant mass in the three-body decay B + → J/ψϕK +. GooFit is a data analysis open tool under development that interfaces ROOT/RooFit to CUDA platform on nVidia GPU. The optimized GooFit application running on GPUs hosted by servers in the Bari Tier2 provides striking speed-up performances with respect to the RooFit application parallelised on multiple CPUs by means of PROOF-Lite tool. The considerable resulting speed-up, evident when comparing concurrent GooFit processes allowed by CUDA Multi Process Service and a RooFit/PROOF-Lite process with multiple CPU workers, is presented and discussed in detail. By means of GooFit it has also been possible to explore the behaviour of a likelihood ratio test statistic in different situations in which the Wilks Theorem may or may not apply because its regularity conditions are not satisfied.
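The likelihood-ratio machinery referenced in both abstracts can be illustrated on a toy nested model where Wilks' theorem does apply. The binomial example below is purely illustrative and unrelated to the CMS analysis; when the null value sits on the boundary of the parameter space, the chi-squared reference distribution used here would no longer be valid.

```python
from math import log, sqrt, erf

def wilks_lrt_binomial(k, n, p0=0.5):
    """Likelihood-ratio test for a binomial proportion. Under Wilks'
    theorem, 2 * (logL(p_hat) - logL(p0)) is asymptotically chi-squared
    with 1 degree of freedom, provided p0 is interior to the parameter
    space (one of the regularity conditions mentioned above)."""
    def loglik(p):
        return k * log(p) + (n - k) * log(1.0 - p)
    stat = 2.0 * (loglik(k / n) - loglik(p0))
    p_value = 1.0 - erf(sqrt(stat / 2.0))  # chi2(df=1) survival function
    return stat, p_value

stat, p_value = wilks_lrt_binomial(k=60, n=100)  # 60 successes in 100 trials
```

In high-statistics toy Monte Carlo studies like the one described, many such statistics are generated under the null to check empirically whether the chi-squared approximation holds.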
NASA Astrophysics Data System (ADS)
Wootten, A.; Dixon, K. W.; Lanzante, J. R.; Mcpherson, R. A.
2017-12-01
Empirical statistical downscaling (ESD) approaches attempt to refine global climate model (GCM) information via statistical relationships between observations and GCM simulations. The aim of such downscaling efforts is to create added-value climate projections by adding finer spatial detail and reducing biases. The results of statistical downscaling exercises are often used in impact assessments under the assumption that past performance provides an indicator of future results. Given prior research describing the danger of this assumption with regard to temperature, this study expands the perfect model experimental design from previous case studies to test the stationarity assumption with respect to precipitation. Assuming stationarity implies that the performance of ESD methods is similar between the future projections and the historical training period. Case study results from four quantile-mapping-based ESD methods demonstrate violations of the stationarity assumption for both the central tendency and extremes of precipitation. These violations vary geographically and seasonally. For the four ESD methods tested, the greatest challenges for downscaling daily total precipitation projections occur in regions with limited precipitation and for extremes of precipitation along Southeast coastal regions. We conclude with a discussion of future expansion of the perfect model experimental design and the implications for improving ESD methods and providing guidance on the use of ESD techniques for impact assessments and decision support.
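Empirical quantile mapping, the family of ESD methods tested above, can be sketched in a few lines: find the quantile of a model value within the model's historical distribution, then read off the observed value at that same quantile. The toy series below are hypothetical, and real implementations interpolate between quantiles and handle dry days; the point is that the transfer function is fit on the historical period and assumed to hold in the future, which is exactly the stationarity assumption under test.

```python
def quantile_map(model_hist, obs_hist, value):
    """Empirical quantile mapping: locate `value`'s empirical quantile
    in the model's historical distribution, then return the observation
    at the same quantile."""
    mh = sorted(model_hist)
    oh = sorted(obs_hist)
    count = sum(1 for v in mh if v <= value)        # empirical CDF rank
    idx = min(count * len(oh) // len(mh), len(oh) - 1)
    return oh[idx]

# Toy series in which the model is dry-biased by a factor of two
model = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
obs = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
mapped = quantile_map(model, obs, 5)
```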
[Psychological results of mental performance in sleep deprivation].
Dahms, P; Schaad, G; Gorges, W; von Restorff, W
1996-01-01
To quantify the effects of sleep periods of different lengths during continuous operations (CONOPS), two independent groups of subjects performed several cognitive tasks for 3 days. The 72 h trial period contained three 60-min sleep periods for the 10 subjects of the experimental group and three sleep periods of 4 h each for the 14 subjects of the control group. With the exception of only one subtest, the statistical analyses of the test results of the two groups show no significant differences in cognitive performance. It is suggested that high motivation, essentially maintained by a monetary pay system for successful test performance, is responsible for the comparable performance of the subjects.
NASA Astrophysics Data System (ADS)
Ranaie, Mehrdad; Soffianian, Alireza; Pourmanafi, Saeid; Mirghaffari, Noorollah; Tarkesh, Mostafa
2018-03-01
In the recent decade, analysis of remotely sensed imagery has become one of the most common and widely used procedures in environmental studies, and supervised image classification techniques play a central role. Hence, using a high-resolution Worldview-3 image over a mixed urbanized landscape in Iran, three less commonly applied image classification methods, Bagged CART, stochastic gradient boosting, and neural network with feature extraction, were tested and compared with two prevalent methods: random forest and support vector machine with a linear kernel. To do so, each method was run ten times, and three validation techniques were used to estimate the accuracy statistics: cross-validation, independent validation, and validation with the full training data. Moreover, the statistical significance of differences between the classification methods was assessed using ANOVA and Tukey tests. In general, the results showed that random forest, by a marginal difference over Bagged CART and stochastic gradient boosting, was the best-performing method, while based on independent validation there was no significant difference between the performances of the classification methods. It should finally be noted that neural network with feature extraction and linear support vector machine had better processing speed than the others.
Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
Xia, Xiao-Lei; Xing, Huanlai; Liu, Xueqin
2013-01-01
One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including the Support Vector Machines (SVM) in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method namely the Kernel Matrix Gene Selection (KMGS) is proposed. The rationale of the method, which roots in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by performing -like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS) which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate. PMID:24349110
Santanelli di Pompeo, Fabio; Sorotos, Michail; Laporta, Rosaria; Pagnoni, Marco; Longo, Benedetto
2018-02-01
Excellent cosmetic results from skin-sparing mastectomy (SSM) are often impaired by skin flaps' necrosis (SFN), from 8%-25% or worse in smokers. This study prospectively investigated the efficacy of Double-Mirrored Omega Pattern (DMOP-SSM) compared to Wise Pattern SSM (WP-SSM) for immediate reconstruction in moderate/large-breasted smokers. From 2008-2010, DMOP-SSM was performed in 51 consecutive immediate breast reconstructions on 41 smokers (mean age = 49.8 years) with moderate/large and ptotic breasts. This active group (AG) was compared to a similar historical control group (CG) of 37 smokers (mean age = 51.1 years) who underwent WP-SSM and immediate breast reconstruction, with a mean follow-up of 37.6 months. Skin ischaemic complications, number of surgical revisions, time to wound healing, and patient satisfaction were analysed. Descriptive statistics were reported and comparison of performance endpoints was performed using Fisher's exact test and Mann-Whitney U-test. A p-value <.05 was considered significant. Patients' mean age (p = .316) and BMI (p = .215) were not statistically different between groups. Ischaemic complications occurred in 11.7% of DMOP-SSMs and in 32.4% of WP-SSMs (p = .017), and revision rates were, respectively, 5.8% and 24.3% (p = .012), both statistically significant. Mean time to wound healing was, respectively, 16.8 days and 18.4 days (p = .205). Mean patients' satisfaction scores were, respectively, 18.9 and 21.1, statistically significant (p = .022). Although tobacco use in moderate/large breasted patients can severely impair outcomes of breast reconstruction, the DMOP-SSM approach, compared to WP-SSM, allows smokers to benefit from SSM, but with statistically significant reduced skin flaps ischaemic complications, revision surgery, and better cosmetic outcomes.
Marciano, Marina Angélica; Estrela, Carlos; Mondelli, Rafael Francisco Lia; Ordinola-Zapata, Ronald; Duarte, Marco Antonio Hungaro
2013-01-01
The aim of the study was to determine if the increase in radiopacity provided by bismuth oxide is related to the color alteration of calcium silicate-based cement. Calcium silicate cement (CSC) was mixed with 0%, 15%, 20%, 30% and 50% of bismuth oxide (BO), determined by weight. Mineral trioxide aggregate (MTA) was the control group. The radiopacity test was performed according to ISO 6876/2001. The color was evaluated using the CIE system. The assessments were performed after 24 hours, 7 and 30 days of setting time, using a spectrophotometer to obtain the ΔE, Δa, Δb and ΔL values. The statistical analyses were performed using the Kruskal-Wallis/Dunn and ANOVA/Tukey tests (p<0.05). The cements in which bismuth oxide was added showed radiopacity corresponding to the ISO recommendations (>3 mm equivalent of Al). The MTA group was statistically similar to the CSC/30% BO group (p>0.05). In regard to color, the increase of bismuth oxide resulted in a decrease in the ΔE value of the calcium silicate cement. The CSC group presented statistically higher ΔE values than the CSC/50% BO group (p<0.05). The comparison between 24 hours and 7 days showed higher ΔE for the MTA group, with statistical differences for the CSC/15% BO and CSC/50% BO groups (p<0.05). After 30 days, CSC showed statistically higher ΔE values than CSC/30% BO and CSC/50% BO (p<0.05). In conclusion, the increase in radiopacity provided by bismuth oxide has no relation to the color alteration of calcium silicate-based cements.
Allele-sharing models: LOD scores and accurate linkage tests.
Kong, A; Cox, N J
1997-11-01
Starting with a test statistic for linkage analysis based on allele sharing, we propose an associated one-parameter model. Under general missing-data patterns, this model allows exact calculation of likelihood ratios and LOD scores and has been implemented by a simple modification of existing software. Most important, accurate linkage tests can be performed. Using an example, we show that some previously suggested approaches to handling less than perfectly informative data can be unacceptably conservative. Situations in which this model may not perform well are discussed, and an alternative model that requires additional computations is suggested.
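For a fully informative linkage setting, the LOD score underlying this discussion is simply a base-10 likelihood ratio of a recombination fraction θ against the unlinked null θ = 0.5. A textbook-style sketch of counting recombinant and nonrecombinant meioses (not the authors' allele-sharing model, which handles general missing-data patterns):

```python
from math import log10

def lod_score(recombinants, nonrecombinants, theta):
    """LOD score in a fully informative meiosis-counting setting:
    log10 of the likelihood at recombination fraction theta over the
    likelihood at the unlinked value theta = 0.5."""
    n = recombinants + nonrecombinants
    l_theta = theta ** recombinants * (1.0 - theta) ** nonrecombinants
    return log10(l_theta / 0.5 ** n)
```

At θ = 0.5 the LOD is zero by construction; positive values favor linkage, with LOD ≥ 3 the conventional genome-wide evidence threshold.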
Visualizing the Bayesian 2-test case: The effect of tree diagrams on medical decision making.
Binder, Karin; Krauss, Stefan; Bruckmaier, Georg; Marienhagen, Jörg
2018-01-01
In medicine, diagnoses based on medical test results are probabilistic by nature. Unfortunately, cognitive illusions regarding the statistical meaning of test results are well documented among patients, medical students, and even physicians. There are two effective strategies that can foster insight into what is known as Bayesian reasoning situations: (1) translating the statistical information on the prevalence of a disease and the sensitivity and the false-alarm rate of a specific test for that disease from probabilities into natural frequencies, and (2) illustrating the statistical information with tree diagrams, for instance, or with other pictorial representation. So far, such strategies have only been empirically tested in combination for "1-test cases", where one binary hypothesis ("disease" vs. "no disease") has to be diagnosed based on one binary test result ("positive" vs. "negative"). However, in reality, often more than one medical test is conducted to derive a diagnosis. In two studies, we examined a total of 388 medical students from the University of Regensburg (Germany) with medical "2-test scenarios". Each student had to work on two problems: diagnosing breast cancer with mammography and sonography test results, and diagnosing HIV infection with the ELISA and Western Blot tests. In Study 1 (N = 190 participants), we systematically varied the presentation of statistical information ("only textual information" vs. "only tree diagram" vs. "text and tree diagram in combination"), whereas in Study 2 (N = 198 participants), we varied the kinds of tree diagrams ("complete tree" vs. "highlighted tree" vs. "pruned tree"). All versions were implemented in probability format (including probability trees) and in natural frequency format (including frequency trees). We found that natural frequency trees, especially when the question-related branches were highlighted, improved performance, but that none of the corresponding probabilistic visualizations did.
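The natural-frequency strategy for a 2-test scenario amounts to pushing a hypothetical population through the frequency tree: count the patients who are sick and test positive twice, count the healthy patients who test positive twice, and compare. The sketch below assumes conditional independence of the two tests, as in textbook examples; all parameter values are hypothetical.

```python
def two_test_ppv(prevalence, sens1, fpr1, sens2, fpr2, population=100_000):
    """Natural-frequency calculation for a Bayesian 2-test scenario:
    probability of disease given two positive tests, assuming the tests
    are conditionally independent (a textbook simplification)."""
    sick = prevalence * population
    healthy = population - sick
    both_pos_sick = sick * sens1 * sens2         # true double positives
    both_pos_healthy = healthy * fpr1 * fpr2     # false double positives
    return both_pos_sick / (both_pos_sick + both_pos_healthy)

# Hypothetical parameters: 1% prevalence, two moderately specific tests
ppv = two_test_ppv(0.01, sens1=0.9, fpr1=0.09, sens2=0.9, fpr2=0.09)
```

Even with a rare condition, two positive results raise the posterior probability far above the prevalence, which is exactly the relationship the frequency trees in the study make visible.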
Shuttle payload vibroacoustic test plan evaluation
NASA Technical Reports Server (NTRS)
Stahle, C. V.; Gongloff, H. R.; Young, J. P.; Keegan, W. B.
1977-01-01
Statistical decision theory is used to evaluate seven alternate vibro-acoustic test plans for Space Shuttle payloads; test plans include component, subassembly and payload testing and combinations of component and assembly testing. The optimum test levels and the expected cost are determined for each test plan. By including all of the direct cost associated with each test plan and the probabilistic costs due to ground test and flight failures, the test plans which minimize project cost are determined. The lowest cost approach eliminates component testing and maintains flight vibration reliability by performing subassembly tests at a relatively high acoustic level.
Ohira, Masayuki; Silcox, Jade; Haygood, Deavin; Harper-King, Valerie; Alsharabati, Mohammad; Lu, Liang; Morgan, Marla B; Young, Angela M; Claussen, Gwen C; King, Peter H; Oh, Shin J
2013-01-01
We compared the problems or complications associated with electrodiagnostic testing in 77 patients with implanted cardiac devices. Thirty tests were performed after magnet placement, and 47 were performed without magnet application. All electrodiagnostic tests were performed safely in all patients without any serious effect on the implanted cardiac devices with or without magnet placement. A significantly higher number of patient symptoms and procedure changes were reported in the magnet group (P < 0.013). No statistical difference was found in the testing difficulty or ECG changes. The magnet group patients had an approximately 11-fold greater risk of symptoms than those in the control group. Our data do not support a recommendation that magnet placement is necessary for routine electrodiagnostic testing in patients with implanted cardiac devices, as long as our general and specific guidelines are followed. Copyright © 2012 Wiley Periodicals, Inc.
Bickley, S R; Shaw, L; Shaw, M J
1990-01-01
A study was undertaken to test the hypothesis that clinical photographic records can be used to motivate oral hygiene performance in mentally handicapped adults. Plaque reduction over the period of the study was shown to be higher in the test group than the control group, but the differences between the test and control groups were not statistically significant. The small numbers involved (29) and the difficulties in matching subjects may have militated against demonstrating a statistically significant difference between the two groups. All participants demonstrated high levels of toothbrushing ability during the practical aspects of the study, but this was not maintained through daily oral hygiene practices in the majority of subjects.
Postural Stability of Special Warfare Combatant-Craft Crewmen With Tactical Gear.
Morgan, Paul M; Williams, Valerie J; Sell, Timothy C
The US Naval Special Warfare's Special Warfare Combatant-Craft Crewmen (SWCC) operate on small, high-speed boats while wearing tactical gear (TG). The TG increases mission safety and success but may affect postural stability, potentially increasing risk for musculoskeletal injury. Therefore, the purpose of this study was to examine the effects of TG on postural stability during the Sensory Organization Test (SOT). Eight SWCC performed the SOT on NeuroCom's Balance Manager with TG and with no tactical gear (NTG). The gear conditions were tested in randomized order. The SOT consisted of six different conditions that challenge the sensory systems responsible for postural stability. Each condition was performed for three trials, resulting in a total of 18 trials. Overall performance, each individual condition, and sensory system analysis (somatosensory, visual, vestibular, preference) were scored. Data were not normally distributed; therefore, Wilcoxon signed-rank tests were used to compare each variable (α = .05). No statistically significant differences were detected between the NTG and TG conditions. This may be due to low statistical power, or potentially to insensitivity of the assessment. Also, the amount and distribution of weight worn during the TG conditions, and the SWCC's unstable occupational platform, may have contributed to the findings. The data from this sample will be used in future research to better understand how TG affects SWCC. The data show that the addition of TG used in our study did not affect postural stability of SWCC during the SOT. Although no statistically significant differences were observed, there are clinical reasons for continued study of the effect of increased load on postural stability, using more challenging conditions, greater surface perturbations, dynamic tasks, and heavier loads.
Development of polytoxicomania in function of defence from psychoticism.
Nenadović, Milutin M; Sapić, Rosa
2011-01-01
Polytoxicomanic proportions in subpopulations of youth have been growing steadily in recent decades, and this trend is pan-continental. Psychoticism is a psychological construct that assumes special basic dimensions of personality disintegration and of cognitive functions. Psychoticism may, in general, be the basis of pathological functioning of youth and influence the patterns of thought, feeling and action that cause dysfunction. The aim of this study was to determine the distribution of basic dimensions of psychoticism for the commitment of youth to abuse of psychoactive substances (PAS) in order to reduce disturbing intrapsychic experiences or manifestations of psychotic symptoms. For the purpose of this study, two groups of respondents were formed, balanced by age, gender and family structure of origin (at least one parent alive). The study applied the DELTA-9 instrument for assessment of cognitive disintegration in function of establishing psychoticism and its operationalization. The obtained results were statistically analyzed. From the parameters of descriptive statistics, the arithmetic mean was calculated with measures of dispersion. A cross-tabular analysis of the variables tested was performed, with statistical significance assessed by Pearson's χ²-test and analysis of variance. Age structure and gender were approximately equally represented in the polytoxicomaniac group and the control group; testing did not confirm a statistically significant difference (p > 0.5). The statistical analysis established that polytoxicomaniacs differed significantly from the control group of respondents on most variables of psychoticism. Testing confirmed a high statistical significance of differences in variables of psychoticism in the group of respondents, from p < 0.001 to p < 0.01. A statistically significant representation of the dimension of psychoticism in the polytoxicomaniac group was established.
The presence of factors concerning common executive dysfunction was emphasized.
Bullock, Garrett S; Brookreson, Nate; Knab, Amy M; Butler, Robert J
2017-06-01
Abnormal fundamental movement patterns and upper-quarter dynamic balance are proposed mechanisms affecting athletic performance and injury risk. There are few studies investigating functional movement and closed-chain upper-extremity dynamic stability in swimmers. The purpose of this study was to determine differences in fundamental movement competency and closed-chain upper-extremity dynamic balance, using the Functional Movement Screen (FMS) and Upper-Quarter Y Balance Test (YBT-UQ), of high school (HS; n = 70) and collegiate (COL; n = 70) swimmers. Variables included the individual movement tests on the FMS and the average normalized reach (percent limb length [%LL]) for each direction on the YBT-UQ. Statistical analysis was completed using chi-square tests for the individual test scores on the FMS, while independent-samples t-tests were used to examine performance on the YBT-UQ (p ≤ 0.05). HS swimmers exhibited a significantly greater percentage of below-average performance (score of 0 or 1) on the following FMS tests: lunge (HS: 22.9%, COL: 4.3%), hurdle step (HS: 31.4%, COL: 7.1%), and push-up (HS: 61.4%, COL: 31.4%). Furthermore, COL males performed worse in the lunge (male: 9%, female: 0%), whereas COL females had poorer efficiency in the push-up (male: 17.6%, female: 44%). Significant effects of competition level and sex were observed in YBT-UQ medial reach (HS: female 92.06, male 101.63; COL: female 101.3, male 101.5 %LL). Individual fundamental movement patterns that involved lumbopelvic neuromuscular control differed between HS and COL swimmers. General upper-extremity dynamic balance differed between competition levels. These data may be helpful in understanding injury and performance-based normative data for participation and return to swimming.
Gismervik, Sigmund Ø; Drogset, Jon O; Granviken, Fredrik; Rø, Magne; Leivseth, Gunnar
2017-01-25
Physical examination tests of the shoulder (PETS) are clinical examination maneuvers designed to aid the assessment of shoulder complaints. Despite more than 180 PETS described in the literature, evidence of their validity and usefulness in diagnosing the shoulder is questioned. This meta-analysis aims to use the diagnostic odds ratio (DOR) to evaluate how much PETS shift overall probability and to rank the performance of single PETS in order to aid the clinician's choice of which tests to use. This study adheres to the principles outlined in the Cochrane guidelines and the PRISMA statement. A fixed-effect model was used to assess the overall diagnostic validity of PETS by pooling DORs for different PETS with a similar biomechanical rationale when possible. Single PETS were assessed and ranked by DOR. Clinical performance was assessed by sensitivity, specificity, accuracy and likelihood ratio. Six thousand nine hundred abstracts and 202 full-text articles were assessed for eligibility; 20 articles were eligible and data from 11 articles could be included in the meta-analysis. All PETS for SLAP (superior labral anterior posterior) lesions pooled gave a DOR of 1.38 [1.13, 1.69]. The Supraspinatus test for any full-thickness rotator cuff tear obtained the highest DOR of 9.24 (sensitivity 0.74, specificity 0.77). The Compression-Rotation test obtained the highest DOR (6.36) among single PETS for SLAP lesions (sensitivity 0.43, specificity 0.89) and the Hawkins test obtained the highest DOR (2.86) for impingement syndrome (sensitivity 0.58, specificity 0.67). No single PETS showed superior clinical test performance. The clinical performance of single PETS is limited. However, when the different PETS for SLAP lesions were pooled, we found a statistically significant change in post-test probability, indicating overall statistical validity.
We suggest that clinicians choose their PETS among those with the highest pooled DOR and, to assess validity in their own specific clinical settings, review the inclusion criteria of the included primary studies. We further propose that future studies on the validity of PETS use randomized research designs rather than the accuracy design, as these rely less on well-established gold-standard reference tests and efficient treatment options.
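As a quick aid to interpreting the rankings above, the diagnostic odds ratio can be computed directly from sensitivity and specificity. A minimal sketch (the abstract's pooled DORs come from a fixed-effect meta-analysis, so this single-study formula will not reproduce them exactly):

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR- = (sens / (1 - spec)) / ((1 - sens) / spec)."""
    positive_lr = sensitivity / (1.0 - specificity)
    negative_lr = (1.0 - sensitivity) / specificity
    return positive_lr / negative_lr

# Supraspinatus test values quoted in the abstract (sens 0.74, spec 0.77)
dor = diagnostic_odds_ratio(0.74, 0.77)
```

For the quoted sensitivity and specificity this formula gives roughly 9.5, in the neighborhood of the pooled 9.24 reported above; a DOR of 1 means the test does not shift the odds at all.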
Hannon, Brenda
2012-11-01
This study uses analysis of covariance to determine which cognitive/learning factors (working memory, knowledge integration, epistemic belief of learning) or social/personality factors (test anxiety, performance-avoidance goals) might account for gender differences in SAT-V, SAT-M, and overall SAT scores. The results revealed that none of the cognitive/learning factors accounted for gender differences in SAT performance. However, the social/personality factors of test anxiety and performance-avoidance goals each separately accounted for all of the significant gender differences in SAT-V, SAT-M, and overall SAT performance. Furthermore, when the influences of both of these factors were statistically removed simultaneously, all non-significant gender differences reduced further to become trivial by Cohen's (1988) standards. Taken as a whole, these results suggest that gender differences in SAT-V, SAT-M, and overall SAT performance are a consequence of social/personality factors.
Significance tests for functional data with complex dependence structure.
Staicu, Ana-Maria; Lahiri, Soumen N; Carroll, Raymond J
2015-01-01
We propose an L2-norm-based global testing procedure for the null hypothesis that multiple group mean functions are equal, for functional data with complex dependence structure. Specifically, we consider the setting of functional data with a multilevel structure of the form groups-clusters or subjects-units, where the unit-level profiles are spatially correlated within the cluster, and the cluster-level data are independent. Orthogonal series expansions are used to approximate the group mean functions and the test statistic is estimated using the basis coefficients. The asymptotic null distribution of the test statistic is developed, under mild regularity conditions. To our knowledge this is the first work that studies hypothesis testing, when data have such complex multilevel functional and spatial structure. Two small-sample alternatives, including a novel block bootstrap for functional data, are proposed, and their performance is examined in simulation studies. The paper concludes with an illustration of a motivating experiment.
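A toy sketch of the kind of statistic described, assuming densely observed curves on a common grid: each curve is projected onto an orthonormal cosine basis and the statistic is the summed squared distance between each group's mean coefficients and the overall mean. The basis choice and scaling are illustrative only; the paper's multilevel covariance handling and null calibration are far more involved:

```python
import numpy as np

def l2_group_test_stat(groups, n_basis=5):
    """Sketch of an L2-norm statistic for equality of group mean functions.

    groups: list of arrays (curves x gridpoints) on a common [0, 1] grid.
    Curves are projected onto an orthonormal cosine basis; the statistic is
    the summed squared L2 distance between each group mean and the overall
    mean, computed from basis coefficients (Parseval).
    """
    grid = np.linspace(0, 1, groups[0].shape[1])
    # orthonormal cosine basis on [0, 1]: 1, sqrt(2) cos(pi k x), ...
    basis = np.vstack([np.ones_like(grid)] +
                      [np.sqrt(2) * np.cos(np.pi * k * grid)
                       for k in range(1, n_basis)])
    dx = grid[1] - grid[0]
    coefs = [(g @ basis.T) * dx for g in groups]   # curves x basis coefficients
    group_means = np.array([c.mean(axis=0) for c in coefs])
    overall = group_means.mean(axis=0)
    return float(((group_means - overall) ** 2).sum())
```

Identical group means give a statistic of zero, and the statistic grows with the separation between the group mean functions; a reference distribution (asymptotic or bootstrap, as in the paper) is still needed to turn it into a test.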
Testing Foreign Language Impact on Engineering Students' Scientific Problem-Solving Performance
ERIC Educational Resources Information Center
Tatzl, Dietmar; Messnarz, Bernd
2013-01-01
This article investigates the influence of English as the examination language on the solution of physics and science problems by non-native speakers in tertiary engineering education. For that purpose, a statistically significant total number of 96 students in four year groups from freshman to senior level participated in a testing experiment in…
A Comparison of Methods to Test for Mediation in Multisite Experiments
ERIC Educational Resources Information Center
Pituch, Keenan A.; Whittaker, Tiffany A.; Stapleton, Laura M.
2005-01-01
A Monte Carlo study extended the research of MacKinnon, Lockwood, Hoffman, West, and Sheets (2002) for single-level designs by examining the statistical performance of four methods to test for mediation in a multilevel experimental design. The design studied was a two-group experiment that was replicated across several sites, included a single…
Evaluation program for secondary spacecraft cells
NASA Technical Reports Server (NTRS)
Christy, D. E.; Harkness, J. D.
1973-01-01
A life cycle test of secondary electric batteries for spacecraft applications was conducted. A sample of nickel-cadmium cells was subjected to general performance tests to determine the limits of their actual capabilities. Weaknesses discovered in cell design are reported and aid research and development efforts toward improving the reliability of spacecraft batteries. A statistical analysis of life cycle prediction and cause of failure versus test conditions is provided.
Lambert, Carole; Gagnon, Robert; Nguyen, David; Charlin, Bernard
2009-01-01
Background The Script Concordance test (SCT) is a reliable and valid tool to evaluate clinical reasoning in complex situations where experts' opinions may be divided. Scores reflect the degree of concordance between the performance of examinees and that of a reference panel of experienced physicians. The purpose of this study is to demonstrate the SCT's usefulness in radiation oncology. Methods A 90-item radiation oncology SCT was administered to 155 participants. Three levels of experience were tested: medical students (n = 70), radiation oncology residents (n = 38) and radiation oncologists (n = 47). Statistical tests were performed to assess reliability and to document validity. Results After item optimization, the test comprised 30 cases and 70 questions. Cronbach's alpha was 0.90. Mean scores were 51.62 (± 8.19) for students, 71.20 (± 9.45) for residents and 76.67 (± 6.14) for radiation oncologists. The difference between the three groups was statistically significant when compared by the Kruskal-Wallis test (p < 0.001). Conclusion The SCT is reliable and useful to discriminate among participants according to their level of experience in radiation oncology. It appears to be a useful tool to document the progression of reasoning during residency training. PMID:19203358
The chi-square test of independence.
McHugh, Mary L
2013-01-01
The Chi-square statistic is a non-parametric (distribution-free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation both of dichotomous independent variables and of multiple-group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others. The Chi-square is a significance statistic and should be followed with a strength statistic. Cramer's V is the most common strength statistic used when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to the distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two-group and multiple-group studies. Limitations include its sample size requirements, difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and the tendency of Cramer's V to produce relatively low correlation measures, even for highly significant results.
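A minimal sketch of the two statistics discussed, for an r x c contingency table of counts (the p-value lookup against the chi-square distribution is omitted):

```python
import numpy as np

def chi_square_independence(table):
    """Pearson chi-square test of independence on an r x c contingency table.
    Returns the statistic and its degrees of freedom."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    # expected counts under independence: (row total * column total) / n
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    df = (table.shape[0] - 1) * (table.shape[1] - 1)
    return chi2, df

def cramers_v(table):
    """Cramer's V 'strength' (effect-size) statistic for a contingency table."""
    table = np.asarray(table, dtype=float)
    chi2, _ = chi_square_independence(table)
    n = table.sum()
    k = min(table.shape) - 1
    return np.sqrt(chi2 / (n * k))

# 2x2 example with hypothetical counts: group membership vs. outcome
table = [[30, 10], [20, 40]]
chi2, df = chi_square_independence(table)
v = cramers_v(table)
```

For this hypothetical table the statistic is 16.67 on 1 degree of freedom (significant at any conventional level) with V of about 0.41, illustrating the abstract's point that the follow-up strength statistic can be modest even when the chi-square is highly significant.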
Evaluation of PCR Systems for Field Screening of Bacillus anthracis
Ozanich, Richard M.; Colburn, Heather A.; Victry, Kristin D.; Bartholomew, Rachel A.; Arce, Jennifer S.; Heredia-Langner, Alejandro; Jarman, Kristin; Kreuzer, Helen W.
2017-01-01
There is little published data on the performance of hand-portable polymerase chain reaction (PCR) systems that can be used by first responders to determine if a suspicious powder contains a potential biothreat agent. We evaluated 5 commercially available hand-portable PCR instruments for detection of Bacillus anthracis. We used a cost-effective, statistically based test plan to evaluate systems at performance levels ranging from 0.85-0.95 lower confidence bound (LCB) of the probability of detection (POD) at confidence levels of 80% to 95%. We assessed specificity using purified genomic DNA from 13 B. anthracis strains and 18 Bacillus near neighbors, potential interference with 22 suspicious powders that are commonly encountered in the field by first responders during suspected biothreat incidents, and the potential for PCR inhibition when B. anthracis spores were spiked into these powders. Our results indicate that 3 of the 5 systems achieved 0.95 LCB of the probability of detection with 95% confidence levels at test concentrations of 2,000 genome equivalents/mL (GE/mL), which is comparable to 2,000 spores/mL. This is more than sufficient sensitivity for screening visible suspicious powders. These systems exhibited no false-positive results or PCR inhibition with common suspicious powders and reliably detected B. anthracis spores spiked into these powders, though some issues with assay controls were observed. Our testing approach enables efficient performance testing using a statistically rigorous and cost-effective test plan to generate performance data that allow users to make informed decisions regarding the purchase and use of field biodetection equipment. PMID:28192050
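The kind of statistically based test plan mentioned above, demonstrating a lower confidence bound (LCB) on the probability of detection (POD), can be sized with the standard zero-failure binomial formula. This sketch shows that textbook calculation, not the authors' exact (more cost-effective) plan:

```python
import math

def trials_needed(pod_lcb, confidence):
    """Smallest n such that n detections in n trials demonstrates
    POD >= pod_lcb at the given confidence (zero-failure binomial plan):
    the smallest n with pod_lcb**n <= 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(pod_lcb))

def lcb_all_successes(n, confidence):
    """Exact (Clopper-Pearson) lower confidence bound on POD when all n
    trials succeed: (1 - confidence)**(1/n)."""
    return (1.0 - confidence) ** (1.0 / n)

# trials required to demonstrate a 0.95 LCB of POD at 95% confidence
n = trials_needed(0.95, 0.95)
```

By this classic calculation, 59 consecutive detections with no failures demonstrate a 0.95 LCB of POD at 95% confidence; the less stringent performance levels mentioned in the abstract (e.g., 0.85 LCB at 80% confidence) require correspondingly fewer trials.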
A statistical model for predicting muscle performance
NASA Astrophysics Data System (ADS)
Byerly, Diane Leslie De Caix
The objective of these studies was to develop a capability for predicting muscle performance and fatigue to be utilized for both space- and ground-based applications. To develop this predictive model, healthy test subjects performed a defined, repetitive dynamic exercise to failure using a Lordex spinal machine. Throughout the exercise, surface electromyography (SEMG) data were collected from the erector spinae using a Mega Electronics ME3000 muscle tester and surface electrodes placed on both sides of the back muscle. These data were analyzed using a 5th-order autoregressive (AR) model and statistical regression analysis. It was determined that an AR-derived parameter, the mean average magnitude of the AR poles, significantly correlated with the maximum number of repetitions (designated Rmax) that a test subject was able to perform. Using the mean average magnitude of the AR poles, a test subject's performance to failure could be predicted as early as the sixth repetition of the exercise. This predictive model has the potential to provide a basis for improving post-space flight recovery, monitoring muscle atrophy in astronauts and assessing the effectiveness of countermeasures, monitoring astronaut performance and fatigue during Extravehicular Activity (EVA) operations, providing pre-flight assessment of the ability of an EVA crewmember to perform a given task, improving the design of training protocols and simulations for strenuous International Space Station assembly EVA, and enabling EVA work task sequences to be planned in ways that enhance astronaut performance and safety.
Potential ground-based, medical applications of the predictive model include monitoring muscle deterioration and performance resulting from illness, establishing safety guidelines in the industry for repetitive tasks, monitoring the stages of rehabilitation for muscle-related injuries sustained in sports and accidents, and enhancing athletic performance through improved training protocols while reducing injury.
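A rough sketch of the AR-pole analysis described in the abstract, assuming a Yule-Walker fit to a demeaned signal window; the original study's SEMG preprocessing and the regression of pole magnitude against Rmax are not reproduced:

```python
import numpy as np

def ar_pole_magnitudes(signal, order=5):
    """Fit an autoregressive model of the given order via the Yule-Walker
    equations and return the magnitudes of its poles.  The study summarized
    above used AR(5) fits to SEMG windows and tracked the mean pole magnitude."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    # biased autocovariance estimates at lags 0..order
    r = np.array([np.dot(x[: len(x) - k], x[k:]) / len(x)
                  for k in range(order + 1)])
    # solve the Yule-Walker system R a = [r1, ..., rp]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    # poles are roots of z^p - a1 z^(p-1) - ... - ap
    poles = np.roots(np.concatenate(([1.0], -a)))
    return np.abs(poles)

# Simulated check: a stable AR(2) process should yield poles inside the unit circle.
rng = np.random.default_rng(1)
e = rng.standard_normal(5000)
x = np.zeros(5000)
for t in range(2, 5000):
    x[t] = 0.6 * x[t - 1] - 0.3 * x[t - 2] + e[t]
mags = ar_pole_magnitudes(x, order=2)
```

For this simulated process the true poles have magnitude sqrt(0.3), about 0.55, and the estimates land close to that; tracking the mean pole magnitude over successive exercise repetitions is the kind of summary the model above relies on.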
Maulik, Ujjwal; Mallik, Saurav; Mukhopadhyay, Anirban; Bandyopadhyay, Sanghamitra
2015-01-01
Microarray and beadchip are two of the most efficient techniques for measuring gene expression and methylation data in bioinformatics. Biclustering deals with the simultaneous clustering of genes and samples. In this article, we propose a computational rule mining framework, StatBicRM (i.e., statistical biclustering-based rule mining), to identify special types of rules and potential biomarkers using integrated approaches of statistical and binary inclusion-maximal biclustering techniques on biological datasets. At first, a novel statistical strategy is utilized to eliminate insignificant/low-significant/redundant genes in such a way that the significance level satisfies the data distribution property (viz., either normal distribution or non-normal distribution). The data are then discretized and post-discretized, consecutively. Thereafter, the biclustering technique is applied to identify maximal frequent closed homogeneous itemsets. Corresponding special types of rules are then extracted from the selected itemsets. Our proposed rule mining method performs better than other rule mining algorithms as it generates maximal frequent closed homogeneous itemsets instead of frequent itemsets. Thus, it saves elapsed time and can work on big datasets. Pathway and Gene Ontology analyses are conducted on the genes of the evolved rules using the DAVID database. Frequency analysis of the genes appearing in the evolved rules is performed to determine potential biomarkers. Furthermore, we also classify the data to assess how accurately the evolved rules describe the remaining test (unknown) data. Subsequently, we also compare the average classification accuracy, and other related factors, with those of other rule-based classifiers. Statistical significance tests are also performed to verify the statistical relevance of the comparative results.
Here, each of the other rule mining methods or rule-based classifiers is also starting with the same post-discretized data-matrix. Finally, we have also included the integrated analysis of gene expression and methylation for determining epigenetic effect (viz., effect of methylation) on gene expression level. PMID:25830807
A knowledge-based T2-statistic to perform pathway analysis for quantitative proteomic data.
Lai, En-Yu; Chen, Yi-Hau; Wu, Kun-Pin
2017-06-01
Approaches to identify significant pathways from high-throughput quantitative data have been developed in recent years. Still, the analysis of proteomic data remains difficult because of limited sample size. This limitation also leads to the common practice of using a competitive null, which fundamentally treats genes or proteins as independent units. The independence assumption ignores the associations among biomolecules with similar functions or cellular localization, as well as the interactions among them manifested as changes in expression ratios. Consequently, these methods often underestimate the associations among biomolecules and cause false positives in practice. Some studies incorporate the sample covariance matrix into the calculation to address this issue. However, the sample covariance may not be a precise estimate if the sample size is very limited, which is usually the case for data produced by mass spectrometry. In this study, we introduce a multivariate test under a self-contained null to perform pathway analysis for quantitative proteomic data. The covariance matrix used in the test statistic is constructed from the confidence scores retrieved from the STRING database or the HitPredict database. We also design an integrating procedure to retain pathways of sufficient evidence as a pathway group. The performance of the proposed T2-statistic is demonstrated using five published experimental datasets: the T-cell activation, the cAMP/PKA signaling, the myoblast differentiation, and the effect of dasatinib on the BCR-ABL pathway are proteomic datasets produced by mass spectrometry; the protective effect of myocilin via the MAPK signaling pathway is a gene expression dataset of limited sample size. Compared with other popular statistics, the proposed T2-statistic yields more accurate descriptions, in agreement with the discussion of the original publications.
We implemented the T2-statistic into an R package T2GA, which is available at https://github.com/roqe/T2GA.
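A T2-type statistic for a self-contained null (mean expression change equal to zero) can be sketched as below. The construction of the knowledge-based covariance from STRING/HitPredict confidence scores is specific to the paper, so here any positive-definite matrix stands in for it:

```python
import numpy as np

def pathway_t2(X, sigma):
    """Multivariate T2-type statistic for a pathway under a self-contained
    null (mean change = 0).

    X: samples x proteins matrix of quantitative changes (e.g., log-ratios).
    sigma: a covariance/association matrix for the proteins; in the paper it
    is knowledge-based (built from database confidence scores), but any
    positive-definite matrix works for this sketch.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    m = X.mean(axis=0)
    # n * m' sigma^{-1} m, computed without forming the explicit inverse
    return float(n * m @ np.linalg.solve(sigma, m))
```

Replacing the poorly estimated sample covariance with an externally specified matrix is the key design choice here: it keeps the quadratic form well defined even when the number of samples is far smaller than the number of proteins.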
Statistical analysis and digital processing of the Mössbauer spectra
NASA Astrophysics Data System (ADS)
Prochazka, Roman; Tucek, Pavel; Tucek, Jiri; Marek, Jaroslav; Mashlan, Miroslav; Pechousek, Jiri
2010-02-01
This work is focused on the use of statistical methods and the development of filtration procedures for signal processing in Mössbauer spectroscopy. Statistical tools for noise filtering in measured spectra are used in many scientific areas. The use of a purely statistical approach to the filtration of accumulated Mössbauer spectra is described. In Mössbauer spectroscopy, the noise can be considered a Poisson statistical process with a Gaussian distribution for high numbers of observations. This noise is a superposition of non-resonant photon counting, electronic noise (from γ-ray detection and discrimination units), and effects of velocity-system quality, which can be characterized by velocity nonlinearities. The possibility of a noise-reducing process using a newly designed statistical filter procedure is described. This mathematical procedure improves the signal-to-noise ratio and thus makes it easier to determine the hyperfine parameters of the given Mössbauer spectra. The filter procedure is based on a periodogram method that makes it possible to identify the statistically important components in the spectral domain. The significance level for these components is then feedback-controlled using the correlation coefficient test results. An estimate of the theoretical correlation coefficient level corresponding to the spectrum resolution is performed. The correlation coefficient test is based on a comparison of the theoretical and experimental correlation coefficients given by the Spearman method. The correctness of this solution was analyzed by a series of statistical tests and confirmed by many spectra measured with increasing statistical quality for a given sample (absorber). The effect of this filter procedure depends on the signal-to-noise ratio, and the applicability of the method has binding conditions.
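A simplified sketch of periodogram-based filtering in this spirit: transform the accumulated signal to the frequency domain, keep only the dominant components, and transform back. The significance rule below (keep a fixed fraction of the largest periodogram ordinates) is a stand-in for the paper's feedback-controlled Spearman correlation test:

```python
import numpy as np

def periodogram_filter(spectrum, keep_fraction=0.05):
    """Noise-filtering sketch: retain only the spectral components with the
    largest periodogram ordinates and zero out the rest.  keep_fraction is a
    simplification of the statistically controlled significance level used
    in the procedure described above."""
    y = np.asarray(spectrum, dtype=float)
    F = np.fft.rfft(y)
    power = np.abs(F) ** 2          # periodogram ordinates
    k = max(1, int(np.ceil(keep_fraction * len(F))))
    cutoff = np.sort(power)[-k]     # k-th largest ordinate
    F_filtered = np.where(power >= cutoff, F, 0.0)
    return np.fft.irfft(F_filtered, n=len(y))
```

Because most of the noise power is spread across the discarded components while the resonance structure is concentrated in a few dominant ones, the filtered signal sits much closer to the underlying signal, improving the effective signal-to-noise ratio.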
Mathysen, Danny G P; Aclimandos, Wagih; Roelant, Ella; Wouters, Kristien; Creuzot-Garcher, Catherine; Ringens, Peter J; Hawlina, Marko; Tassignon, Marie-José
2013-11-01
To investigate whether the introduction of item-response theory (IRT) analysis, in parallel with the 'traditional' statistical analysis methods available for performance evaluation of multiple T/F items as used in the European Board of Ophthalmology Diploma (EBOD) examination, has proved beneficial, and secondly, to study whether the overall assessment performance of the current written part of EBOD is sufficiently high (KR-20 ≥ 0.90) for this examination format to be kept in future EBOD editions. 'Traditional' analysis methods for individual MCQ item performance comprise P-statistics, Rit-statistics and item discrimination, while overall reliability is evaluated through KR-20 for multiple T/F items. The additional set of statistical analysis methods for the evaluation of EBOD comprises mainly IRT analysis. These analysis techniques are used to monitor whether the introduction of negative marking for incorrect answers (since EBOD 2010) has had a positive influence on the statistical performance of EBOD as a whole and of its individual test items in particular. Item-response theory analysis demonstrated that item performance parameters should not be evaluated individually but should be related to one another. Before the introduction of negative marking, the overall EBOD reliability (KR-20) was good, though with room for improvement (EBOD 2008: 0.81; EBOD 2009: 0.78). After the introduction of negative marking, the overall reliability of EBOD improved significantly (EBOD 2010: 0.92; EBOD 2011: 0.91; EBOD 2012: 0.91). Although many statistical performance parameters are available to evaluate individual items, our study demonstrates that the overall reliability assessment remains the crucial parameter allowing comparison. While individual item performance analysis is worthwhile to undertake as a secondary analysis, drawing final conclusions from it seems to be more difficult. Performance parameters need to be related, as shown by IRT analysis.
Therefore, IRT analysis has proved beneficial for the statistical analysis of EBOD. Introduction of negative marking has led to a significant increase in the reliability (KR-20 > 0.90), indicating that the current examination format can be kept for future EBOD examinations. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
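For reference, KR-20, the reliability coefficient quoted throughout the abstract above, is straightforward to compute from a matrix of dichotomous item scores; a minimal sketch:

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson Formula 20 reliability for dichotomous (0/1) items.

    responses: examinees x items matrix of 0/1 scores.
    KR-20 = k/(k-1) * (1 - sum(p_i * q_i) / var(total score)).
    """
    x = np.asarray(responses, dtype=float)
    k = x.shape[1]
    p = x.mean(axis=0)                 # proportion answering each item correctly
    q = 1.0 - p
    total_var = x.sum(axis=1).var()    # population variance of total scores
    return (k / (k - 1)) * (1.0 - (p * q).sum() / total_var)
```

Perfectly consistent response patterns give KR-20 of 1; the 0.90 threshold used as the abstract's criterion corresponds to the conventional standard for high-stakes examinations.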
A maximally selected test of symmetry about zero.
Laska, Eugene; Meisner, Morris; Wanderling, Joseph
2012-11-20
The problem of testing symmetry about zero has a long and rich history in the statistical literature. We introduce a new test that sequentially discards observations whose absolute value is below increasing thresholds defined by the data. McNemar's statistic is obtained at each threshold and the largest is used as the test statistic. We obtain the exact distribution of this maximally selected McNemar statistic and provide tables of critical values and a program for computing p-values. Power is compared with the t-test, the Wilcoxon Signed Rank Test and the Sign Test. The new test, MM, is slightly less powerful than the t-test and Wilcoxon Signed Rank Test for symmetric normal distributions with nonzero medians, and substantially more powerful than all three tests for asymmetric mixtures of normal random variables with or without zero medians. The motivation for this test derives from the need to appraise the safety profile of new medications. If pre and post safety measures are obtained, then under the null hypothesis the variables are exchangeable and the distribution of their difference is symmetric about a zero median. Large pre-post differences are the major concern of a safety assessment. The discarded small observations are not particularly relevant to safety and can reduce power to detect important asymmetry. The new test was utilized on data from an on-road driving study performed to determine whether a hypnotic, a drug used to promote sleep, has next-day residual effects. Copyright © 2012 John Wiley & Sons, Ltd.
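The test statistic described can be sketched in a few lines: at each data-driven threshold, discard observations with smaller absolute value, form McNemar's statistic from the signs of what remains, and take the maximum. (The exact null distribution and critical-value tables in the paper are not reproduced here.)

```python
import numpy as np

def max_mcnemar(x):
    """Maximally selected McNemar statistic for testing symmetry about zero.

    At each threshold defined by the sorted absolute values, observations
    below the threshold are discarded and McNemar's statistic
    (n_pos - n_neg)^2 / (n_pos + n_neg) is computed from the signs of the
    remaining observations; the largest value is returned.
    """
    x = np.asarray(x, dtype=float)
    x = x[x != 0]                          # zeros carry no sign information
    thresholds = np.sort(np.abs(x))
    best = 0.0
    for t in thresholds:
        kept = x[np.abs(x) >= t]
        n_pos = (kept > 0).sum()
        n_neg = (kept < 0).sum()
        if n_pos + n_neg > 0:
            best = max(best, (n_pos - n_neg) ** 2 / (n_pos + n_neg))
    return float(best)
```

On perfectly mirrored data the statistic is zero at every threshold, while a few large one-sided differences (the safety signal of interest) drive the maximum up even when small observations are balanced, which is exactly the behavior motivating the test.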
A Statistical Representation of Pyrotechnic Igniter Output
NASA Astrophysics Data System (ADS)
Guo, Shuyue; Cooper, Marcia
2017-06-01
The output of simplified pyrotechnic igniters for research investigations is statistically characterized by monitoring the post-ignition external flow field with Schlieren imaging. Unique to this work is a detailed quantification of all measurable manufacturing parameters (e.g., bridgewire length, charge cavity dimensions, powder bed density) and the associated shock-motion variability in the tested igniters. To demonstrate the experimental precision of the recorded Schlieren images and the developed image processing methodologies, commercial exploding bridgewires using wires of different parameters were tested. Finally, a statistically significant population of manufactured igniters was tested within the Schlieren arrangement, resulting in a characterization of the nominal output. Comparisons between the variances measured throughout the manufacturing processes and the calculated output variance provide insight into the critical device phenomena that dominate performance. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's NNSA under contract DE-AC04-94AL85000.
NASA Astrophysics Data System (ADS)
Verschuur, Gerrit L.
2014-06-01
The archive of IRIS, PLANCK and WMAP data available at the IRSA website of IPAC allows the apparent associations between galactic neutral hydrogen (HI) features and small-scale structure in WMAP and PLANCK data to be closely examined. In addition, new HI observations made with the Green Bank Telescope are used to perform a statistical test of putative associations. It is concluded that attention should be paid to the possibility that some of the small-scale structure found in WMAP and PLANCK data harbors the signature of a previously unrecognized source of high-frequency continuum emission in the Galaxy.
Re-Analysis Report: Daylighting in Schools, Additional Analysis. Tasks 2.2.1 through 2.2.5.
ERIC Educational Resources Information Center
Heschong, Lisa; Elzeyadi, Ihab; Knecht, Carey
This study expands and validates previous research that found a statistical correlation between the amount of daylight in elementary school classrooms and the performance of students on standardized math and reading tests. The researchers reanalyzed the 1997-1998 school year student performance data from the Capistrano Unified School District…
Pearson's chi-square test and rank correlation inferences for clustered data.
Shih, Joanna H; Fay, Michael P
2017-09-01
Pearson's chi-square test has been widely used in testing for association between two categorical responses. Spearman rank correlation and Kendall's tau are often used for measuring and testing association between two continuous or ordered categorical responses. However, the established statistical properties of these tests are only valid when the pairs of responses are independent, that is, when each sampling unit has only one pair of responses. When each sampling unit consists of a cluster of paired responses, the assumption of independent pairs is violated. In this article, we apply the within-cluster resampling technique to U-statistics to form new tests and rank-based correlation estimators for possibly tied clustered data. We develop large-sample properties of the newly proposed tests and estimators and evaluate their performance by simulations. The proposed methods are applied to a data set collected from a PET/CT imaging study for illustration. Published 2017. This article is a U.S. Government work and is in the public domain in the USA.
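The within-cluster resampling idea described above can be sketched as follows: repeatedly draw one pair at random from each cluster, compute an ordinary Pearson chi-square on the resulting independent pairs, and average the statistic across resamples. This is a simplified illustration of the general technique under assumed inputs, not the authors' U-statistic-based tests; the function names are hypothetical.

```python
import random
from collections import Counter

def pearson_chi2(pairs):
    """Pearson chi-square statistic for a list of (row, col) category pairs."""
    n = len(pairs)
    row_tot = Counter(r for r, _ in pairs)
    col_tot = Counter(c for _, c in pairs)
    cell = Counter(pairs)
    stat = 0.0
    for r in row_tot:
        for c in col_tot:
            expected = row_tot[r] * col_tot[c] / n
            stat += (cell[(r, c)] - expected) ** 2 / expected
    return stat

def wcr_chi2(clusters, resamples=200, seed=0):
    """Within-cluster resampling sketch: average the chi-square statistic over
    resamples that keep one randomly chosen pair per cluster, restoring the
    independent-pairs assumption within each resample."""
    rng = random.Random(seed)
    stats = [pearson_chi2([rng.choice(cluster) for cluster in clusters])
             for _ in range(resamples)]
    return sum(stats) / len(stats)
```

A resampled statistic like this is only the building block; turning it into a formal test still requires the variance adjustments developed in the paper.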
Vecchiato, G; De Vico Fallani, F; Astolfi, L; Toppi, J; Cincotti, F; Mattia, D; Salinari, S; Babiloni, F
2010-08-30
This paper presents some considerations about the use of adequate statistical techniques in the framework of neuroelectromagnetic brain mapping. With advanced EEG/MEG recording setups involving hundreds of sensors, the issue of protection against the Type I errors that can occur during the execution of hundreds of univariate statistical tests has gained interest. In the present experiment, we investigated the EEG signals from a mannequin acting as an experimental subject. Data were collected while performing a neuromarketing experiment and analyzed with state-of-the-art computational tools adopted in the specialized literature. Results showed that electric data from the mannequin's head present statistically significant differences in power spectra during the visualization of a commercial advertisement compared to the power spectra gathered during a documentary, when no adjustments were made to the alpha level of the multiple univariate tests performed. The use of the Bonferroni or Bonferroni-Holm adjustments correctly returned no differences between the signals gathered from the mannequin in the two experimental conditions. A partial sample of recently published literature in different neuroscience journals suggested that at least 30% of the papers do not use statistical protection against Type I errors. While the occurrence of Type I errors can easily be managed with appropriate statistical techniques, the use of such techniques is still not widely adopted in the literature. Copyright (c) 2010 Elsevier B.V. All rights reserved.
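The Bonferroni-Holm adjustment mentioned above is a step-down procedure that controls the family-wise error rate while being uniformly more powerful than the plain Bonferroni correction. A minimal sketch (the function name is illustrative):

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Reject/accept decision for each p-value under Holm's step-down
    procedure, which controls the family-wise Type I error rate at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        # Compare the (rank+1)-th smallest p-value against alpha / (m - rank).
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject
```

For example, with p-values [0.001, 0.04, 0.03] at alpha = 0.05, only the first hypothesis is rejected: 0.001 passes the 0.05/3 threshold, but 0.03 fails 0.05/2 and the procedure stops.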
Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong
2017-03-01
Two powerful methods for statistical inference on MRI brain images have been proposed recently: a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory, and threshold-free cluster enhancement (TFCE), which calculates the level of local support for a cluster and then uses permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended for group-level fMRI analysis over other statistical methods. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom and signal-to-noise ratios. Our results suggest that both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data, and both are superior to current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing and by also being applicable for single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data. Hum Brain Mapp 38:1269-1280, 2017. © 2016 Wiley Periodicals, Inc.
Correcting evaluation bias of relational classifiers with network cross validation
Neville, Jennifer; Gallagher, Brian; Eliassi-Rad, Tina; ...
2011-01-04
Recently, a number of modeling techniques have been developed for data mining and machine learning in relational and network domains where the instances are not independent and identically distributed (i.i.d.). These methods specifically exploit the statistical dependencies among instances in order to improve classification accuracy. However, there has been little focus on how these same dependencies affect our ability to draw accurate conclusions about the performance of the models. More specifically, the complex link structure and attribute dependencies in relational data violate the assumptions of many conventional statistical tests and make it difficult to use these tests to assess the models in an unbiased manner. In this work, we examine the task of within-network classification and the question of whether two algorithms will learn models that result in significantly different levels of performance. We show that the commonly used form of evaluation (paired t-test on overlapping network samples) can result in an unacceptable level of Type I error. Furthermore, we show that Type I error increases as (1) the correlation among instances increases and (2) the size of the evaluation set increases (i.e., the proportion of labeled nodes in the network decreases). Lastly, we propose a method for network cross-validation that, combined with paired t-tests, produces more acceptable levels of Type I error while still providing reasonable levels of statistical power (i.e., 1 - Type II error).
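The core point, that positive correlation among evaluation instances inflates the Type I error of a paired t-test, can be illustrated with a small simulation. This sketch uses equicorrelated Gaussian score differences as a stand-in for the paper's network-induced dependence (an assumption, not the authors' setup); the threshold 2.045 approximates the two-sided 5% critical value of a t distribution with 29 degrees of freedom.

```python
import math
import random

def paired_t(diffs):
    """Paired t statistic for a list of per-sample score differences."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)

def type1_rate(rho, n=30, trials=2000, crit=2.045, seed=1):
    """Empirical rejection rate under the null hypothesis when the score
    differences share pairwise correlation rho (rho = 0 is the i.i.d. case)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        shared = rng.gauss(0, 1)  # common component inducing the correlation
        diffs = [math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0, 1)
                 for _ in range(n)]
        rejections += abs(paired_t(diffs)) > crit
    return rejections / trials
```

With rho = 0 the rejection rate stays near the nominal 5%, while even moderate correlation drives it far higher, matching the paper's qualitative finding.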
Adaptive statistical pattern classifiers for remotely sensed data
NASA Technical Reports Server (NTRS)
Gonzalez, R. C.; Pace, M. O.; Raulston, H. S.
1975-01-01
A technique for the adaptive estimation of nonstationary statistics necessary for Bayesian classification is developed. The basic approach to the adaptive estimation procedure consists of two steps: (1) an optimal stochastic approximation of the parameters of interest and (2) a projection of the parameters in time or position. A divergence criterion is developed to monitor algorithm performance. Comparative results of adaptive and nonadaptive classifier tests are presented for simulated four-dimensional spectral scan data.
Grigoriadis, Themos; Giannoulis, George; Zacharakis, Dimitris; Protopapas, Athanasios; Cardozo, Linda; Athanasiou, Stavros
2016-03-01
The purpose of the study was to examine whether a test performed during urodynamics, the "1-3-5 cough test", could determine the severity of urodynamic stress incontinence (USI). We included women referred for urodynamics who were diagnosed with USI. The "1-3-5 cough test" was performed to grade the severity of USI at the completion of filling cystometry. A diagnosis of "severe", "moderate" or "mild" USI was given if urine leakage was observed after one, three or five consecutive coughs respectively. We examined the associations between grades of USI severity and measures of subjective perception of stress urinary incontinence (SUI): International Consultation of Incontinence Modular Questionnaire-Female Lower Urinary Tract Symptom (ICIQ-FLUTS), King's Health Questionnaire (KHQ), Urinary Distress Inventory-6 (UDI-6), Urinary Impact Questionnaire-7 (UIQ-7). A total of 1,181 patients completed the ICIQ-FLUTS and KHQ and 612 completed the UDI-6 and UIQ-7 questionnaires. There was a statistically significant association of higher grades of USI severity with higher scores on the incontinence domain of the ICIQ-FLUTS. The scores of the UDI-6, UIQ-7 and of all KHQ domains (with the exception of general health perception and personal relationships) had significantly larger mean values for higher USI severity grades. Groups of higher USI severity were significantly associated with higher scores on most of the subjective measures of SUI. Severity of USI, as defined by the "1-3-5 cough test", was associated with the severity of subjective measures of SUI. This test may be a useful tool for the objective assessment of patients with SUI who undergo urodynamics.
NASA Astrophysics Data System (ADS)
Rounds, S. A.; Sullivan, A. B.
2004-12-01
Assessing a model's ability to reproduce field data is a critical step in the modeling process. For any model, some method of determining goodness-of-fit to measured data is needed to aid in calibration and to evaluate model performance. Visualizations and graphical comparisons of model output are an excellent way to begin that assessment. At some point, however, model performance must be quantified. Goodness-of-fit statistics, including the mean error (ME), mean absolute error (MAE), root mean square error, and coefficient of determination, typically are used to measure model accuracy. Statistical tools such as the sign test or Wilcoxon test can be used to test for model bias. The runs test can detect phase errors in simulated time series. Each statistic is useful, but each has its limitations. None provides a complete quantification of model accuracy. In this study, a suite of goodness-of-fit statistics was applied to a model of Henry Hagg Lake in northwest Oregon. Hagg Lake is a man-made reservoir on Scoggins Creek, a tributary to the Tualatin River. Located on the west side of the Portland metropolitan area, the Tualatin Basin is home to more than 450,000 people. Stored water in Hagg Lake helps to meet the agricultural and municipal water needs of that population. Future water demands have caused water managers to plan for a potential expansion of Hagg Lake, doubling its storage to roughly 115,000 acre-feet. A model of the lake was constructed to evaluate the lake's water quality and estimate how that quality might change after raising the dam. The laterally averaged, two-dimensional, U.S. Army Corps of Engineers model CE-QUAL-W2 was used to construct the Hagg Lake model. The model was calibrated for the years 2000 and 2001 and confirmed with data from 2002 and 2003; modeled parameters included water temperature, ammonia, nitrate, phosphorus, algae, zooplankton, and dissolved oxygen. Several goodness-of-fit statistics were used to quantify model accuracy and bias.
Model performance was judged to be excellent for water temperature (annual ME: -0.22 to 0.05 °C; annual MAE: 0.62 to 0.68 °C) and dissolved oxygen (annual ME: -0.28 to 0.18 mg/L; annual MAE: 0.43 to 0.92 mg/L), showing that the model is sufficiently accurate for future water resources planning and management.
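The scalar statistics named above are straightforward to compute. A minimal sketch, with the coefficient of determination in its 1 − SSres/SStot form (one of several conventions in use):

```python
import math

def fit_statistics(observed, modeled):
    """Basic goodness-of-fit statistics for paired field/model data."""
    errors = [m - o for o, m in zip(observed, modeled)]
    n = len(errors)
    me = sum(errors) / n                              # mean error (bias)
    mae = sum(abs(e) for e in errors) / n             # mean absolute error
    rmse = math.sqrt(sum(e * e for e in errors) / n)  # root mean square error
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1 - ss_res / ss_tot                          # coefficient of determination
    return me, mae, rmse, r2
```

ME detects systematic bias (its sign shows over- vs. under-prediction), while MAE and RMSE measure overall accuracy; RMSE weights large errors more heavily.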
Kivlan, Benjamin R; Carcia, Christopher R; Christoforetti, John J; Martin, RobRoy L
2016-08-01
Dancers commonly experience anterior hip pain caused by femoroacetabular impingement (FAI) that interrupts training and performance in dance. A paucity of literature exists to guide appropriate evaluation and management of FAI among dancers. The purpose of this study was to determine if dancers with clinical signs of FAI have differences in hip range of motion, strength, and hop test performance compared to healthy dancers. Quasi-experimental, cohort comparison. Fifteen dancers aged 18 to 21 years with clinical signs of FAI that included anterior hip pain and provocative impingement tests were compared to 13 age-matched dancers for passive hip joint range of motion, isometric hip strength, and performance of the medial triple hop, lateral triple hop, and cross-over hop tests. No statistically significant differences in range of motion were noted for flexion (Healthy = 145° ± 7°; FAI = 147° ± 10°; p=0.59), internal rotation (Healthy = 63° ± 7°; FAI = 61° ± 11°; p=0.50), and external rotation (Healthy = 37° ± 9°; FAI = 34° ± 12°; p=0.68) between the two groups. Hip extension strength was significantly less in the dancers with FAI (224 ± 55 Newtons) compared to the healthy group (293 ± 58 Newtons; F(1,26) = 10.2; p=0.004). No statistically significant differences were noted for flexion, internal rotation, external rotation, abduction, or adduction isometric strength. The medial triple hop distance was significantly less in the FAI group (354 ± 43 cm) compared to the healthy group (410 ± 50 cm; F(1,26) = 10.3; p = 0.004). Similar results were observed for the lateral hop test, as the FAI group (294 ± 38 cm) performed worse than the healthy controls (344 ± 54 cm; F(1,26) = 7.8; p = 0.01). There was no statistically significant difference between the FAI group (2.7 ± 0.92 seconds) and the healthy group (2.5 ± 0.75 seconds) on the crossover hop test.
Dancers with FAI have less hip extensor strength and perform worse on the medial and lateral triple hop tests compared to healthy dancers. Clinicians may use this information to assist in screening of dancers with complaints of hip pain and to measure their progress for return to dance. Level of evidence: 3B, non-consecutive cohort study.
Simple prognostic model for patients with advanced cancer based on performance status.
Jang, Raymond W; Caraiscos, Valerie B; Swami, Nadia; Banerjee, Subrata; Mak, Ernie; Kaya, Ebru; Rodin, Gary; Bryson, John; Ridley, Julia Z; Le, Lisa W; Zimmermann, Camilla
2014-09-01
Providing survival estimates is important for decision making in oncology care. The purpose of this study was to provide survival estimates for outpatients with advanced cancer, using the Eastern Cooperative Oncology Group (ECOG), Palliative Performance Scale (PPS), and Karnofsky Performance Status (KPS) scales, and to compare their ability to predict survival. ECOG, PPS, and KPS were completed by physicians for each new patient attending the Princess Margaret Cancer Centre outpatient Oncology Palliative Care Clinic (OPCC) from April 2007 to February 2010. Survival analysis was performed using the Kaplan-Meier method. The log-rank test for trend was employed to test for differences in survival curves for each level of performance status (PS), and the concordance index (C-statistic) was used to test the predictive discriminatory ability of each PS measure. Measures were completed for 1,655 patients. PS delineated survival well for all three scales according to the log-rank test for trend (P < .001). Survival was approximately halved for each worsening performance level. Median survival times, in days, for each ECOG level were: ECOG 0, 293; ECOG 1, 197; ECOG 2, 104; ECOG 3, 55; and ECOG 4, 25.5. Median survival times, in days, for PPS (and KPS) were: PPS/KPS 80 to 100, 221 (215); PPS/KPS 60 to 70, 115 (119); PPS/KPS 40 to 50, 51 (49); PPS/KPS 10 to 30, 22 (29). The C-statistic was similar for all three scales and ranged from 0.63 to 0.64. We present a simple tool that uses PS alone to prognosticate in advanced cancer, and has similar discriminatory ability to more complex models. Copyright © 2014 by American Society of Clinical Oncology.
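The C-statistic referenced above measures how often a better performance-status score pairs with longer survival. A simplified sketch under the assumption of uncensored times and a higher-is-better score (the KPS/PPS convention; ECOG is reversed). A real survival C-statistic, such as Harrell's C, must also handle censored pairs, which is omitted here:

```python
def concordance_index(scores, survival_times):
    """Fraction of comparable patient pairs in which the patient with the
    better (higher) performance-status score survived longer.
    Simplified sketch: assumes uncensored survival times."""
    concordant = comparable = 0
    n = len(scores)
    for i in range(n):
        for j in range(i + 1, n):
            if scores[i] == scores[j] or survival_times[i] == survival_times[j]:
                continue  # tied pairs are not comparable in this sketch
            comparable += 1
            if (scores[i] > scores[j]) == (survival_times[i] > survival_times[j]):
                concordant += 1
    return concordant / comparable
```

A value of 0.5 means no discrimination and 1.0 means perfect ranking, so the 0.63-0.64 reported above indicates modest but useful discriminatory ability.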
Rohrmeier, Martin A; Cross, Ian
2014-07-01
Humans rapidly learn complex structures in various domains. Findings of above-chance performance of some untrained control groups in artificial grammar learning studies raise questions about the extent to which learning can occur in an untrained, unsupervised testing situation with both correct and incorrect structures. The plausibility of unsupervised online-learning effects was modelled with n-gram, chunking and simple recurrent network models. A novel evaluation framework was applied, which alternates forced binary grammaticality judgments and subsequent learning of the same stimulus. Our results indicate a strong online learning effect for n-gram and chunking models and a weaker effect for simple recurrent network models. Such findings suggest that online learning is a plausible effect of statistical chunk learning that is possible when ungrammatical sequences contain a large proportion of grammatical chunks. Such common effects of continuous statistical learning may underlie statistical and implicit learning paradigms and raise implications for study design and testing methodologies. Copyright © 2014 Elsevier Inc. All rights reserved.
Novick, Steven; Shen, Yan; Yang, Harry; Peterson, John; LeBlond, Dave; Altan, Stan
2015-01-01
Dissolution (or in vitro release) studies constitute an important aspect of pharmaceutical drug development. One important use of such studies is for justifying a biowaiver for post-approval changes, which requires establishing equivalence between the new and old product. We propose a statistically rigorous modeling approach for this purpose based on the estimation of what we refer to as the F2 parameter, an extension of the commonly used f2 statistic. A Bayesian test procedure is proposed in relation to a set of composite hypotheses that capture the similarity requirement on the absolute mean differences between test and reference dissolution profiles. Several examples are provided to illustrate the application. Results of our simulation study comparing the performance of f2 and the proposed method show that our Bayesian approach is comparable to or in many cases superior to the f2 statistic as a decision rule. Further useful extensions of the method, such as the use of continuous-time dissolution modeling, are considered.
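For reference, the conventional f2 statistic that the proposed F2 parameter extends is a log-transformed mean squared difference between the two dissolution profiles. A minimal sketch:

```python
import math

def f2_similarity(reference, test):
    """f2 similarity factor between two dissolution profiles (percent
    dissolved at matched time points). By convention f2 >= 50 indicates
    similar profiles; identical profiles give f2 = 100."""
    n = len(reference)
    msd = sum((r - t) ** 2 for r, t in zip(reference, test)) / n  # mean squared difference
    return 50 * math.log10(100 / math.sqrt(1 + msd))
```

An average point-wise difference of 10 percentage points puts f2 just below the similarity cutoff of 50, which is what makes the statistic a convenient but crude decision rule compared with the model-based approach proposed here.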
OSPAR standard method and software for statistical analysis of beach litter data.
Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit
2017-09-15
The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
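The two trend components of the ensemble above can be sketched in a few lines. A full Mann-Kendall test would also compare S against its null variance (with tie corrections) to obtain a p-value, which this sketch omits:

```python
def mann_kendall_s(series):
    """Mann-Kendall S statistic: the sum of signs of all pairwise differences.
    Positive values indicate an upward trend, negative a downward trend."""
    s = 0
    n = len(series)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of the pairwise difference
    return s

def theil_sen_slope(series):
    """Theil-Sen estimator: the median of all pairwise slopes, a trend
    estimate that is robust to outliers in litter counts."""
    slopes = sorted((series[j] - series[i]) / (j - i)
                    for i in range(len(series) - 1)
                    for j in range(i + 1, len(series)))
    mid = len(slopes) // 2
    return slopes[mid] if len(slopes) % 2 else (slopes[mid - 1] + slopes[mid]) / 2
```

For a monotonically increasing series of length n, S reaches its maximum of n(n-1)/2, and the Theil-Sen slope equals the common step size.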
Menkes, Daniel L; Reed, Mary
2008-01-01
To determine the effectiveness of didactic case-based instruction in improving medical student comprehension of common neurological illnesses and neurological emergencies. Neurology department, academic university. 415 third- and fourth-year medical students performing a required four-week neurology clerkship. Raw test scores on a 1-hour, 50-item clinical-vignette-based examination and open-ended questions in a post-clerkship feedback session. There was a statistically significant improvement in overall test scores (p<0.001). Didactic teaching sessions have a significant positive impact on neurology clerkship students' test performance and on their perception of the educational experience. Confirmation of these results across multiple specialties in a multi-center trial is warranted.
NASA Astrophysics Data System (ADS)
Li, Jing; Singh, Chandralekha
2012-02-01
We discuss the development of a research-based conceptual multiple-choice survey of magnetism. We also discuss the use of the survey to investigate gender differences in students' difficulties with concepts related to magnetism. We find that, while there was no gender difference on the pre-test, female students performed significantly worse than male students when the survey was given as a post-test in traditionally taught calculus-based introductory physics courses, with similar results in both the regular and honors versions of the course. In the algebra-based courses, the performance of female and male students showed no statistical difference on either the pre-test or the post-test.
Benchmarking and performance analysis of the CM-2. [SIMD computer
NASA Technical Reports Server (NTRS)
Myers, David W.; Adams, George B., II
1988-01-01
A suite of benchmarking routines testing communication, basic arithmetic operations, and selected kernel algorithms written in LISP and PARIS was developed for the CM-2. Experiment runs are automated via a software framework that sequences individual tests, allowing for unattended overnight operation. Multiple measurements are made and treated statistically to generate well-characterized results from the noisy values given by cm:time. The results obtained provide a comparison with similar, but less extensive, testing done on a CM-1. Tests were chosen to aid the algorithmist in constructing fast, efficient, and correct code on the CM-2, as well as gain insight into what performance criteria are needed when evaluating parallel processing machines.
The impact of Lean bundles on hospital performance: does size matter?
Al-Hyari, Khalil; Abu Hammour, Sewar; Abu Zaid, Mohammad Khair Saleem; Haffar, Mohamed
2016-10-10
Purpose The purpose of this paper is to study the effect of the implementation of Lean bundles on hospital performance in private hospitals in Jordan and to evaluate how much the size of the organization can affect the relationship between Lean bundles implementation and hospital performance. Design/methodology/approach The research uses a quantitative method (descriptive and hypothesis testing). Three statistical techniques were adopted to analyse the data. Structural equation modeling and multi-group analysis were used to examine the research hypotheses and to perform the required statistical analysis of the survey data. Reliability analysis and confirmatory factor analysis were used to test construct validity, reliability and measurement loadings. Findings Lean bundles have been identified as an effective approach that can dramatically improve the organizational performance of private hospitals in Jordan. The main Lean bundles - just in time, human resource management, and total quality management - are applicable to large, small and medium hospitals without significant differences in the advantages that depend on size. Originality/value To the researchers' best knowledge, this is the first research that studies the impact of Lean bundles implementation in the healthcare sector in Jordan. This research also makes a significant contribution for decision makers in healthcare to increase their awareness of Lean bundles.
Impaired Statistical Learning in Developmental Dyslexia
Thiessen, Erik D.; Holt, Lori L.
2015-01-01
Purpose Developmental dyslexia (DD) is commonly thought to arise from phonological impairments. However, an emerging perspective is that a more general procedural learning deficit, not specific to phonological processing, may underlie DD. The current study examined if individuals with DD are capable of extracting statistical regularities across sequences of passively experienced speech and nonspeech sounds. Such statistical learning is believed to be domain-general, to draw upon procedural learning systems, and to relate to language outcomes. Method DD and control groups were familiarized with a continuous stream of syllables or sine-wave tones, the ordering of which was defined by high or low transitional probabilities across adjacent stimulus pairs. Participants subsequently judged two 3-stimulus test items with either high or low statistical coherence as being the most similar to the sounds heard during familiarization. Results As with control participants, the DD group was sensitive to the transitional probability structure of the familiarization materials as evidenced by above-chance performance. However, the performance of participants with DD was significantly poorer than controls across linguistic and nonlinguistic stimuli. In addition, reading-related measures were significantly correlated with statistical learning performance of both speech and nonspeech material. Conclusion Results are discussed in light of procedural learning impairments among participants with DD. PMID:25860795
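The transitional-probability structure used in such familiarization streams is simple to compute over adjacent stimulus pairs. A minimal sketch (the toy syllables in the usage example are illustrative, not the study's materials):

```python
from collections import Counter

def transitional_probabilities(stream):
    """Forward transitional probabilities over adjacent pairs of a syllable
    (or tone) stream: TP(a -> b) = count(ab) / count(a)."""
    pair_counts = Counter(zip(stream, stream[1:]))   # adjacent-pair frequencies
    first_counts = Counter(stream[:-1])              # how often each item leads a pair
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}
```

For a stream like ["tu", "pi", "ro", "tu", "pi", "go", "tu", "pi", "ro"], "pi" always follows "tu" (TP = 1.0), while "ro" follows "pi" only two times out of three (TP ≈ 0.67); high-TP triplets of this kind serve as the statistically coherent test items.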
Caballero Morales, Santiago Omar
2013-01-01
Preventive Maintenance (PM) and Statistical Process Control (SPC) are important practices for achieving high product quality, a low frequency of failures, and cost reduction in a production process. However, some points about their joint application have not been explored in depth. First, most SPC is performed with the X-bar control chart, which does not fully consider the variability of the production process. Second, many studies of control chart design consider just the economic aspect, while statistical restrictions must also be considered to achieve charts with low probabilities of false detection of failures. Third, the effect of PM on processes with different failure probability distributions has not been studied. Hence, this paper covers these points, presenting the Economic Statistical Design (ESD) of joint X-bar-S control charts with a cost model that integrates PM with a general failure distribution. Experiments showed statistically significant reductions in costs when PM is performed on processes with high failure rates, together with reductions in the sampling frequency of units for testing under SPC. PMID:23527082
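For context, the 3-sigma limits of joint X-bar and S charts follow from the unbiasing constant c4 for the sample standard deviation. A minimal sketch that derives the standard A3/B3/B4-style limits from the subgroup size (this reproduces the tabulated constants, e.g. c4 ≈ 0.940 for n = 5, but is not the paper's ESD cost model):

```python
import math

def c4(n):
    """Unbiasing constant for the sample standard deviation with subgroup size n."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

def xbar_s_limits(grand_mean, mean_sd, n):
    """3-sigma control limits for joint X-bar and S charts.
    grand_mean: average of subgroup means; mean_sd: average subgroup std dev."""
    a3 = 3 / (c4(n) * math.sqrt(n))                    # X-bar chart factor (A3)
    b_term = 3 * math.sqrt(1 - c4(n) ** 2) / c4(n)     # S chart spread factor
    xbar_limits = (grand_mean - a3 * mean_sd, grand_mean + a3 * mean_sd)
    s_limits = (max(0.0, 1 - b_term) * mean_sd, (1 + b_term) * mean_sd)  # B3, B4
    return xbar_limits, s_limits
```

Monitoring S alongside X-bar is what lets the joint scheme react to changes in process variability, the first gap the paper identifies in plain X-bar practice.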
Statistical learning in social action contexts.
Monroy, Claire; Meyer, Marlene; Gerson, Sarah; Hunnius, Sabine
2017-01-01
Sensitivity to the regularities and structure contained within sequential, goal-directed actions is an important building block for generating expectations about the actions we observe. Until now, research on statistical learning for actions has solely focused on individual action sequences, but many actions in daily life involve multiple actors in various interaction contexts. The current study is the first to investigate the role of statistical learning in tracking regularities between actions performed by different actors, and whether the social context characterizing their interaction influences learning. That is, are observers more likely to track regularities across actors if they are perceived as acting jointly as opposed to in parallel? We tested adults and toddlers to explore whether social context guides statistical learning and-if so-whether it does so from early in development. In a between-subjects eye-tracking experiment, participants were primed with a social context cue between two actors who either shared a goal of playing together ('Joint' condition) or stated the intention to act alone ('Parallel' condition). In subsequent videos, the actors performed sequential actions in which, for certain action pairs, the first actor's action reliably predicted the second actor's action. We analyzed predictive eye movements to upcoming actions as a measure of learning, and found that both adults and toddlers learned the statistical regularities across actors when their actions caused an effect. Further, adults with high statistical learning performance were sensitive to social context: those who observed actors with a shared goal were more likely to correctly predict upcoming actions. In contrast, there was no effect of social context in the toddler group, regardless of learning performance. 
These findings shed light on how adults and toddlers perceive statistical regularities across actors depending on the nature of the observed social situation and the resulting effects.
Measurement of the relationship between perceived and computed color differences
NASA Astrophysics Data System (ADS)
García, Pedro A.; Huertas, Rafael; Melgosa, Manuel; Cui, Guihua
2007-07-01
Using simulated data sets, we have analyzed some mathematical properties of different statistical measures that have been employed in the previous literature to test the performance of color-difference formulas. Specifically, we considered the properties of the combined index PF/3 (performance factor obtained as the average of three terms), widely employed in the current literature. A new index named standardized residual sum of squares (STRESS), employed in multidimensional scaling techniques, is recommended. The main difference between PF/3 and STRESS is that the latter is simpler and allows inferences about the statistical significance of the difference between two color-difference formulas with respect to a given set of visual data.
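Under the standard multidimensional-scaling definition of STRESS, the index measures how well computed color differences track visual differences after an optimal scaling. A minimal sketch, assuming a least-squares scaling factor F (the paper's exact choice of F may differ; variable names are illustrative):

```python
import math

def stress(dE, dV):
    """Standardized residual sum of squares (STRESS), in percent.

    dE: computed color differences; dV: corresponding visual differences.
    F is the least-squares factor minimizing sum((dE_i - F*dV_i)**2);
    STRESS = 0 means the formula agrees perfectly (up to scale) with
    the visual data, and larger values mean worse agreement.
    """
    F = sum(e * v for e, v in zip(dE, dV)) / sum(v * v for v in dV)
    num = sum((e - F * v) ** 2 for e, v in zip(dE, dV))
    den = sum((F * v) ** 2 for v in dV)
    return 100.0 * math.sqrt(num / den)

# Perfectly proportional data gives STRESS = 0.
print(stress([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # -> 0.0
```

Because STRESS is a ratio of sums of squares, its square behaves like an F-type quantity, which is what permits the significance comparisons between two formulas mentioned in the abstract.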
Computation of large-scale statistics in decaying isotropic turbulence
NASA Technical Reports Server (NTRS)
Chasnov, Jeffrey R.
1993-01-01
We have performed large-eddy simulations of decaying isotropic turbulence to test the prediction of self-similar decay of the energy spectrum and to compute the decay exponents of the kinetic energy. In general, good agreement between the simulation results and the assumption of self-similarity was obtained. However, the statistics of the simulations were insufficient to compute the value of gamma, which corrects the decay exponent when the spectrum follows a k^4 wavenumber behavior near k = 0. To obtain good statistics, it was found necessary to average over a large ensemble of turbulent flows.
NASA Technical Reports Server (NTRS)
1990-01-01
Structural Reliability Consultants' computer program creates graphic plots showing the statistical parameters of glue laminated timbers, or 'glulam.' The company president, Dr. Joseph Murphy, read in NASA Tech Briefs about work related to analysis of Space Shuttle surface tile strength performed for Johnson Space Center by Rockwell International Corporation. Analysis led to a theory of 'consistent tolerance bounds' for statistical distributions, applicable in industrial testing where statistical analysis can influence product development and use. Dr. Murphy then obtained the Tech Support Package that covers the subject in greater detail. The TSP became the basis for Dr. Murphy's computer program PC-DATA, which he is marketing commercially.
Lack of grading agreement among international hemostasis external quality assessment programs
Olson, John D.; Jennings, Ian; Meijer, Piet; Bon, Chantal; Bonar, Roslyn; Favaloro, Emmanuel J.; Higgins, Russell A.; Keeney, Michael; Mammen, Joy; Marlar, Richard A.; Meley, Roland; Nair, Sukesh C.; Nichols, William L.; Raby, Anne; Reverter, Joan C.; Srivastava, Alok; Walker, Isobel
2018-01-01
Laboratory quality programs rely on internal quality control and external quality assessment (EQA). EQA programs provide unknown specimens for the laboratory to test, and the laboratory's result is compared with those of other (peer) laboratories performing the same test. EQA programs assign target values using a variety of statistical tools, and a performance assessment of 'pass' or 'fail' is made. EQA provider members of the international organization, external quality assurance in thrombosis and hemostasis, took part in a study to compare the outcomes of performance analysis using the same data set of laboratory results. Eleven EQA organizations using eight different analytical approaches participated. Data for a normal and a prolonged activated partial thromboplastin time (aPTT) and a normal and a reduced factor VIII (FVIII) from 218 laboratories were sent to the EQA providers, who analyzed the data set using their own method of evaluation for aPTT and FVIII, determining the performance for each laboratory record in the data set. Providers also summarized their statistical approach to the assignment of target values and laboratory performance. Each laboratory record in the data set was graded pass/fail by all EQA providers for each of the four analytes. There was a lack of agreement in pass/fail grading among the EQA programs: grading was discordant for 17.9% and 11% of normal and prolonged aPTT results, respectively, and for 20.2% and 17.4% of normal and reduced FVIII results, respectively. All EQA programs in this study employed statistical methods compliant with International Organization for Standardization standard ISO 13528, yet the evaluation of laboratory results for all four analytes showed remarkable grading discordance. PMID:29232255
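One common ISO 13528-style grading scheme (an illustrative sketch only, not the specific method of any provider in the study) scores each laboratory by a z-score against a robust peer consensus; differing choices of consensus statistic, spread estimate, and threshold are exactly the kind of variation that can produce the discordance reported above:

```python
import statistics

def grade_result(result, peer_results, limit=3.0):
    """Grade one laboratory result against its peer group.

    Assigned value: peer median.  Dispersion: scaled median absolute
    deviation (MAD * 1.4826, a robust stand-in for the SD).
    |z| <= limit -> 'pass', otherwise 'fail'.  Both the threshold and
    the robust statistics vary between EQA providers.
    """
    assigned = statistics.median(peer_results)
    mad = statistics.median(abs(r - assigned) for r in peer_results)
    spread = 1.4826 * mad
    z = (result - assigned) / spread
    return "pass" if abs(z) <= limit else "fail"

# Hypothetical aPTT results in seconds (illustrative numbers only).
peers = [29.1, 30.2, 30.8, 31.0, 31.5, 32.2, 33.0]
print(grade_result(30.9, peers))  # near consensus -> pass
print(grade_result(45.0, peers))  # far outlier    -> fail
```

Swapping the median/MAD for a trimmed mean and SD, or changing `limit` from 3.0 to 2.0, can flip borderline records between pass and fail, which is one plausible mechanism for the observed grading disagreement.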
Alemu, Sisay Mulugeta; Haile, Yohannes Gebreegziabhere
2017-01-01
Background: Globally, 3 to 8% of reproductive-age women suffer from premenstrual dysphoric disorder (PMDD). Several mental and reproductive health-related factors cause low academic achievement during university education; however, limited data exist in Ethiopia. The aim of the study was to investigate mental and reproductive health correlates of academic performance. Methods: An institution-based cross-sectional study was conducted with 667 Debre Berhan University female students from April to June 2015. Academic performance was the outcome variable; mental and reproductive health characteristics were the explanatory variables. A two-way analysis of variance (ANOVA) test of association was applied to examine group differences in academic performance. Results: Among the 529 students who participated, 49.3% reported mild premenstrual syndrome (PMS), 36.9% reported moderate/severe PMS, and 13.8% fulfilled PMDD diagnostic criteria. The ANOVA test revealed no significant difference in academic performance between students with different levels of PMS experience (F-statistic = 0.08, p value = 0.93). Nevertheless, there was a significant difference in academic performance between students with different lengths of menses (F-statistic = 5.15, p value = 0.006). Conclusion: PMS experience was not significantly associated with academic performance, but the length of menses was significantly associated with academic performance. PMID:28630874
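The F-statistics quoted above come from analysis of variance, which compares between-group to within-group variability. A minimal one-way sketch (the study used a two-way design; the group scores below are made up for illustration, not the study's data):

```python
def one_way_anova_F(groups):
    """One-way ANOVA F-statistic for k groups of possibly unequal size:
    mean square between groups divided by mean square within groups.
    F near 0 suggests group means are indistinguishable."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical GPA-like scores for mild / moderate-severe / PMDD groups.
mild = [3.1, 3.0, 3.3, 2.9]
moderate = [3.0, 3.2, 2.8, 3.1]
pmdd = [2.9, 3.1, 3.0, 3.2]
F = one_way_anova_F([mild, moderate, pmdd])
print(round(F, 2))  # a small F is consistent with "no group difference"
```

An F as small as the study's 0.08 for PMS groups means the between-group spread is a tiny fraction of the within-group spread, hence the p value of 0.93.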
Measuring the Sensitivity of Single-locus “Neutrality Tests” Using a Direct Perturbation Approach
Garrigan, Daniel; Lewontin, Richard; Wakeley, John
2010-01-01
A large number of statistical tests have been proposed to detect natural selection based on a sample of variation at a single genetic locus. These tests measure the deviation of the allelic frequency distribution observed within populations from the distribution expected under a set of assumptions that includes both neutral evolution and equilibrium population demography. The present study considers a new way to assess the statistical properties of these tests of selection, by their behavior in response to direct perturbations of the steady-state allelic frequency distribution, unconstrained by any particular nonequilibrium demographic scenario. Results from Monte Carlo computer simulations indicate that most tests of selection are more sensitive to perturbations of the allele frequency distribution that increase the variance in allele frequencies than to perturbations that decrease the variance. Simulations also demonstrate that it requires, on average, 4N generations (N is the diploid effective population size) for tests of selection to relax to their theoretical, steady-state distributions following different perturbations of the allele frequency distribution to its extremes. This relatively long relaxation time highlights the fact that these tests are not robust to violations of the other assumptions of the null model besides neutrality. Lastly, genetic variation arising under an example of a regularly cycling demographic scenario is simulated. Tests of selection performed on this last set of simulated data confirm the confounding nature of these tests for the inference of natural selection, under a demographic scenario that likely holds for many species. The utility of using empirical, genomic distributions of test statistics, instead of the theoretical steady-state distribution, is discussed as an alternative for improving the statistical inference of natural selection. PMID:19744997
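A classic example of the single-locus neutrality tests studied here is Tajima's D, which contrasts two estimators of the population mutation rate derived from the allele frequency distribution. A minimal sketch using its standard constants (n sampled sequences, S segregating sites, mean pairwise diversity pi; the worked numbers are illustrative):

```python
import math

def tajimas_D(n, S, pi):
    """Tajima's D for n sequences, S segregating sites, and mean
    pairwise difference pi.  D near 0 is consistent with the neutral
    equilibrium model; an excess of rare variants drives D < 0, an
    excess of intermediate-frequency variants drives D > 0."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n ** 2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / math.sqrt(e1 * S + e2 * S * (S - 1))

# When pi equals Watterson's estimate S/a1, the two estimators agree
# and D is exactly 0; a deficit of pairwise diversity gives D < 0.
n, S = 10, 20
a1 = sum(1.0 / i for i in range(1, n))
print(tajimas_D(n, S, S / a1))  # -> 0.0
```

The study's point is that the null distribution of statistics like this one assumes equilibrium demography, so perturbed or cycling populations can shift D away from 0 even with no selection at all.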