Sample records for statistical tests results

  1. Biomechanical Analysis of Military Boots. Phase 1. Materials Testing of Military and Commercial Footwear

    DTIC Science & Technology

    1992-10-01

(N=8) and Results of Statistical Analyses for Impact Test Performed on Forefoot of Unworn Footwear; A-2. Summary Statistics (N=8) and Results of... on Forefoot of Worn Footwear; B-2. Summary Statistics (N=4) and Results of Statistical Analyses for Impact... used tests to assess heel and forefoot shock absorption, upper and sole durability, and flexibility (Cavanagh, 1978). Later, the number of tests was

  2. New heterogeneous test statistics for the unbalanced fixed-effect nested design.

    PubMed

    Guo, Jiin-Huarng; Billard, L; Luh, Wei-Ming

    2011-05-01

When the underlying variances are unknown and/or unequal, using the conventional F test is problematic in the two-factor hierarchical data structure. Prompted by the approximate test statistics (Welch and Alexander-Govern methods), the authors develop four new heterogeneous test statistics to test factor A and factor B nested within A for the unbalanced fixed-effect two-stage nested design under variance heterogeneity. The actual significance levels and statistical power of the test statistics were compared in a simulation study. The results show that the proposed procedures maintain better Type I error rate control and have greater statistical power than the conventional F test under various conditions. Therefore, the proposed test statistics are recommended in terms of robustness and ease of implementation. ©2010 The British Psychological Society.

  3. Study designs, use of statistical tests, and statistical analysis software choice in 2015: Results from two Pakistani monthly Medline indexed journals.

    PubMed

    Shaikh, Masood Ali

    2017-09-01

Assessment of research articles in terms of the study designs used, the statistical tests applied, and the statistical analysis programmes employed helps determine the research activity profile and trends in a country. In this descriptive study, all original articles published in 2015 by the Journal of Pakistan Medical Association (JPMA) and the Journal of the College of Physicians and Surgeons Pakistan (JCPSP) were reviewed in terms of study designs used, statistical tests applied, and statistical analysis programmes employed. JPMA and JCPSP published 192 and 128 original articles, respectively, in 2015. The results indicate that the cross-sectional study design, bivariate inferential analysis comparing two variables/groups, and the statistical software programme SPSS were, respectively, the most common study design, inferential statistical analysis, and statistical analysis software. These results echo a previously published assessment of these two journals for the year 2014.

  4. Statistical EMC: A new dimension in electromagnetic compatibility of digital electronic systems

    NASA Astrophysics Data System (ADS)

    Tsaliovich, Anatoly

Electromagnetic compatibility (EMC) compliance test results are used as a database for addressing three classes of EMC-related problems: statistical EMC profiles of digital electronic systems, the effect of equipment-under-test (EUT) parameters on electromagnetic emission characteristics, and EMC measurement specifics. Open area test site (OATS) and absorber-lined shielded room (AR) results are compared for the highest radiated emissions of the equipment under test. The suggested statistical evaluation methodology can be utilized to correlate the results of different EMC test techniques, characterize the EMC performance of electronic systems and components, and develop recommendations for optimal EMC design of electronic products.

  5. Explorations in statistics: hypothesis tests and P values.

    PubMed

    Curran-Everett, Douglas

    2009-06-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of Explorations in Statistics delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what we observe in the experiment to what we expect to see if the null hypothesis is true. The P value associated with the magnitude of that test statistic answers this question: if the null hypothesis is true, what proportion of possible values of the test statistic are at least as extreme as the one I got? Although statisticians continue to stress the limitations of hypothesis tests, there are two realities we must acknowledge: hypothesis tests are ingrained within science, and the simple test of a null hypothesis can be useful. As a result, it behooves us to explore the notions of hypothesis tests, test statistics, and P values.
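To make the quoted question concrete, here is a minimal sketch (not from the article) of a P value computed as the proportion of null-hypothesis test statistics at least as extreme as the observed one, using a permutation null distribution; the data and group sizes are invented.

```python
# A P value as the proportion of possible null values of the test statistic
# at least as extreme as the observed one (two-sided permutation version).
import numpy as np

rng = np.random.default_rng(0)
group_a = rng.normal(0.5, 1.0, size=20)   # hypothetical treatment sample
group_b = rng.normal(0.0, 1.0, size=20)   # hypothetical control sample
observed = group_a.mean() - group_b.mean()

# Build the null distribution by repeatedly permuting the group labels.
pooled = np.concatenate([group_a, group_b])
null_stats = []
for _ in range(10_000):
    rng.shuffle(pooled)
    null_stats.append(pooled[:20].mean() - pooled[20:].mean())

# Proportion of null statistics at least as extreme as the observed one.
p_value = np.mean(np.abs(null_stats) >= abs(observed))
print(f"observed difference = {observed:.3f}, P = {p_value:.4f}")
```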

  6. [The research protocol VI: How to choose the appropriate statistical test. Inferential statistics].

    PubMed

    Flores-Ruiz, Eric; Miranda-Novales, María Guadalupe; Villasís-Keever, Miguel Ángel

    2017-01-01

Statistical analysis can be divided into two main components: descriptive analysis and inferential analysis. Inference means drawing conclusions from tests performed on data obtained from a sample of a population. Statistical tests are used to establish the probability that a conclusion obtained from a sample is applicable to the population from which it was obtained. However, choosing the appropriate statistical test generally poses a challenge for novice researchers. Choosing a statistical test requires taking into account three aspects: the research design, the number of measurements, and the scale of measurement of the variables. Statistical tests are divided into two sets, parametric and nonparametric. Parametric tests can be used only if the data show a normal distribution. Choosing the right statistical test makes it easier for readers to understand and apply the results.
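As a toy companion to the three aspects listed above, the sketch below encodes a greatly simplified test chooser keyed on measurement scale, number of groups, pairing, and normality; real test selection involves more considerations, and the mapping is illustrative rather than exhaustive.

```python
# Greatly simplified lookup from study characteristics to a common test name.
def choose_test(scale, groups, paired, normal=True):
    if scale == "numerical":
        if groups == 2:
            if normal:  # parametric branch requires normally distributed data
                return "paired t-test" if paired else "independent t-test"
            return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
        return "one-way ANOVA" if normal else "Kruskal-Wallis"
    if scale == "categorical":
        return "McNemar" if paired else "chi-squared / Fisher exact"
    return "consult a statistician"

print(choose_test("numerical", groups=2, paired=False, normal=False))
# -> Mann-Whitney U
```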

  7. Testing prediction methods: Earthquake clustering versus the Poisson model

    USGS Publications Warehouse

    Michael, A.J.

    1997-01-01

    Testing earthquake prediction methods requires statistical techniques that compare observed success to random chance. One technique is to produce simulated earthquake catalogs and measure the relative success of predicting real and simulated earthquakes. The accuracy of these tests depends on the validity of the statistical model used to simulate the earthquakes. This study tests the effect of clustering in the statistical earthquake model on the results. Three simulation models were used to produce significance levels for a VLF earthquake prediction method. As the degree of simulated clustering increases, the statistical significance drops. Hence, the use of a seismicity model with insufficient clustering can lead to overly optimistic results. A successful method must pass the statistical tests with a model that fully replicates the observed clustering. However, a method can be rejected based on tests with a model that contains insufficient clustering. U.S. copyright. Published in 1997 by the American Geophysical Union.
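A hedged sketch of the testing logic described above: compare a prediction scheme's hit count on an observed catalog with its hit counts on catalogs simulated from a Poisson (cluster-free) model. All numbers are invented, and the "observed" catalog here is itself synthetic; per the study's point, a clustered simulator would replace the Poisson draw in the loop.

```python
import numpy as np

rng = np.random.default_rng(1)
n_days, rate = 3650, 0.05              # hypothetical 10-year catalog, events/day
alarm_days = rng.random(n_days) < 0.1  # hypothetical prediction: 10% alarm days

observed_catalog = rng.poisson(rate, n_days) > 0   # stand-in for real data
observed_hits = np.sum(alarm_days & observed_catalog)

# Significance level: how often do Poisson catalogs match or beat the method?
sim_hits = np.array([np.sum(alarm_days & (rng.poisson(rate, n_days) > 0))
                     for _ in range(5000)])
p = np.mean(sim_hits >= observed_hits)
print(f"hits = {observed_hits}, P(simulated >= observed) = {p:.3f}")
```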

  8. An empirical comparison of statistical tests for assessing the proportional hazards assumption of Cox's model.

    PubMed

    Ng'andu, N H

    1997-03-30

    In the analysis of survival data using the Cox proportional hazard (PH) model, it is important to verify that the explanatory variables analysed satisfy the proportional hazard assumption of the model. This paper presents results of a simulation study that compares five test statistics to check the proportional hazard assumption of Cox's model. The test statistics were evaluated under proportional hazards and the following types of departures from the proportional hazard assumption: increasing relative hazards; decreasing relative hazards; crossing hazards; diverging hazards, and non-monotonic hazards. The test statistics compared include those based on partitioning of failure time and those that do not require partitioning of failure time. The simulation results demonstrate that the time-dependent covariate test, the weighted residuals score test and the linear correlation test have equally good power for detection of non-proportionality in the varieties of non-proportional hazards studied. Using illustrative data from the literature, these test statistics performed similarly.

  9. Statistical correlation of structural mode shapes from test measurements and NASTRAN analytical values

    NASA Technical Reports Server (NTRS)

    Purves, L.; Strang, R. F.; Dube, M. P.; Alea, P.; Ferragut, N.; Hershfeld, D.

    1983-01-01

    The software and procedures of a system of programs used to generate a report of the statistical correlation between NASTRAN modal analysis results and physical tests results from modal surveys are described. Topics discussed include: a mathematical description of statistical correlation, a user's guide for generating a statistical correlation report, a programmer's guide describing the organization and functions of individual programs leading to a statistical correlation report, and a set of examples including complete listings of programs, and input and output data.

  10. Statistical Significance Testing in Second Language Research: Basic Problems and Suggestions for Reform

    ERIC Educational Resources Information Center

    Norris, John M.

    2015-01-01

    Traditions of statistical significance testing in second language (L2) quantitative research are strongly entrenched in how researchers design studies, select analyses, and interpret results. However, statistical significance tests using "p" values are commonly misinterpreted by researchers, reviewers, readers, and others, leading to…

  11. DEIVA: a web application for interactive visual analysis of differential gene expression profiles.

    PubMed

    Harshbarger, Jayson; Kratz, Anton; Carninci, Piero

    2017-01-07

Differential gene expression (DGE) analysis is a technique to identify statistically significant differences in RNA abundance for genes or arbitrary features between different biological states. The result of a DGE test is typically further analyzed using statistical software, spreadsheets or custom ad hoc algorithms. We identified a need for a web-based system to share DGE statistical test results and to locate and identify genes in those results with a very low barrier to entry. We have developed DEIVA, a free and open-source, browser-based single-page application (SPA) with a strong emphasis on user friendliness, which enables locating and identifying single or multiple genes in an immediate, interactive, and intuitive manner. By design, DEIVA scales to very large numbers of users and datasets. Compared to existing software, DEIVA offers a unique combination of design decisions that enable inspection and analysis of DGE statistical test results with an emphasis on ease of use.

  12. Electrochemical Test Method for Evaluating Long-Term Propellant-Material Compatibility

    DTIC Science & Technology

    1978-12-01

matrix of test conditions is illustrated in Fig. 13. A statistically designed test matrix (Graeco-Latin cube) could not be used because of passivation... years simulated time results in a final decomposition level of 0.753 mg/cm². The data were examined using statistical techniques to evaluate the relative... metals. The compatibility of all nine metals was evaluated in hydrazine containing water and chloride. The results of the statistical analysis

  13. Assessment of statistical education in Indonesia: Preliminary results and initiation to simulation-based inference

    NASA Astrophysics Data System (ADS)

    Saputra, K. V. I.; Cahyadi, L.; Sembiring, U. A.

    2018-01-01

In this paper, we assess our traditional elementary statistics education and introduce elementary statistics with simulation-based inference. To assess our statistics class, we adapt the well-known CAOS (Comprehensive Assessment of Outcomes in Statistics) test, which serves as an external measure of students' basic statistical literacy and is generally accepted as such. We also introduce a new teaching method in the elementary statistics class. Unlike the traditional elementary statistics course, we introduce a simulation-based inference method for conducting hypothesis testing. The literature shows that this new teaching method works very well in increasing students' understanding of statistics.

  14. Your Chi-Square Test Is Statistically Significant: Now What?

    ERIC Educational Resources Information Center

    Sharpe, Donald

    2015-01-01

    Applied researchers have employed chi-square tests for more than one hundred years. This paper addresses the question of how one should follow a statistically significant chi-square test result in order to determine the source of that result. Four approaches were evaluated: calculating residuals, comparing cells, ransacking, and partitioning. Data…

  15. [Comparison of application of Cochran-Armitage trend test and linear regression analysis for rate trend analysis in epidemiology study].

    PubMed

    Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H

    2017-05-10

We described the time trend of the incidence rate of acute myocardial infarction (AMI) in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value…
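SciPy has no built-in Cochran-Armitage trend test; the following is a hedged, minimal implementation of its usual z-score form for k ordered groups (cases r_i out of n_i with scores t_i), with invented counts, not the Tianjin data.

```python
import numpy as np
from scipy.stats import norm

def cochran_armitage(cases, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups."""
    r = np.asarray(cases, dtype=float)
    n = np.asarray(totals, dtype=float)
    t = np.arange(len(r)) if scores is None else np.asarray(scores, dtype=float)
    N, p = n.sum(), r.sum() / n.sum()
    numerator = np.sum(t * (r - n * p))
    variance = p * (1 - p) * (np.sum(n * t**2) - np.sum(n * t)**2 / N)
    z = numerator / np.sqrt(variance)
    return z, 2 * norm.sf(abs(z))          # z statistic, two-sided P value

z, p = cochran_armitage(cases=[10, 15, 22, 30], totals=[100, 100, 100, 100])
print(f"z = {z:.2f}, P = {p:.4g}")
```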

  16. Using the Bootstrap Method for a Statistical Significance Test of Differences between Summary Histograms

    NASA Technical Reports Server (NTRS)

    Xu, Kuan-Man

    2006-01-01

    A new method is proposed to compare statistical differences between summary histograms, which are the histograms summed over a large ensemble of individual histograms. It consists of choosing a distance statistic for measuring the difference between summary histograms and using a bootstrap procedure to calculate the statistical significance level. Bootstrapping is an approach to statistical inference that makes few assumptions about the underlying probability distribution that describes the data. Three distance statistics are compared in this study. They are the Euclidean distance, the Jeffries-Matusita distance and the Kuiper distance. The data used in testing the bootstrap method are satellite measurements of cloud systems called cloud objects. Each cloud object is defined as a contiguous region/patch composed of individual footprints or fields of view. A histogram of measured values over footprints is generated for each parameter of each cloud object and then summary histograms are accumulated over all individual histograms in a given cloud-object size category. The results of statistical hypothesis tests using all three distances as test statistics are generally similar, indicating the validity of the proposed method. The Euclidean distance is determined to be most suitable after comparing the statistical tests of several parameters with distinct probability distributions among three cloud-object size categories. Impacts on the statistical significance levels resulting from differences in the total lengths of satellite footprint data between two size categories are also discussed.
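A hedged sketch of the proposed procedure: build summary histograms from stacks of individual histograms, measure their Euclidean distance (one of the three distances studied), and bootstrap the null distribution by resampling individual histograms from the pooled set. Sizes and counts are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n_bins = 20
# Hypothetical individual histograms for two cloud-object size categories.
hists_a = rng.poisson(5.0, size=(100, n_bins)).astype(float)
hists_b = rng.poisson(5.5, size=(80, n_bins)).astype(float)

def summary(h):
    s = h.sum(axis=0)
    return s / s.sum()                      # normalized summary histogram

def euclidean(p, q):
    return np.sqrt(np.sum((p - q) ** 2))

observed = euclidean(summary(hists_a), summary(hists_b))

# Bootstrap under the null: both categories drawn from the pooled ensemble.
pooled = np.vstack([hists_a, hists_b])
boot = []
for _ in range(2000):
    idx_a = rng.integers(0, len(pooled), size=len(hists_a))
    idx_b = rng.integers(0, len(pooled), size=len(hists_b))
    boot.append(euclidean(summary(pooled[idx_a]), summary(pooled[idx_b])))

p_value = np.mean(np.asarray(boot) >= observed)
print(f"Euclidean distance = {observed:.4f}, bootstrap P = {p_value:.3f}")
```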

  17. Derivation and Applicability of Asymptotic Results for Multiple Subtests Person-Fit Statistics

    PubMed Central

    Albers, Casper J.; Meijer, Rob R.; Tendeiro, Jorge N.

    2016-01-01

In high-stakes testing, it is important to check the validity of individual test scores. Although a test may, in general, result in valid test scores for most test takers, for some test takers, test scores may not provide a good description of a test taker's proficiency level. Person-fit statistics have been proposed to check the validity of individual test scores. In this study, the theoretical asymptotic sampling distribution of two person-fit statistics that can be used for tests consisting of multiple subtests is first discussed. Second, a simulation study was conducted to investigate the applicability of this asymptotic theory for tests of finite length, in which the correlation between subtests and the number of items in the subtests were varied. The authors show that these distributions provide reasonable approximations, even for tests consisting of subtests of only 10 items each. These results have practical value because researchers do not have to rely on extensive simulation studies to simulate sampling distributions. PMID:29881053

  18. The Skillings-Mack test (Friedman test when there are missing data).

    PubMed

    Chatfield, Mark; Mander, Adrian

    2009-04-01

The Skillings-Mack statistic (Skillings and Mack, 1981, Technometrics 23: 171-177) is a general Friedman-type statistic that can be used in almost any block design with an arbitrary missing-data structure. The missing data can be either missing by design, for example, an incomplete block design, or missing completely at random. The Skillings-Mack test is equivalent to the Friedman test when there are no missing data in a balanced complete block design, and the Skillings-Mack test is equivalent to the test suggested in Durbin (1951, British Journal of Psychology, Statistical Section 4: 85-90) for a balanced incomplete block design. The Friedman test was implemented in Stata by Goldstein (1991, Stata Technical Bulletin 3: 26-27) and further developed in Goldstein (2005, Stata Journal 5: 285). This article introduces the skilmack command, which performs the Skillings-Mack test. The skilmack command is also useful when there are many ties or equal ranks (N.B. the Friedman statistic compared with the χ² distribution will give a conservative result), as well as for small samples; appropriate results can be obtained by simulating the distribution of the test statistic under the null hypothesis.
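The closing remark about simulating the null distribution can be illustrated with the ordinary Friedman statistic (the skilmack command itself is Stata and is not reproduced here): permute treatments within each block to approximate the null. Data are invented; SciPy is assumed available.

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(3)
data = rng.normal(size=(8, 3))            # hypothetical 8 blocks x 3 treatments

stat, _ = friedmanchisquare(*data.T)      # observed Friedman statistic

# Null distribution by permuting treatment labels within each block.
null = []
for _ in range(5000):
    perm = np.apply_along_axis(rng.permutation, 1, data)
    s, _ = friedmanchisquare(*perm.T)
    null.append(s)

p_sim = np.mean(np.asarray(null) >= stat)
print(f"Friedman statistic = {stat:.3f}, simulated P = {p_sim:.4f}")
```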

  19. New Statistics for Testing Differential Expression of Pathways from Microarray Data

    NASA Astrophysics Data System (ADS)

    Siu, Hoicheong; Dong, Hua; Jin, Li; Xiong, Momiao

Exploring biological meaning in microarray data is very important but remains a great challenge. Here, we developed three new statistics: a linear combination test, a quadratic test, and a de-correlation test to identify differentially expressed pathways from gene expression profiles. We apply our statistics to two rheumatoid arthritis datasets. Notably, our results reveal three significant pathways and 275 genes in common between the two datasets. The pathways we found are meaningful for uncovering the disease mechanisms of rheumatoid arthritis, which implies that our statistics are a powerful tool in the functional analysis of gene expression data.

  20. Statistics 101 for Radiologists.

    PubMed

    Anvari, Arash; Halpern, Elkan F; Samir, Anthony E

    2015-10-01

    Diagnostic tests have wide clinical applications, including screening, diagnosis, measuring treatment effect, and determining prognosis. Interpreting diagnostic test results requires an understanding of key statistical concepts used to evaluate test efficacy. This review explains descriptive statistics and discusses probability, including mutually exclusive and independent events and conditional probability. In the inferential statistics section, a statistical perspective on study design is provided, together with an explanation of how to select appropriate statistical tests. Key concepts in recruiting study samples are discussed, including representativeness and random sampling. Variable types are defined, including predictor, outcome, and covariate variables, and the relationship of these variables to one another. In the hypothesis testing section, we explain how to determine if observed differences between groups are likely to be due to chance. We explain type I and II errors, statistical significance, and study power, followed by an explanation of effect sizes and how confidence intervals can be used to generalize observed effect sizes to the larger population. Statistical tests are explained in four categories: t tests and analysis of variance, proportion analysis tests, nonparametric tests, and regression techniques. We discuss sensitivity, specificity, accuracy, receiver operating characteristic analysis, and likelihood ratios. Measures of reliability and agreement, including κ statistics, intraclass correlation coefficients, and Bland-Altman graphs and analysis, are introduced. © RSNA, 2015.
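As a small numerical companion to the diagnostic-test concepts reviewed above, the following computes sensitivity, specificity, accuracy, and likelihood ratios from a hypothetical 2x2 table; all counts are invented.

```python
# Hypothetical 2x2 diagnostic table.
tp, fn = 90, 10      # diseased subjects: test positive / test negative
fp, tn = 30, 170     # healthy subjects:  test positive / test negative

sensitivity = tp / (tp + fn)                  # P(test+ | disease)
specificity = tn / (tn + fp)                  # P(test- | no disease)
accuracy = (tp + tn) / (tp + fn + fp + tn)
lr_positive = sensitivity / (1 - specificity) # positive likelihood ratio
lr_negative = (1 - sensitivity) / specificity # negative likelihood ratio

print(f"Se={sensitivity:.2f}  Sp={specificity:.2f}  Acc={accuracy:.2f}  "
      f"LR+={lr_positive:.2f}  LR-={lr_negative:.2f}")
```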

  1. Statistical Tutorial | Center for Cancer Research

    Cancer.gov

Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. ST is designed as a follow-up to Statistical Analysis of Research Data (SARD), held in April 2018. The tutorial will apply the general principles of statistical analysis of research data, including descriptive statistics, z- and t-tests of means and mean differences, simple and multiple linear regression, ANOVA tests, and the chi-squared distribution.

  2. "What If" Analyses: Ways to Interpret Statistical Significance Test Results Using EXCEL or "R"

    ERIC Educational Resources Information Center

    Ozturk, Elif

    2012-01-01

    The present paper aims to review two motivations to conduct "what if" analyses using Excel and "R" to understand the statistical significance tests through the sample size context. "What if" analyses can be used to teach students what statistical significance tests really do and in applied research either prospectively to estimate what sample size…

  3. Robust inference for group sequential trials.

    PubMed

    Ganju, Jitendra; Lin, Yunzhi; Zhou, Kefei

    2017-03-01

For ethical reasons, group sequential trials were introduced to allow trials to stop early in the event of extreme results. Endpoints in such trials are usually mortality or irreversible morbidity. For a given endpoint, the norm is to use a single test statistic and to use that same statistic for each analysis. This approach is risky because the test statistic has to be specified before the study is unblinded, and there is a loss in power if the assumptions that ensure optimality for each analysis are not met. To minimize the risk of moderate to substantial loss in power due to a suboptimal choice of a statistic, a robust method was developed for nonsequential trials. The concept is analogous to diversification of financial investments to minimize risk. The method is based on combining P values from multiple test statistics for formal inference while controlling the type I error rate at its designated value. This article evaluates the performance of 2 P value combining methods for group sequential trials. The emphasis is on time-to-event trials, although results from less complex trials are also included. The gain or loss in power with the combination method relative to a single statistic is asymmetric in its favor. Depending on the power of each individual test, the combination method can give more power than any single test or give power that is closer to the test with the most power. The versatility of the method is that it can combine P values from different test statistics for analysis at different times. The robustness of results suggests that inference from group sequential trials can be strengthened with the use of combined tests. Copyright © 2017 John Wiley & Sons, Ltd.
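For background on P value combination (not the article's specific method, which additionally handles the correlation between statistics computed on the same data), here is the classical Fisher combination, valid for independent tests; the P values are invented.

```python
import numpy as np
from scipy.stats import chi2

p_values = np.array([0.04, 0.12, 0.30])      # invented per-statistic P values
fisher_stat = -2 * np.sum(np.log(p_values))  # ~ chi2 with 2k df if independent
p_combined = chi2.sf(fisher_stat, df=2 * len(p_values))
print(f"Fisher statistic = {fisher_stat:.2f}, combined P = {p_combined:.4f}")
```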

  4. Assessment of the beryllium lymphocyte proliferation test using statistical process control.

    PubMed

    Cher, Daniel J; Deubner, David C; Kelsh, Michael A; Chapman, Pamela S; Ray, Rose M

    2006-10-01

    Despite more than 20 years of surveillance and epidemiologic studies using the beryllium blood lymphocyte proliferation test (BeBLPT) as a measure of beryllium sensitization (BeS) and as an aid for diagnosing subclinical chronic beryllium disease (CBD), improvements in specific understanding of the inhalation toxicology of CBD have been limited. Although epidemiologic data suggest that BeS and CBD risks vary by process/work activity, it has proven difficult to reach specific conclusions regarding the dose-response relationship between workplace beryllium exposure and BeS or subclinical CBD. One possible reason for this uncertainty could be misclassification of BeS resulting from variation in BeBLPT testing performance. The reliability of the BeBLPT, a biological assay that measures beryllium sensitization, is unknown. To assess the performance of four laboratories that conducted this test, we used data from a medical surveillance program that offered testing for beryllium sensitization with the BeBLPT. The study population was workers exposed to beryllium at various facilities over a 10-year period (1992-2001). Workers with abnormal results were offered diagnostic workups for CBD. Our analyses used a standard statistical technique, statistical process control (SPC), to evaluate test reliability. The study design involved a repeated measures analysis of BeBLPT results generated from the company-wide, longitudinal testing. Analytical methods included use of (1) statistical process control charts that examined temporal patterns of variation for the stimulation index, a measure of cell reactivity to beryllium; (2) correlation analysis that compared prior perceptions of BeBLPT instability to the statistical measures of test variation; and (3) assessment of the variation in the proportion of missing test results and how time periods with more missing data influenced SPC findings. During the period of this study, all laboratories displayed variation in test results that were beyond what would be expected due to chance alone. Patterns of test results suggested that variations were systematic. We conclude that laboratories performing the BeBLPT or other similar biological assays of immunological response could benefit from a statistical approach such as SPC to improve quality management.
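A minimal sketch of the SPC idea applied here, assuming an individuals-type control chart on the stimulation index: estimate control limits from a baseline period and flag points beyond mean ± 3 sigma. The series and the injected shift are invented.

```python
import numpy as np

rng = np.random.default_rng(4)
si = rng.normal(2.0, 0.4, size=120)   # hypothetical stimulation-index series
si[60:80] += 1.0                      # invented systematic drift

mu, sigma = si[:50].mean(), si[:50].std(ddof=1)   # baseline estimate
ucl, lcl = mu + 3 * sigma, mu - 3 * sigma         # 3-sigma control limits
out_of_control = np.flatnonzero((si > ucl) | (si < lcl))
print(f"limits = [{lcl:.2f}, {ucl:.2f}], out-of-control points: {out_of_control}")
```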

  5. Significant lexical relationships

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pedersen, T.; Kayaalp, M.; Bruce, R.

    Statistical NLP inevitably deals with a large number of rare events. As a consequence, NLP data often violates the assumptions implicit in traditional statistical procedures such as significance testing. We describe a significance test, an exact conditional test, that is appropriate for NLP data and can be performed using freely available software. We apply this test to the study of lexical relationships and demonstrate that the results obtained using this test are both theoretically more reliable and different from the results obtained using previously applied tests.
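The abstract does not name the test, but a widely available member of the exact conditional family it describes is Fisher's exact test on a 2x2 co-occurrence table; the counts below are invented for illustration.

```python
from scipy.stats import fisher_exact

#                 word2 present | word2 absent
table = [[12,        3],    # word1 present
         [45,     9400]]    # word1 absent

odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"odds ratio = {odds_ratio:.2f}, exact conditional P = {p_value:.3g}")
```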

  6. Statistical Analysis of Compressive and Flexural Test Results on the Sustainable Adobe Reinforced with Steel Wire Mesh

    NASA Astrophysics Data System (ADS)

    Jokhio, Gul A.; Syed Mohsin, Sharifah M.; Gul, Yasmeen

    2018-04-01

It has been established that Adobe provides, in addition to being sustainable and economical, better indoor air quality without expending extensive amounts of energy, as opposed to modern synthetic materials. The material, however, suffers from weak structural behaviour when subjected to adverse loading conditions. A wide range of mechanical properties has been reported in the literature owing to a lack of research and standardization. The present paper presents the statistical analysis of results obtained through compressive and flexural tests on Adobe samples. Adobe specimens with and without wire mesh reinforcement were tested and the results reported. The statistical analysis of these results yields interesting findings. It was found that the compressive strength of Adobe increases by about 43% after adding a single layer of wire mesh reinforcement, an increase that is statistically significant. The flexural response of Adobe also improved with the addition of wire mesh reinforcement; however, the statistical significance of that improvement could not be established.

  7. [Statistical approach to evaluate the occurrence of out-of-acceptable-range results and accuracy for antimicrobial susceptibility tests in an inter-laboratory quality control program].

    PubMed

    Ueno, Tamio; Matuda, Junichi; Yamane, Nobuhisa

    2013-03-01

To evaluate the occurrence of out-of-acceptable-range results and the accuracy of antimicrobial susceptibility tests, we applied a new statistical tool to the Inter-Laboratory Quality Control Program established by the Kyushu Quality Control Research Group. First, we defined acceptable ranges of minimum inhibitory concentration (MIC) for broth microdilution tests and of inhibitory zone diameter for disk diffusion tests on the basis of Clinical and Laboratory Standards Institute (CLSI) document M100-S21. In the analysis, more than two out-of-acceptable-range results in the 20 tests were considered not allowable according to the CLSI document. Of the 90 participating laboratories, 46 (51%) experienced one or more out-of-acceptable-range results. A binomial test was then applied to each participating laboratory. The results indicated that the occurrence of out-of-acceptable-range results in 11 laboratories was significantly higher than the CLSI recommendation allows (allowable rate < or = 0.05). Standard deviation indices (SDI) were calculated using the reported results and the mean and standard deviation values for the respective antimicrobial agents tested. In the evaluation of accuracy, the mean value from each laboratory was statistically compared with zero using a Student's t-test. The results revealed that 5 of the 11 above laboratories reported erroneous test results that systematically drifted toward the resistant side. In conclusion, our statistical approach has enabled us to detect significantly higher occurrences and sources of interpretive errors in antimicrobial susceptibility tests; this approach can therefore provide additional information to improve the accuracy of test results in clinical microbiology laboratories.

  8. Test Statistics and Confidence Intervals to Establish Noninferiority between Treatments with Ordinal Categorical Data.

    PubMed

    Zhang, Fanghong; Miyaoka, Etsuo; Huang, Fuping; Tanaka, Yutaka

    2015-01-01

The problem of establishing noninferiority between a new treatment and a standard (control) treatment is discussed for ordinal categorical data. A measure of treatment effect is used and a method of specifying the noninferiority margin for the measure is provided. Two Z-type test statistics are proposed in which the estimation of variance is constructed under the shifted null hypothesis using U-statistics. Furthermore, the confidence interval and the sample size formula are given based on the proposed test statistics. The proposed procedure is applied to a dataset from a clinical trial. A simulation study is conducted to compare the performance of the proposed test statistics with that of existing ones, and the results show that the proposed test statistics are better in terms of deviation from the nominal level and power.
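A generic sketch of a Z-type noninferiority decision (the article's effect measure and U-statistic variance are more elaborate): reject inferiority when the effect estimate plus the margin is significantly above zero. All numbers are invented.

```python
from scipy.stats import norm

effect, se = -0.03, 0.04        # invented treatment-effect estimate and SE
margin, alpha = 0.10, 0.025     # invented noninferiority margin and level

z = (effect + margin) / se      # shifted-null Z statistic
p_one_sided = norm.sf(z)
print(f"z = {z:.2f}, one-sided P = {p_one_sided:.4f}; "
      f"{'noninferior' if p_one_sided < alpha else 'inconclusive'}")
```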

  9. Scaled test statistics and robust standard errors for non-normal data in covariance structure analysis: a Monte Carlo study.

    PubMed

    Chou, C P; Bentler, P M; Satorra, A

    1991-11-01

    Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.

  10. Which Statistic Should Be Used to Detect Item Preknowledge When the Set of Compromised Items Is Known?

    PubMed

    Sinharay, Sandip

    2017-09-01

    Benefiting from item preknowledge is a major type of fraudulent behavior during educational assessments. Belov suggested the posterior shift statistic for detection of item preknowledge and showed its performance to be better on average than that of seven other statistics for detection of item preknowledge for a known set of compromised items. Sinharay suggested a statistic based on the likelihood ratio test for detection of item preknowledge; the advantage of the statistic is that its null distribution is known. Results from simulated and real data and adaptive and nonadaptive tests are used to demonstrate that the Type I error rate and power of the statistic based on the likelihood ratio test are very similar to those of the posterior shift statistic. Thus, the statistic based on the likelihood ratio test appears promising in detecting item preknowledge when the set of compromised items is known.

  11. The Use of Meta-Analytic Statistical Significance Testing

    ERIC Educational Resources Information Center

    Polanin, Joshua R.; Pigott, Terri D.

    2015-01-01

Meta-analysis multiplicity, the practice of conducting multiple tests of statistical significance within one review, is an underdeveloped area of the literature. We address this issue by considering how Type I errors can impact meta-analytic results, suggest how statistical power may be affected through the use of multiplicity corrections, and propose how…

  12. Prospective Elementary and Secondary School Mathematics Teachers' Statistical Reasoning

    ERIC Educational Resources Information Center

    Karatoprak, Rabia; Karagöz Akar, Gülseren; Börkan, Bengü

    2015-01-01

    This study investigated prospective elementary (PEMTs) and secondary (PSMTs) school mathematics teachers' statistical reasoning. The study began with the adaptation of the Statistical Reasoning Assessment (Garfield, 2003) test. Then, the test was administered to 82 PEMTs and 91 PSMTs in a metropolitan city of Turkey. Results showed that both…

  13. Analysis of statistical misconception in terms of statistical reasoning

    NASA Astrophysics Data System (ADS)

    Maryati, I.; Priatna, N.

    2018-05-01

Reasoning skill is needed by everyone in the globalization era, because every person has to be able to manage and use information from all over the world, which can be obtained easily. Statistical reasoning skill is the ability to collect, group, process, and interpret information and to draw conclusions from it. Developing this skill can be done through various levels of education. However, the skill remains low because many people, students included, assume that statistics is just the ability to count and use formulas. Students still have a negative attitude toward courses related to research. The purpose of this research is to analyze students' misconceptions in a descriptive statistics course in relation to statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by observing the effect of students' misconceptions on statistical reasoning skill. The sample was 32 students of a mathematics education department who had taken the descriptive statistics course. The mean value of the misconception test was 49.7 with a standard deviation of 10.6, whereas the mean value of the statistical reasoning skill test was 51.8 with a standard deviation of 8.5. If 65 is taken as the minimum value for standard achievement of course competence, the students' mean values are lower than that standard. The results of the misconception study emphasize which subtopics should be reconsidered. Based on the assessment results, students' misconceptions occurred in: 1) writing mathematical sentences and symbols correctly, 2) understanding basic definitions, and 3) determining the concept to be used in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistical format, 4) probability, 5) samples, and 6) association.

  14. A scan statistic to extract causal gene clusters from case-control genome-wide rare CNV data.

    PubMed

    Nishiyama, Takeshi; Takahashi, Kunihiko; Tango, Toshiro; Pinto, Dalila; Scherer, Stephen W; Takami, Satoshi; Kishino, Hirohisa

    2011-05-26

    Several statistical tests have been developed for analyzing genome-wide association data by incorporating gene pathway information in terms of gene sets. Using these methods, hundreds of gene sets are typically tested, and the tested gene sets often overlap. This overlapping greatly increases the probability of generating false positives, and the results obtained are difficult to interpret, particularly when many gene sets show statistical significance. We propose a flexible statistical framework to circumvent these problems. Inspired by spatial scan statistics for detecting clustering of disease occurrence in the field of epidemiology, we developed a scan statistic to extract disease-associated gene clusters from a whole gene pathway. Extracting one or a few significant gene clusters from a global pathway limits the overall false positive probability, which results in increased statistical power, and facilitates the interpretation of test results. In the present study, we applied our method to genome-wide association data for rare copy-number variations, which have been strongly implicated in common diseases. Application of our method to a simulated dataset demonstrated the high accuracy of this method in detecting disease-associated gene clusters in a whole gene pathway. The scan statistic approach proposed here shows a high level of accuracy in detecting gene clusters in a whole gene pathway. This study has provided a sound statistical framework for analyzing genome-wide rare CNV data by incorporating topological information on the gene pathway.

  15. BrightStat.com: free statistics online.

    PubMed

    Stricker, Daniel

    2008-10-01

Powerful software for statistical analysis is expensive. Here I present BrightStat, statistical software that runs on the Internet and is free of charge. BrightStat's goals and its main capabilities and functionalities are outlined. Three different sample runs, a Friedman test, a chi-square test, and a step-wise multiple regression, are presented. The results obtained by BrightStat are compared with results computed by SPSS, one of the global leaders in statistical software, and VassarStats, a collection of scripts for data analysis running on the Internet. Elementary statistics is an inherent part of academic education and BrightStat is an alternative to commercial products.

  16. Consequences of common data analysis inaccuracies in CNS trauma injury basic research.

    PubMed

    Burke, Darlene A; Whittemore, Scott R; Magnuson, David S K

    2013-05-15

    The development of successful treatments for humans after traumatic brain or spinal cord injuries (TBI and SCI, respectively) requires animal research. This effort can be hampered when promising experimental results cannot be replicated because of incorrect data analysis procedures. To identify and hopefully avoid these errors in future studies, the articles in seven journals with the highest number of basic science central nervous system TBI and SCI animal research studies published in 2010 (N=125 articles) were reviewed for their data analysis procedures. After identifying the most common statistical errors, the implications of those findings were demonstrated by reanalyzing previously published data from our laboratories using the identified inappropriate statistical procedures, then comparing the two sets of results. Overall, 70% of the articles contained at least one type of inappropriate statistical procedure. The highest percentage involved incorrect post hoc t-tests (56.4%), followed by inappropriate parametric statistics (analysis of variance and t-test; 37.6%). Repeated Measures analysis was inappropriately missing in 52.0% of all articles and, among those with behavioral assessments, 58% were analyzed incorrectly. Reanalysis of our published data using the most common inappropriate statistical procedures resulted in a 14.1% average increase in significant effects compared to the original results. Specifically, an increase of 15.5% occurred with Independent t-tests and 11.1% after incorrect post hoc t-tests. Utilizing proper statistical procedures can allow more-definitive conclusions, facilitate replicability of research results, and enable more accurate translation of those results to the clinic.
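To make the most common corrective point concrete, this hedged sketch follows a one-way ANOVA with Tukey's HSD, a post hoc procedure that controls for multiple comparisons, instead of uncorrected pairwise t-tests; the group data are invented, and scipy.stats.tukey_hsd requires SciPy 1.8 or later.

```python
import numpy as np
from scipy.stats import f_oneway, tukey_hsd

rng = np.random.default_rng(5)
sham = rng.normal(10, 2, size=12)     # hypothetical behavioral scores
injury = rng.normal(7, 2, size=12)
treated = rng.normal(9, 2, size=12)

f_stat, p = f_oneway(sham, injury, treated)
print(f"one-way ANOVA: F = {f_stat:.2f}, P = {p:.4f}")
if p < 0.05:
    # Corrected pairwise comparisons, not repeated uncorrected t-tests.
    print(tukey_hsd(sham, injury, treated))
```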

  17. Statistics Test Questions: Content and Trends

    ERIC Educational Resources Information Center

    Salcedo, Audy

    2014-01-01

    This study presents the results of the analysis of a group of teacher-made test questions for statistics courses at the university level. Teachers were asked to submit tests they had used in their previous two semesters. Ninety-seven tests containing 978 questions were gathered and classified according to the SOLO taxonomy (Biggs & Collis,…

  18. Quality of reporting statistics in two Indian pharmacology journals

    PubMed Central

    Jaykaran; Yadav, Preeti

    2011-01-01

Objective: To evaluate the reporting of statistical methods in articles published in two Indian pharmacology journals. Materials and Methods: All original articles published since 2002 were downloaded from the websites of the journals (Indian Journal of Pharmacology (IJP) and Indian Journal of Physiology and Pharmacology (IJPP)). These articles were evaluated on the basis of the appropriateness of descriptive and inferential statistics. Descriptive statistics was evaluated on the basis of reporting of the method of description and central tendencies. Inferential statistics was evaluated on the basis of fulfilment of the assumptions of statistical methods and the appropriateness of statistical tests. Values are described as frequencies, percentages, and 95% confidence intervals (CI) around the percentages. Results: Inappropriate descriptive statistics was observed in 150 (78.1%, 95% CI 71.7–83.3%) articles, most commonly the use of mean ± SEM in place of "mean (SD)" or "mean ± SD." The most common statistical method used was one-way ANOVA (58.4%). Information regarding checking of the assumptions of a statistical test was mentioned in only two articles. An inappropriate statistical test was observed in 61 (31.7%, 95% CI 25.6–38.6%) articles, most commonly the use of a two-group test for three or more groups. Conclusion: Articles published in two Indian pharmacology journals are not devoid of statistical errors. PMID:21772766

  19. Statistical analysis of 59 inspected SSME HPFTP turbine blades (uncracked and cracked)

    NASA Technical Reports Server (NTRS)

    Wheeler, John T.

    1987-01-01

Numerical results are presented from the statistical analysis of test data for Space Shuttle Main Engine high-pressure fuel turbopump second-stage turbine blades, including some with cracks. Several statistical methods are applied to the test data to determine whether frequency variations differ between the uncracked and cracked blades.

  20. Proficiency Testing for Determination of Water Content in Toluene of Chemical Reagents by iteration robust statistic technique

    NASA Astrophysics Data System (ADS)

    Wang, Hao; Wang, Qunwei; He, Ming

    2018-05-01

In order to investigate and improve the level of detection technology for water content in liquid chemical reagents in domestic laboratories, proficiency testing provider PT0031 (CNAS) organized a proficiency testing program for water content in toluene; 48 laboratories from 18 provinces/cities/municipalities took part in the PT. This paper introduces the implementation process of the proficiency testing for determination of water content in toluene, including sample preparation, homogeneity and stability testing, and the statistical results and analysis obtained with the iterative robust statistics technique; it summarizes and analyzes the results under the different test standards widely used in the laboratories, and puts forward technical suggestions for improving the quality of water-content testing. Satisfactory results were obtained by 43 laboratories, amounting to 89.6% of the participating laboratories.
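The abstract's iterative robust technique is in the spirit of ISO 13528 Algorithm A, commonly used in proficiency testing; the sketch below is a hedged implementation of that algorithm (winsorize beyond 1.5 robust standard deviations and iterate), with invented laboratory results.

```python
import numpy as np

def algorithm_a(x, tol=1e-6, max_iter=100):
    """Iterative robust mean and standard deviation (ISO 13528 Algorithm A)."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    s = 1.483 * np.median(np.abs(x - mu))            # robust initial scale (MAD)
    for _ in range(max_iter):
        xw = np.clip(x, mu - 1.5 * s, mu + 1.5 * s)  # winsorize the outliers
        mu_new, s_new = xw.mean(), 1.134 * xw.std(ddof=1)
        if abs(mu_new - mu) < tol and abs(s_new - s) < tol:
            break
        mu, s = mu_new, s_new
    return mu, s

lab_results = [0.031, 0.029, 0.033, 0.030, 0.045, 0.028, 0.032]  # invented
mu, s = algorithm_a(lab_results)
print(f"robust mean = {mu:.4f}, robust SD = {s:.4f}")
```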

  1. Testing high SPF sunscreens: a demonstration of the accuracy and reproducibility of the results of testing high SPF formulations by two methods and at different testing sites.

    PubMed

    Agin, Patricia Poh; Edmonds, Susan H

    2002-08-01

The goals of this study were (i) to demonstrate that existing and widely used sun protection factor (SPF) test methodologies can produce accurate and reproducible results for high SPF formulations and (ii) to provide data on the number of test subjects needed, the variability of the data, and the appropriate exposure increments needed for testing high SPF formulations. Three high SPF formulations were tested, according to the Food and Drug Administration's (FDA) 1993 tentative final monograph (TFM) 'very water resistant' test method and/or the 1978 proposed monograph 'waterproof' test method, within one laboratory. A fourth high SPF formulation was tested at four independent SPF testing laboratories, using the 1978 'waterproof' SPF test method. All laboratories utilized xenon arc solar simulators. The data illustrate that the testing conducted within one laboratory, following either the 1978 proposed or the 1993 TFM SPF test method, was able to reproducibly determine the SPFs of the formulations tested, using either the statistical analysis method in the proposed monograph or the statistical method described in the TFM. When one formulation was tested at four different laboratories, the anticipated variation in the data owing to equipment and other operational differences was minimized through the use of the statistical method described in the 1993 monograph. The data illustrate that either the 1978 proposed monograph SPF test method or the 1993 TFM SPF test method can provide accurate and reproducible results for high SPF formulations. Further, these results can be achieved with panels of 20-25 subjects with an acceptable level of variability. Utilization of the statistical controls from the 1993 sunscreen monograph can help to minimize lab-to-lab variability for well-formulated products.

  2. Statistical Analysis of CFD Solutions from the Drag Prediction Workshop

    NASA Technical Reports Server (NTRS)

    Hemsch, Michael J.

    2002-01-01

A simple, graphical framework is presented for robust statistical evaluation of results obtained from N-version testing of a series of RANS CFD codes. The solutions were obtained by a variety of code developers and users for the June 2001 Drag Prediction Workshop sponsored by the AIAA Applied Aerodynamics Technical Committee. The aerodynamic configuration used for the computational tests is the DLR-F4 wing-body combination, previously tested in several European wind tunnels and for which a previous N-version test had been conducted. The statistical framework is used to evaluate code results for (1) a single cruise design point, (2) drag polars and (3) drag rise. The paper concludes with a discussion of the meaning of the results, especially with respect to predictability, validation, and reporting of solutions.

  3. A note on generalized Genome Scan Meta-Analysis statistics

    PubMed Central

    Koziol, James A; Feng, Anne C

    2005-01-01

Background: Wise et al. introduced a rank-based statistical technique for meta-analysis of genome scans, the Genome Scan Meta-Analysis (GSMA) method. Levinson et al. recently described two generalizations of the GSMA statistic: (i) a weighted version of the GSMA statistic, so that different studies could be ascribed different weights for analysis; and (ii) an order statistic approach, reflecting the fact that a GSMA statistic can be computed for each chromosomal region or bin width across the various genome scan studies. Results: We provide an Edgeworth approximation to the null distribution of the weighted GSMA statistic, and we examine the limiting distribution of the GSMA statistics under the order statistic formulation, quantifying the relevance of the pairwise correlations of the GSMA statistics across different bins to this limiting distribution. We also remark on aggregate criteria and multiple testing for determining significance of GSMA results. Conclusion: Theoretical considerations detailed herein can lead to clarification and simplification of testing criteria for generalizations of the GSMA statistic. PMID:15717930

  4. A statistical approach to deriving subsystem specifications. [for spacecraft shock and vibrational environment tests

    NASA Technical Reports Server (NTRS)

    Keegan, W. B.

    1974-01-01

In order to produce cost-effective environmental test programs, test specifications must be realistic and, to be useful, they must be available early in the life of a program. This paper describes a method for achieving such specifications for subsystems by utilizing the results of a statistical analysis of data acquired at subsystem mounting locations during system-level environmental tests. The paper describes the details of this statistical analysis. The resultant recommended levels are a function of the subsystem's mounting location in the spacecraft. Methods of determining this mounting 'zone' are described. Recommendations are then made as to which of the various problem areas encountered should be pursued further.

  5. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.

    PubMed

    Chen, Yanguang

    2016-01-01

In geostatistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is suitable only for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of the Durbin-Watson statistic depends on the sequence of data points. This paper develops two new statistics for testing the serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 of China's regions. The results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
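A hedged sketch of the construction described above, assuming the simplest Moran-type form: an autocorrelation coefficient built from standardized OLS residuals and a row-normalized spatial weight matrix (here a random matrix standing in for real spatial weights).

```python
import numpy as np

rng = np.random.default_rng(6)
n = 29                                       # matches the 29-region case study
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta
z = (residuals - residuals.mean()) / residuals.std()   # standardized residuals

W = rng.random((n, n))                       # stand-in spatial weight matrix
np.fill_diagonal(W, 0)
W /= W.sum(axis=1, keepdims=True)            # row normalization

index = (z @ W @ z) / (z @ z)                # Moran-type residual autocorrelation
print(f"residual spatial autocorrelation index = {index:.3f}")
```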

  6. Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

    PubMed

    Hu, Yi-Juan; Berndt, Sonja I; Gustafsson, Stefan; Ganna, Andrea; Hirschhorn, Joel; North, Kari E; Ingelsson, Erik; Lin, Dan-Yu

    2013-08-08

    Meta-analysis of genome-wide association studies (GWASs) has led to the discoveries of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges in the meta-analysis. First, different studies may adopt different gene-level tests, so the results are not compatible. Second, gene-level tests require multivariate statistics (i.e., components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of an insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
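A hedged sketch of the paper's core insight, under invented numbers: recover signed z-statistics from single-variant P values and effect directions, then combine them into a quadratic gene-level statistic using an estimated correlation matrix of the single-variant statistics.

```python
import numpy as np
from scipy.stats import norm, chi2

p = np.array([0.02, 0.15, 0.40])      # invented single-variant P values
beta = np.array([0.5, -0.2, 0.1])     # invented effect estimates (sign only)
z = norm.isf(p / 2) * np.sign(beta)   # signed z-statistics recovered from P

R = np.array([[1.0, 0.3, 0.1],        # invented correlation matrix (e.g., from
              [0.3, 1.0, 0.2],        # a participating study or reference data)
              [0.1, 0.2, 1.0]])

# Quadratic gene-level statistic: z' R^{-1} z ~ chi2(k) under the null.
q = z @ np.linalg.solve(R, z)
print(f"gene-level statistic = {q:.2f}, P = {chi2.sf(q, df=len(z)):.4f}")
```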

  7. The effect of rare variants on inflation of the test statistics in case-control analyses.

    PubMed

    Pirie, Ailith; Wood, Angela; Lush, Michael; Tyrer, Jonathan; Pharoah, Paul D P

    2015-02-20

    The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself particularly in the analysis of rare variants. We compared the properties of the three most commonly used association tests: the likelihood ratio test, the Wald test and the score test when testing rare variants for association using simulated data. We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with less than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency. In a genetic association study, if a substantial proportion of the genetic variants tested have rare minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic which may mask the presence of population structure.
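The standard check the authors describe can be written down directly: the inflation factor (genomic-control lambda) is the observed median chi-squared association statistic divided by its expected median under the null, about 0.455 for 1 df. The statistics below are simulated with a small artificial inflation.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(7)
# Toy association statistics with 5% artificial inflation injected.
test_stats = chi2.rvs(df=1, size=100_000, random_state=rng) * 1.05

lam = np.median(test_stats) / chi2.ppf(0.5, df=1)   # expected median ~ 0.455
print(f"inflation factor lambda = {lam:.3f}")        # ~1.0 means no inflation
```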

  8. Statistical aspects of genetic association testing in small samples, based on selective DNA pooling data in the arctic fox.

    PubMed

    Szyda, Joanna; Liu, Zengting; Zatoń-Dobrowolska, Magdalena; Wierzbicki, Heliodor; Rzasa, Anna

    2008-01-01

We analysed data from a selective DNA pooling experiment with 130 individuals of the arctic fox (Alopex lagopus), which originated from two types differing in body size. The association between alleles of 6 selected unlinked molecular markers and body size was tested using univariate and multinomial logistic regression models, applying odds ratios and test statistics from the power divergence family. Due to the small sample size and the resulting sparseness of the data table, in hypothesis testing we could not rely on the asymptotic distributions of the tests. Instead, we tried to account for data sparseness by (i) modifying the confidence intervals of the odds ratio; (ii) using a normal approximation of the asymptotic distribution of the power divergence tests with different approaches for calculating the moments of the statistics; and (iii) assessing P values empirically, based on bootstrap samples. As a result, a significant association was observed for 3 markers. Furthermore, we used simulations to assess the validity of the normal approximation of the asymptotic distribution of the test statistics under the conditions of small and sparse samples.

  9. Heteroscedastic Test Statistics for One-Way Analysis of Variance: The Trimmed Means and Hall's Transformation Conjunction

    ERIC Educational Resources Information Center

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2005-01-01

    To deal with nonnormal and heterogeneous data for the one-way fixed effect analysis of variance model, the authors adopted a trimmed means method in conjunction with Hall's invertible transformation into a heteroscedastic test statistic (Alexander-Govern test or Welch test). The results of simulation experiments showed that the proposed technique…

  10. Tri-Center Analysis: Determining Measures of Trichotomous Central Tendency for the Parametric Analysis of Tri-Squared Test Results

    ERIC Educational Resources Information Center

    Osler, James Edward

    2014-01-01

    This monograph provides an epistemological rationale for the design of a novel post hoc statistical measure called "Tri-Center Analysis". This new statistic is designed to analyze the post hoc outcomes of the Tri-Squared Test. In Tri-Center Analysis, trichotomous parametric inferential statistical measures are calculated from…

  11. Outcomes Definitions and Statistical Tests in Oncology Studies: A Systematic Review of the Reporting Consistency

    PubMed Central

    Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie

    2016-01-01

    Background Quality of reporting for Randomized Clinical Trials (RCTs) in oncology was analyzed in several systematic reviews, but, in this setting, there is a paucity of data on the outcomes definitions and consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe those two reporting aspects for OBS and RCTs in oncology. Methods From a list of 19 medical journals, three were retained for analysis after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. The primary outcome was to assess quality of reporting for the description of the primary outcome measure in RCTs and of the variables of interest in OBS. A logistic regression was performed to identify covariates of studies potentially associated with concordance of tests between the Methods and Results parts. Results 826 studies were included in the review, and 698 were OBS. Variables were described in the Methods section for all OBS studies, and the primary endpoint was clearly detailed in the Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for the reported statistical test between the Methods and Results parts. In multivariable analysis, the variable "number of included patients in study" was associated with test consistency: the aOR (adjusted Odds Ratio) for the third group compared to the first group was aOR Grp3 = 0.52 [0.31–0.89] (P value = 0.009). Conclusion Variables in OBS and the primary endpoint in RCTs are reported and described with high frequency. However, statistical test consistency between the Methods and Results sections of OBS is not always observed. Therefore, we encourage authors and peer reviewers to verify the consistency of statistical tests in oncology studies. PMID:27716793

  12. Statistical analysis of an inter-laboratory comparison of small-scale safety and thermal testing of RDX

    DOE PAGES

    Brown, Geoffrey W.; Sandstrom, Mary M.; Preston, Daniel N.; ...

    2014-11-17

    In this study, the Integrated Data Collection Analysis (IDCA) program has conducted a proficiency test for small-scale safety and thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results from this test for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Class 5 Type II standard. The material was tested as a well-characterized standard several times during the proficiency test to assess differences among participants and the range of results that may arise for well-behaved explosive materials.

  13. Improving the Crossing-SIBTEST Statistic for Detecting Non-uniform DIF.

    PubMed

    Chalmers, R Philip

    2018-06-01

    This paper demonstrates that, after applying a simple modification to Li and Stout's (Psychometrika 61(4):647-677, 1996) CSIBTEST statistic, an improved variant of the statistic could be realized. It is shown that this modified version of CSIBTEST has a more direct association with the SIBTEST statistic presented by Shealy and Stout (Psychometrika 58(2):159-194, 1993). In particular, the asymptotic sampling distributions and general interpretation of the effect size estimates are the same for SIBTEST and the new CSIBTEST. Given the more natural connection to SIBTEST, it is shown that Li and Stout's hypothesis testing approach is insufficient for CSIBTEST; thus, an improved hypothesis testing procedure is required. Based on the presented arguments, a new chi-squared-based hypothesis testing approach is proposed for the modified CSIBTEST statistic. Positive results from a modest Monte Carlo simulation study strongly suggest the original CSIBTEST procedure and randomization hypothesis testing approach should be replaced by the modified statistic and hypothesis testing method.

  14. An Independent Filter for Gene Set Testing Based on Spectral Enrichment.

    PubMed

    Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H

    2015-01-01

    Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.

  15. Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items.

    ERIC Educational Resources Information Center

    van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.

    2002-01-01

    Compared the nominal and empirical null distributions of the standardized log-likelihood statistic for polytomous items for paper-and-pencil (P&P) and computerized adaptive tests (CATs). Results show that the empirical distribution of the statistic differed from the assumed standard normal distribution for both P&P tests and CATs. Also…

  16. Statistical Power in Evaluations That Investigate Effects on Multiple Outcomes: A Guide for Researchers

    ERIC Educational Resources Information Center

    Porter, Kristin E.

    2018-01-01

    Researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple testing procedures (MTPs) are statistical…

  17. Effects of different centrifugation conditions on clinical chemistry and Immunology test results

    PubMed Central

    2011-01-01

    Background The effect of centrifugation time of heparinized blood samples on clinical chemistry and immunology results has rarely been studied. The WHO guideline proposed a 15 min centrifugation time without citing any scientific publications. The centrifugation time has a considerable impact on the turn-around time. Methods We investigated 74 parameters in samples from 44 patients on a Roche Cobas 6000 system to see whether there was a statistically significant difference in the test results among specimens centrifuged at 2180 g for 15 min, at 2180 g for 10 min, or at 1870 g for 7 min, respectively. Two tubes with different plasma separators (both Greiner Bio-One) were used for each centrifugation condition. Statistical comparisons were made by Deming fit. Results Tubes with different separators showed identical results in all parameters. Likewise, excellent correlations were found among tubes to which the different centrifugation conditions were applied. Fifty percent of the slopes lay between 0.99 and 1.01. Only 3.6 percent of the statistical test results fell outside the significance level of p < 0.05, which was less than the expected 5%. This suggests that the outliers are the result of random variation and the large number of statistical tests performed. Further, we found that our data are sufficient to avoid missing a biased test (beta error of 0.10 to 0.05) in most parameters. Conclusion A centrifugation time of either 7 or 10 min provided test results identical to those at the 15 min time proposed by WHO under the conditions used in our study. PMID:21569233

  18. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression

    PubMed Central

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 Chinese regions. The results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271
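
    The general construction, a standardized residual vector combined with a normalized spatial weight matrix in the manner of Moran's index, can be sketched as follows. This illustrates the idea rather than reproducing the paper's two statistics; the site coordinates, weights, and residuals are hypothetical.

```python
import numpy as np

def residual_autocorrelation(residuals, W):
    """Moran-style index for regression residuals on a spatial sample.
    With z the standardized residual vector (so z'z = n) and W normalized
    to sum to 1, Moran's I = (n / sum(W)) * z'Wz / z'z reduces to z'Wz."""
    e = np.asarray(residuals, dtype=float)
    z = (e - e.mean()) / e.std()
    W = np.array(W, dtype=float)
    np.fill_diagonal(W, 0.0)   # no self-neighbours
    return z @ (W / W.sum()) @ z

# Hypothetical inverse-distance weights for 6 sampling sites.
rng = np.random.default_rng(1)
coords = rng.uniform(size=(6, 2))
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
with np.errstate(divide="ignore"):
    W = np.where(d > 0, 1.0 / d, 0.0)
residuals = rng.normal(size=6)  # stand-in for least squares residuals
print(residual_autocorrelation(residuals, W))
```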

  19. A clinicomicrobiological study to evaluate the efficacy of manual and powered toothbrushes among autistic patients

    PubMed Central

    Vajawat, Mayuri; Deepika, P. C.; Kumar, Vijay; Rajeshwari, P.

    2015-01-01

    Aim: To compare the efficacy of powered toothbrushes with manual toothbrushes in improving gingival health and reducing salivary red complex counts among autistic individuals. Materials and Methods: Forty autistic individuals were selected. The test group received powered toothbrushes, and the control group received manual toothbrushes. Plaque index and gingival index were recorded. Unstimulated saliva was collected for analysis of red complex organisms using polymerase chain reaction. Results: A statistically significant reduction in the plaque scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.002 for controls). This reduction was statistically more significant in the test group (P = 0.024). A statistically significant reduction in the gingival scores was seen over a period of 12 weeks in both groups (P < 0.001 for tests and P = 0.001 for controls). This reduction was statistically more significant in the test group (P = 0.042). No statistically significant reduction in the detection rate of red complex organisms was seen at 4 weeks in either group. Conclusion: Powered toothbrushes result in a significant overall improvement in gingival health when constant reinforcement of oral hygiene instructions is given. PMID:26681855

  20. Test 6, Test 7, and Gas Standard Analysis Results

    NASA Technical Reports Server (NTRS)

    Perez, Horacio, III

    2007-01-01

    This viewgraph presentation shows results of analyses of odor, toxic off-gassing, and gas standards. The topics include: 1) Statistical Analysis Definitions; 2) Odor Analysis Results, NASA Standard 6001 Test 6; 3) Toxic Off-gassing Analysis Results, NASA Standard 6001 Test 7; and 4) Gas Standard Results, NASA Standard 6001 Test 7.

  1. Hypothesis-Testing Demands Trustworthy Data—A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy

    PubMed Central

    Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363

  2. Hypothesis-Testing Demands Trustworthy Data-A Simulation Approach to Inferential Statistics Advocating the Research Program Strategy.

    PubMed

    Krefeld-Schwalb, Antonia; Witte, Erich H; Zenker, Frank

    2018-01-01

    In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a "pure" Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis.

  3. Statistical evaluation of the metallurgical test data in the ORR-PSF-PVS irradiation experiment.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Stallmann, F.W.

    1984-08-01

    A statistical analysis of Charpy test results from the two-year Pressure Vessel Simulation metallurgical irradiation experiment was performed. Determinations of transition temperature and upper-shelf energy derived from computer fits compare well with eyeball fits, and uncertainties for all results can be obtained with computer fits. The results were compared with predictions in Regulatory Guide 1.99 and other irradiation damage models.

  4. Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS)

    PubMed Central

    Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M.; Khan, Ajmal

    2016-01-01

    Objective: This article compares the study designs and statistical methods used in the Pakistan Journal of Medical Sciences (PJMS) in 2005, 2010 and 2015. Methods: Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. Results: A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods was found in the analysis. The most frequent methods include: descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and Student's t-test (n=186, 43.4%). There was a significant increase in the use of statistical methods over the time period: t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. Conclusion: This study shows that a diverse variety of statistical methods has been used in the research articles of PJMS and that their frequency increased from 2005 to 2015. However, descriptive statistics was the most frequent method of statistical analysis in the published articles, while the cross-sectional design was the most common study design. PMID:27022365

  5. Learning and understanding the Kruskal-Wallis one-way analysis-of-variance-by-ranks test for differences among three or more independent groups.

    PubMed

    Chan, Y; Walmsley, R P

    1997-12-01

    When several treatment methods are available for the same problem, many clinicians are faced with the task of deciding which treatment to use. Many clinicians may have conducted informal "mini-experiments" on their own to determine which treatment is best suited for the problem. These results are usually not documented or reported in a formal manner because many clinicians feel that they are "statistically challenged." Another reason may be because clinicians do not feel they have controlled enough test conditions to warrant analysis. In this update, a statistic is described that does not involve complicated statistical assumptions, making it a simple and easy-to-use statistical method. This update examines the use of two statistics and does not deal with other issues that could affect clinical research such as issues affecting credibility. For readers who want a more in-depth examination of this topic, references have been provided. The Kruskal-Wallis one-way analysis-of-variance-by-ranks test (or H test) is used to determine whether three or more independent groups are the same or different on some variable of interest when an ordinal level of data or an interval or ratio level of data is available. A hypothetical example will be presented to explain when and how to use this statistic, how to interpret results using the statistic, the advantages and disadvantages of the statistic, and what to look for in a written report. This hypothetical example will involve the use of ratio data to demonstrate how to choose between using the nonparametric H test and the more powerful parametric F test.
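
    For readers who want to try the H test directly, SciPy implements it as shown in this small sketch; the treatment outcome values are hypothetical.

```python
from scipy import stats

# Hypothetical outcomes (e.g., range-of-motion gain in degrees) under
# three treatments for the same problem.
treatment_a = [12, 15, 11, 18, 14]
treatment_b = [22, 19, 25, 21, 24]
treatment_c = [16, 13, 17, 20, 15]

# The H test ranks all observations jointly, then asks whether the mean
# ranks differ across groups more than chance alone would allow.
h_stat, p_value = stats.kruskal(treatment_a, treatment_b, treatment_c)
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```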

  6. Cognitive predictors of balance in Parkinson's disease.

    PubMed

    Fernandes, Ângela; Mendes, Andreia; Rocha, Nuno; Tavares, João Manuel R S

    2016-06-01

    Postural instability is one of the most incapacitating symptoms of Parkinson's disease (PD) and appears to be related to cognitive deficits. This study aims to determine the cognitive factors that can predict deficits in static and dynamic balance in individuals with PD. Fifty-two individuals with PD were characterized using a sociodemographic questionnaire. The Trail Making Test, Rule Shift Cards Test, and Digit Span Test assessed the executive functions. Static balance was assessed using a plantar pressure platform, and dynamic balance was based on the Timed Up and Go Test. The results were statistically analysed using SPSS Statistics software through linear regression analysis. The results show that a statistically significant model based on cognitive outcomes was able to explain the variance of the motor variables. Also, the explanatory value of the model tended to increase with the addition of individual and clinical variables, although the resulting model was not statistically significant. The model explained 25-29% of the variability of the Timed Up and Go Test, while for the anteroposterior displacement it was 23-34%, and for the mediolateral displacement it was 24-39%. From the findings, we conclude that cognitive performance, especially the executive functions, is a predictor of balance deficits in individuals with PD.

  7. Effects of different centrifugation conditions on clinical chemistry and Immunology test results.

    PubMed

    Minder, Elisabeth I; Schibli, Adrian; Mahrer, Dagmar; Nesic, Predrag; Plüer, Kathrin

    2011-05-10

    The effect of centrifugation time of heparinized blood samples on clinical chemistry and immunology results has rarely been studied. The WHO guideline proposed a 15 min centrifugation time without citing any scientific publications. The centrifugation time has a considerable impact on the turn-around time. We investigated 74 parameters in samples from 44 patients on a Roche Cobas 6000 system to see whether there was a statistically significant difference in the test results among specimens centrifuged at 2180 g for 15 min, at 2180 g for 10 min, or at 1870 g for 7 min, respectively. Two tubes with different plasma separators (both Greiner Bio-One) were used for each centrifugation condition. Statistical comparisons were made by Deming fit. Tubes with different separators showed identical results in all parameters. Likewise, excellent correlations were found among tubes to which the different centrifugation conditions were applied. Fifty percent of the slopes lay between 0.99 and 1.01. Only 3.6 percent of the statistical test results fell outside the significance level of p < 0.05, which was less than the expected 5%. This suggests that the outliers are the result of random variation and the large number of statistical tests performed. Further, we found that our data are sufficient to avoid missing a biased test (beta error of 0.10 to 0.05) in most parameters. A centrifugation time of either 7 or 10 min provided test results identical to those at the 15 min time proposed by WHO under the conditions used in our study.

  8. Usefulness and limitations of various guinea-pig test methods in detecting human skin sensitizers-validation of guinea-pig tests for skin hypersensitivity.

    PubMed

    Marzulli, F; Maguire, H C

    1982-02-01

    Several guinea-pig predictive test methods were evaluated by comparison of results with those obtained with human predictive tests, using ten compounds that have been used in cosmetics. The method involves the statistical analysis of the frequency with which guinea-pig tests agree with the findings of tests in humans. In addition, the frequencies of false positive and false negative predictive findings are considered and statistically analysed. The results clearly demonstrate the superiority of adjuvant tests (complete Freund's adjuvant) in determining skin sensitizers and the overall superiority of the guinea-pig maximization test in providing results similar to those obtained by human testing. A procedure is suggested for utilizing adjuvant and non-adjuvant test methods for characterizing compounds as of weak, moderate or strong sensitizing potential.

  9. A Hybrid Template-Based Composite Classification System

    DTIC Science & Technology

    2009-02-01

    Hybrid Classifier: Forced Decision; 5.3.2 Forced Decision Experimental Results; 5.3.3 Test for Statistical Significance; … Results; 5.4.2 Test for Statistical Significance: NDEC Option; 5.5 Implementing the Hybrid Classifier with OOL Targets. … complementary in nature. Complementary classifiers are observed by finding an optimal method for partitioning the problem space. For example, the…

  10. Visualizing the Bayesian 2-test case: The effect of tree diagrams on medical decision making.

    PubMed

    Binder, Karin; Krauss, Stefan; Bruckmaier, Georg; Marienhagen, Jörg

    2018-01-01

    In medicine, diagnoses based on medical test results are probabilistic by nature. Unfortunately, cognitive illusions regarding the statistical meaning of test results are well documented among patients, medical students, and even physicians. There are two effective strategies that can foster insight into what is known as Bayesian reasoning situations: (1) translating the statistical information on the prevalence of a disease and the sensitivity and the false-alarm rate of a specific test for that disease from probabilities into natural frequencies, and (2) illustrating the statistical information with tree diagrams, for instance, or with other pictorial representation. So far, such strategies have only been empirically tested in combination for "1-test cases", where one binary hypothesis ("disease" vs. "no disease") has to be diagnosed based on one binary test result ("positive" vs. "negative"). However, in reality, often more than one medical test is conducted to derive a diagnosis. In two studies, we examined a total of 388 medical students from the University of Regensburg (Germany) with medical "2-test scenarios". Each student had to work on two problems: diagnosing breast cancer with mammography and sonography test results, and diagnosing HIV infection with the ELISA and Western Blot tests. In Study 1 (N = 190 participants), we systematically varied the presentation of statistical information ("only textual information" vs. "only tree diagram" vs. "text and tree diagram in combination"), whereas in Study 2 (N = 198 participants), we varied the kinds of tree diagrams ("complete tree" vs. "highlighted tree" vs. "pruned tree"). All versions were implemented in probability format (including probability trees) and in natural frequency format (including frequency trees). We found that natural frequency trees, especially when the question-related branches were highlighted, improved performance, but that none of the corresponding probabilistic visualizations did.
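
    The natural-frequency strategy for a 2-test scenario is easy to reproduce numerically: trace a hypothetical population through both tests and read the positive predictive value off the final counts. The prevalence, sensitivities, and false-alarm rates below are illustrative, not the values used in the study materials.

```python
# Natural-frequency walk-through of a hypothetical 2-test case.
population = 10_000
prevalence = 0.01            # P(disease)
sens1, far1 = 0.80, 0.10     # test 1: sensitivity, false-alarm rate
sens2, far2 = 0.90, 0.07     # test 2, applied only to test-1 positives

diseased = population * prevalence              # 100 people
healthy = population - diseased                 # 9,900 people
tp1, fp1 = diseased * sens1, healthy * far1     # 80 vs. 990 positives
tp2, fp2 = tp1 * sens2, fp1 * far2              # 72 vs. about 69 double positives

print(f"P(disease | both tests positive) = {tp2 / (tp2 + fp2):.2f}")  # ~0.51
```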

  11. Mathematics pre-service teachers’ statistical reasoning about the mean

    NASA Astrophysics Data System (ADS)

    Kristanto, Y. D.

    2018-01-01

    This article offers a descriptive qualitative analysis of 3 second-year pre-service teachers’ statistical reasoning about the mean. Twenty-six pre-service teachers were tested using an open-ended problem in which they were expected to analyze a method for finding the mean of a data set. Three of their test results were selected for analysis. The results suggest that the pre-service teachers did not use context to develop their interpretation of the mean. Therefore, this article also offers strategies to promote statistical reasoning about the mean that use various contexts.

  12. Comparison between the Laser-Badal and Vernier Optometers.

    DTIC Science & Technology

    1988-09-01

    naval aviators (SNAs). We also measured dark vergence in the same sample of SNAs. THE FINDINGS: There was no statistically significant difference found… relatively inexperienced operator. 7. The difference between mean scores on the vernier and laser-Badal optometers was statistically significant… thus indicating that test results were reliable within instruments. [Table 1, Test and Retest Statistics, reports the mean, SD, n, and t-value for each measure, including dark vergence.]

  13. Wavelet analysis in ecology and epidemiology: impact of statistical tests

    PubMed Central

    Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario

    2014-01-01

    Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the ‘beta-surrogate’ method. PMID:24284892

  14. Wavelet analysis in ecology and epidemiology: impact of statistical tests.

    PubMed

    Cazelles, Bernard; Cazelles, Kévin; Chavez, Mario

    2014-02-06

    Wavelet analysis is now frequently used to extract information from ecological and epidemiological time series. Statistical hypothesis tests are conducted on associated wavelet quantities to assess the likelihood that they are due to a random process. Such random processes represent null models and are generally based on synthetic data that share some statistical characteristics with the original time series. This allows the comparison of null statistics with those obtained from original time series. When creating synthetic datasets, different techniques of resampling result in different characteristics shared by the synthetic time series. Therefore, it becomes crucial to consider the impact of the resampling method on the results. We have addressed this point by comparing seven different statistical testing methods applied with different real and simulated data. Our results show that statistical assessment of periodic patterns is strongly affected by the choice of the resampling method, so two different resampling techniques could lead to two different conclusions about the same time series. Moreover, our results clearly show the inadequacy of resampling series generated by white noise and red noise that are nevertheless the methods currently used in the wide majority of wavelets applications. Our results highlight that the characteristics of a time series, namely its Fourier spectrum and autocorrelation, are important to consider when choosing the resampling technique. Results suggest that data-driven resampling methods should be used such as the hidden Markov model algorithm and the 'beta-surrogate' method.
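
    One resampling choice examined (and criticized) here, red noise, can be sketched as AR(1) surrogates matched to the lag-1 autocorrelation and variance of the original series. For brevity the observed statistic below is the peak periodogram power rather than a wavelet quantity, and the series is simulated.

```python
import numpy as np

def ar1_surrogates(x, n_surr=1000, seed=0):
    """Red-noise null model: AR(1) surrogates matching the lag-1
    autocorrelation and variance of x. Simple, but often inadequate for
    strongly non-Gaussian series, as the paper's comparisons show."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float) - np.mean(x)
    phi = np.corrcoef(x[:-1], x[1:])[0, 1]       # lag-1 autocorrelation
    sigma = np.std(x) * np.sqrt(1 - phi**2)      # innovation scale
    surr = np.empty((n_surr, len(x)))
    surr[:, 0] = rng.normal(0, np.std(x), n_surr)
    for t in range(1, len(x)):
        surr[:, t] = phi * surr[:, t - 1] + rng.normal(0, sigma, n_surr)
    return surr

# Empirical p-value for the maximum periodogram power of a noisy oscillation.
x = np.sin(2 * np.pi * np.arange(256) / 16) + np.random.default_rng(1).normal(size=256)
obs = np.abs(np.fft.rfft(x - x.mean()))[1:].max()
null = np.abs(np.fft.rfft(ar1_surrogates(x), axis=1))[:, 1:].max(axis=1)
print("empirical p =", (1 + np.sum(null >= obs)) / (1 + len(null)))
```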

  15. Use of the Global Test Statistic as a Performance Measurement in a Reanalysis of Environmental Health Data

    PubMed Central

    Dymova, Natalya; Hanumara, R. Choudary; Gagnon, Ronald N.

    2009-01-01

    Performance measurement is increasingly viewed as an essential component of environmental and public health protection programs. In characterizing program performance over time, investigators often observe multiple changes resulting from a single intervention across a range of categories. Although a variety of statistical tools allow evaluation of data one variable at a time, the global test statistic is uniquely suited for analyses of categories or groups of interrelated variables. Here we demonstrate how the global test statistic can be applied to environmental and occupational health data for the purpose of making overall statements on the success of targeted intervention strategies. PMID:19696393

  16. Use of the global test statistic as a performance measurement in a reanalysis of environmental health data.

    PubMed

    Dymova, Natalya; Hanumara, R Choudary; Enander, Richard T; Gagnon, Ronald N

    2009-10-01

    Performance measurement is increasingly viewed as an essential component of environmental and public health protection programs. In characterizing program performance over time, investigators often observe multiple changes resulting from a single intervention across a range of categories. Although a variety of statistical tools allow evaluation of data one variable at a time, the global test statistic is uniquely suited for analyses of categories or groups of interrelated variables. Here we demonstrate how the global test statistic can be applied to environmental and occupational health data for the purpose of making overall statements on the success of targeted intervention strategies.

  17. Reporting Practices and Use of Quantitative Methods in Canadian Journal Articles in Psychology.

    PubMed

    Counsell, Alyssa; Harlow, Lisa L

    2017-05-01

    With recent focus on the state of research in psychology, it is essential to assess the nature of the statistical methods and analyses used and reported by psychological researchers. To that end, we investigated the prevalence of different statistical procedures and the nature of statistical reporting practices in recent articles from the four major Canadian psychology journals. The majority of authors evaluated their research hypotheses through the use of analysis of variance (ANOVA), t-tests, and multiple regression. Multivariate approaches were less common. Null hypothesis significance testing remains a popular strategy, but the majority of authors reported a standardized or unstandardized effect size measure alongside their significance test results. Confidence intervals on effect sizes were infrequently employed. Many authors provided minimal details about their statistical analyses, and fewer than a third of the articles reported on data complications such as missing data and violations of statistical assumptions. Strengths of, and areas needing improvement in, the reporting of quantitative results are highlighted. The paper concludes with recommendations for how researchers and reviewers can improve comprehension and transparency in statistical reporting.

  18. Improved Test Planning and Analysis Through the Use of Advanced Statistical Methods

    NASA Technical Reports Server (NTRS)

    Green, Lawrence L.; Maxwell, Katherine A.; Glass, David E.; Vaughn, Wallace L.; Barger, Weston; Cook, Mylan

    2016-01-01

    The goal of this work is, through computational simulations, to provide statistically-based evidence to convince the testing community that a distributed testing approach is superior to a clustered testing approach for most situations. For clustered testing, numerous, repeated test points are acquired at a limited number of test conditions. For distributed testing, only one or a few test points are requested at many different conditions. The statistical techniques of Analysis of Variance (ANOVA), Design of Experiments (DOE) and Response Surface Methods (RSM) are applied to enable distributed test planning, data analysis and test augmentation. The D-Optimal class of DOE is used to plan an optimally efficient single- and multi-factor test. The resulting simulated test data are analyzed via ANOVA and a parametric model is constructed using RSM. Finally, ANOVA can be used to plan a second round of testing to augment the existing data set with new data points. The use of these techniques is demonstrated through several illustrative examples. To date, many thousands of comparisons have been performed and the results strongly support the conclusion that the distributed testing approach outperforms the clustered testing approach.
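
    The RSM step can be illustrated compactly: with single observations distributed over many conditions, an ordinary least squares fit of a full quadratic model recovers the underlying response surface. The data below are simulated for illustration and are not the study's simulations or tooling.

```python
import numpy as np

rng = np.random.default_rng(0)
# Distributed testing: one observation at each of 50 distinct conditions.
x1, x2 = rng.uniform(-1, 1, 50), rng.uniform(-1, 1, 50)
y = 3 + 2 * x1 - x2 + 0.5 * x1 * x2 + x2**2 + rng.normal(0, 0.1, 50)

# Full quadratic response-surface model: intercept, linear, interaction,
# and pure quadratic terms.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))   # should approximate [3, 2, -1, 0.5, 0, 1]
```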

  19. Outcomes Definitions and Statistical Tests in Oncology Studies: A Systematic Review of the Reporting Consistency.

    PubMed

    Rivoirard, Romain; Duplay, Vianney; Oriol, Mathieu; Tinquaut, Fabien; Chauvin, Franck; Magne, Nicolas; Bourmaud, Aurelie

    2016-01-01

    Quality of reporting for Randomized Clinical Trials (RCTs) in oncology was analyzed in several systematic reviews, but, in this setting, there is a paucity of data on the outcomes definitions and consistency of reporting for statistical tests in RCTs and Observational Studies (OBS). The objective of this review was to describe those two reporting aspects for OBS and RCTs in oncology. From a list of 19 medical journals, three were retained for analysis after a random selection: British Medical Journal (BMJ), Annals of Oncology (AoO) and British Journal of Cancer (BJC). All original articles published between March 2009 and March 2014 were screened. Only studies whose main outcome was accompanied by a corresponding statistical test were included in the analysis. Studies based on censored data were excluded. The primary outcome was to assess quality of reporting for the description of the primary outcome measure in RCTs and of the variables of interest in OBS. A logistic regression was performed to identify covariates of studies potentially associated with concordance of tests between the Methods and Results parts. 826 studies were included in the review, and 698 were OBS. Variables were described in the Methods section for all OBS studies, and the primary endpoint was clearly detailed in the Methods section for 109 RCTs (85.2%). 295 OBS (42.2%) and 43 RCTs (33.6%) had perfect agreement for the reported statistical test between the Methods and Results parts. In multivariable analysis, the variable "number of included patients in study" was associated with test consistency: the aOR (adjusted Odds Ratio) for the third group compared to the first group was aOR Grp3 = 0.52 [0.31-0.89] (P value = 0.009). Variables in OBS and the primary endpoint in RCTs are reported and described with high frequency. However, statistical test consistency between the Methods and Results sections of OBS is not always observed. Therefore, we encourage authors and peer reviewers to verify the consistency of statistical tests in oncology studies.

  20. Relationship between sitting volleyball performance and field fitness of sitting volleyball players in Korea

    PubMed Central

    Jeoung, Bogja

    2017-01-01

    The purpose of this study was to evaluate the relationship between sitting volleyball performance and the field fitness of sitting volleyball players. Forty-five elite sitting volleyball players participated in 10 field fitness tests. Additionally, the players’ head coach and coach assessed their volleyball performance (receive and defense, block, attack, and serve). Data were analyzed with SPSS software version 21 using correlation and regression analyses, and the significance level was set at P < 0.05. The results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, sprint, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on the players’ abilities to attack, serve, and block. Grip strength, t-test, speed, and agility showed a statistically significant relationship with the players’ skill at defense and receive. Our results showed that chest pass, overhand throw, one-hand throw, one-hand side throw, speed endurance, reaction time, and graded exercise test results had a statistically significant influence on volleyball performance. PMID:29326896

  1. Determining Differences in Efficacy of Two Disinfectants Using t-Tests.

    ERIC Educational Resources Information Center

    Brehm, Michael A.; And Others

    1996-01-01

    Presents an experiment to compare the effectiveness of 95% ethanol to 20% bleach as disinfectants using t-tests for the statistical analysis of the data. Reports that bleach is a better disinfectant. Discusses the statistical and practical significance of the results. (JRH)
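
    A minimal version of such a comparison, with hypothetical colony counts standing in for the article's data:

```python
from scipy import stats

# Hypothetical surviving-colony counts for plates treated with each agent.
ethanol_95 = [14, 11, 17, 13, 16, 12]
bleach_20 = [6, 9, 4, 8, 7, 5]

# Welch's variant (equal_var=False) avoids assuming equal variances.
t_stat, p_value = stats.ttest_ind(ethanol_95, bleach_20, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```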

  2. Effect of non-normality on test statistics for one-way independent groups designs.

    PubMed

    Cribbie, Robert A; Fiksenbaum, Lisa; Keselman, H J; Wilcox, Rand R

    2012-02-01

    The data obtained from one-way independent groups designs are typically non-normal in form and rarely equally variable across treatment populations (i.e., population variances are heterogeneous). Consequently, the classical test statistic that is used to assess statistical significance (i.e., the analysis of variance F test) typically provides invalid results (e.g., too many Type I errors, reduced power). For this reason, there has been considerable interest in finding a test statistic that is appropriate under conditions of non-normality and variance heterogeneity. Previously recommended procedures for analysing such data include the James test, the Welch test applied either to the usual least squares estimators of central tendency and variability, or the Welch test with robust estimators (i.e., trimmed means and Winsorized variances). A new statistic proposed by Krishnamoorthy, Lu, and Mathew, intended to deal with heterogeneous variances, though not non-normality, uses a parametric bootstrap procedure. In their investigation of the parametric bootstrap test, the authors examined its operating characteristics under limited conditions and did not compare it to the Welch test based on robust estimators. Thus, we investigated how the parametric bootstrap procedure and a modified parametric bootstrap procedure based on trimmed means perform relative to previously recommended procedures when data are non-normal and heterogeneous. The results indicated that the tests based on trimmed means offer the best Type I error control and power when variances are unequal and at least some of the distribution shapes are non-normal. © 2011 The British Psychological Society.
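
    Several of the procedures discussed are available in SciPy 1.7 or later: the Alexander-Govern test for heteroscedastic one-way data, and Yuen's trimmed-means variant of the two-sample t-test via the trim argument. A brief sketch on simulated skewed, heteroscedastic groups (the parametric bootstrap procedures themselves are not shown):

```python
import numpy as np
from scipy import stats

# Skewed groups with unequal spread, where the classical F test misbehaves.
rng = np.random.default_rng(2)
g1 = rng.lognormal(0.0, 0.4, 30)
g2 = rng.lognormal(0.0, 0.4, 30) * 3   # shifted and much more variable
g3 = rng.lognormal(0.0, 0.4, 30)

print(stats.f_oneway(g1, g2, g3))           # classical ANOVA F test
print(stats.alexandergovern(g1, g2, g3))    # heteroscedastic alternative
print(stats.ttest_ind(g1, g2, equal_var=False, trim=0.2))  # Yuen-type test
```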

  3. Robust multivariate nonparametric tests for detection of two-sample location shift in clinical trials

    PubMed Central

    Jiang, Xuejun; Guo, Xu; Zhang, Ning; Wang, Bo

    2018-01-01

    This article presents and investigates the performance of a series of robust multivariate nonparametric tests for detection of location shift between two multivariate samples in randomized controlled trials. The tests are built upon robust estimators of distribution locations (medians, Hodges-Lehmann estimators, and an extended U statistic), in both unscaled and scaled versions. The nonparametric tests are robust to outliers and do not assume that the two samples are drawn from multivariate normal distributions. Bootstrap and permutation approaches are introduced for determining the p-values of the proposed test statistics. Simulation studies are conducted and numerical results are reported to examine the performance of the proposed statistical tests. The numerical results demonstrate that the robust multivariate nonparametric tests constructed from the Hodges-Lehmann estimators are more efficient than those based on medians and the extended U statistic. The permutation approach can provide a more stringent control of Type I error and is generally more powerful than the bootstrap procedure. The proposed robust nonparametric tests are applied to detect multivariate distributional difference between the intervention and control groups in the Thai Healthy Choices study and examine the intervention effect of a four-session motivational interviewing-based intervention developed in the study to reduce risk behaviors among youth living with HIV. PMID:29672555

  4. Statistical significance test for transition matrices of atmospheric Markov chains

    NASA Technical Reports Server (NTRS)

    Vautard, Robert; Mo, Kingtse C.; Ghil, Michael

    1990-01-01

    Low-frequency variability of large-scale atmospheric dynamics can be represented schematically by a Markov chain of multiple flow regimes. This Markov chain contains useful information for the long-range forecaster, provided that the statistical significance of the associated transition matrix can be reliably tested. Monte Carlo simulation yields a very reliable significance test for the elements of this matrix. The results of this test agree with previously used empirical formulae when each cluster of maps identified as a distinct flow regime is sufficiently large and when they all contain a comparable number of maps. Monte Carlo simulation provides a more reliable way to test the statistical significance of transitions to and from small clusters. It can determine the most likely transitions, as well as the most unlikely ones, with a prescribed level of statistical significance.
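
    A bare-bones Monte Carlo test in this spirit tabulates the observed regime transitions and compares each count with its distribution under a null of independent draws from the marginal regime frequencies. The regime sequence is hypothetical, and the sketch is deliberately simpler than the paper's procedure.

```python
import numpy as np

def transition_pvalues(seq, n_regimes, n_sim=10_000, seed=0):
    """One-sided Monte Carlo p-values for 'unusually frequent' transitions,
    under a null in which successive regimes are drawn independently from
    the marginal regime frequencies."""
    rng = np.random.default_rng(seed)
    seq = np.asarray(seq)
    obs = np.zeros((n_regimes, n_regimes), dtype=int)
    np.add.at(obs, (seq[:-1], seq[1:]), 1)    # observed transition counts
    probs = np.bincount(seq, minlength=n_regimes) / len(seq)
    exceed = np.zeros_like(obs)
    for _ in range(n_sim):
        sim = rng.choice(n_regimes, size=len(seq), p=probs)
        cnt = np.zeros_like(obs)
        np.add.at(cnt, (sim[:-1], sim[1:]), 1)
        exceed += cnt >= obs
    return (1 + exceed) / (1 + n_sim)

# Hypothetical sequence of daily flow-regime labels (0, 1, 2).
seq = np.array([0, 0, 1, 1, 1, 2, 0, 0, 1, 2, 2, 0, 1, 1, 2, 0, 0, 0, 1, 2])
print(np.round(transition_pvalues(seq, 3, n_sim=2000), 3))
```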

  5. Planck 2015 results. XVI. Isotropy and statistics of the CMB

    NASA Astrophysics Data System (ADS)

    Planck Collaboration; Ade, P. A. R.; Aghanim, N.; Akrami, Y.; Aluri, P. K.; Arnaud, M.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Basak, S.; Battaner, E.; Benabed, K.; Benoît, A.; Benoit-Lévy, A.; Bernard, J.-P.; Bersanelli, M.; Bielewicz, P.; Bock, J. J.; Bonaldi, A.; Bonavera, L.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Boulanger, F.; Bucher, M.; Burigana, C.; Butler, R. C.; Calabrese, E.; Cardoso, J.-F.; Casaponsa, B.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, H. C.; Christensen, P. R.; Church, S.; Clements, D. L.; Colombi, S.; Colombo, L. P. L.; Combet, C.; Contreras, D.; Couchot, F.; Coulais, A.; Crill, B. P.; Cruz, M.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R. D.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Donzelli, S.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Efstathiou, G.; Elsner, F.; Enßlin, T. A.; Eriksen, H. K.; Fantaye, Y.; Fergusson, J.; Fernandez-Cobos, R.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Frejsel, A.; Frolov, A.; Galeotta, S.; Galli, S.; Ganga, K.; Gauthier, C.; Ghosh, T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Hanson, D.; Harrison, D. L.; Henrot-Versillé, S.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Holmes, W. A.; Hornstrup, A.; Hovest, W.; Huang, Z.; Huffenberger, K. M.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Juvela, M.; Keihänen, E.; Keskitalo, R.; Kim, J.; Kisner, T. S.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Lesgourgues, J.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; Liu, H.; López-Caniego, M.; Lubin, P. M.; Macías-Pérez, J. F.; Maggio, G.; Maino, D.; Mandolesi, N.; Mangilli, A.; Marinucci, D.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; McGehee, P.; Meinhold, P. R.; Melchiorri, A.; Mendes, L.; Mennella, A.; Migliaccio, M.; Mikkelsen, K.; Mitra, S.; Miville-Deschênes, M.-A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C. B.; Nørgaard-Nielsen, H. U.; Noviello, F.; Novikov, D.; Novikov, I.; Oxborrow, C. A.; Paci, F.; Pagano, L.; Pajot, F.; Pant, N.; Paoletti, D.; Pasian, F.; Patanchon, G.; Pearson, T. J.; Perdereau, O.; Perotto, L.; Perrotta, F.; Pettorino, V.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Popa, L.; Pratt, G. W.; Prézeau, G.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Rotti, A.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Sandri, M.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Seiffert, M. D.; Shellard, E. P. S.; Souradeep, T.; Spencer, L. D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Trombetti, T.; Tucci, M.; Tuovinen, J.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Vielva, P.; Villa, F.; Wade, L. A.; Wandelt, B. D.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zibin, J. P.; Zonca, A.

    2016-09-01

    We test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The "Cold Spot" is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.

  6. Planck 2015 results: XVI. Isotropy and statistics of the CMB

    DOE PAGES

    Ade, P. A. R.; Aghanim, N.; Akrami, Y.; ...

    2016-09-20

    In this paper, we test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The “Cold Spot” is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Finally, where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.

  7. The Thurgood Marshall School of Law Empirical Findings: A Report of the Watson-Glaser for the 2009-2010 Test Takers

    ERIC Educational Resources Information Center

    Kadhi, T.; Palasota, A.; Holley, D.; Rudley, D.

    2010-01-01

    The following report gives the statistical findings of the 2009-2010 Watson-Glaser test. The data are pre-existing and were given to the Evaluator by email from the Director, Center for Legal Pedagogy. Statistical analyses were run using SPSS 17 to address the following questions: 1. What are the statistical descriptors of the Watson-Glaser results of…

  8. Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test

    PubMed Central

    2013-01-01

    Background The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. Results One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. Conclusion We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disorder (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known. PMID:24199751
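
    The summary-statistic idea admits a compact sketch: if the vector of marker Z-statistics is multivariate normal with correlation matrix R under the null (with R estimated from a reference panel such as HapMap), the quadratic form z'R^{-1}z gives a chi-squared region test. This illustrates the general principle under those assumptions, not the authors' exact test; the Z-values and R below are hypothetical.

```python
import numpy as np
from scipy import stats

def summary_chi2_test(z, R):
    """Region-level test from marker Z-statistics alone: under the null,
    z ~ N(0, R), so z' R^{-1} z is chi-squared with k degrees of freedom."""
    z = np.asarray(z, dtype=float)
    stat = z @ np.linalg.solve(np.asarray(R, dtype=float), z)
    return stat, stats.chi2.sf(stat, df=len(z))

# Hypothetical Z-statistics for 4 markers in a region, with assumed LD.
z = [2.2, 1.9, -0.4, 1.1]
R = np.array([[1.0, 0.6, 0.2, 0.1],
              [0.6, 1.0, 0.3, 0.2],
              [0.2, 0.3, 1.0, 0.5],
              [0.1, 0.2, 0.5, 1.0]])
print(summary_chi2_test(z, R))
```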

  9. Cycom 977-2 Composite Material: Impact Test Results (workshop presentation)

    NASA Technical Reports Server (NTRS)

    Engle, Carl; Herald, Stephen; Watkins, Casey

    2005-01-01

    Contents include the following: Ambient (13A) tests of Cycom 977-2 impact characteristics by the Bruceton statistical method at MSFC and WSTF. Repeat (13A) tests of tested Cycom from phase I at MSFC to extend the testing statistical database. Conduct high-pressure tests (13B) in liquid oxygen (LOX) and GOX at MSFC and WSTF to determine Cycom reaction characteristics and batch effects. Conduct extended ambient (13A) LOX tests at MSFC and high-pressure (13B) testing to determine pressure effects in LOX. Expand the 13B GOX database.

  10. An Analysis of Operational Suitability for Test and Evaluation of Highly Reliable Systems

    DTIC Science & Technology

    1994-03-04

    Exposition," Journal of the American Statistical A iation-59: 353-375 (June 1964). 17. SYS 229, Test and Evaluation Management Coursebook , School of Systems...in hours, 0 is 2-5 the desired MTBCF in hours, R is the number of critical failures, and a is the P[type-I error] of the X2 statistic with 2*R+2...design of experiments (DOE) tables and the use of Bayesian statistics to increase the confidence level of the test results that will be obtained from

  11. Statistical Tutorial | Center for Cancer Research

    Cancer.gov

    Recent advances in cancer biology have resulted in the need for increased statistical analysis of research data. The Statistical Tutorial (ST) is designed as a follow-up to Statistical Analysis of Research Data (SARD), held in April 2018. The tutorial will apply the general principles of statistical analysis of research data, including descriptive statistics, z- and t-tests of means and mean…

  12. Evaluating Structural Equation Models for Categorical Outcomes: A New Test Statistic and a Practical Challenge of Interpretation.

    PubMed

    Monroe, Scott; Cai, Li

    2015-01-01

    This research is concerned with two topics in assessing model fit for categorical data analysis. The first topic involves the application of a limited-information overall test, introduced in the item response theory literature, to structural equation modeling (SEM) of categorical outcome variables. Most popular SEM test statistics assess how well the model reproduces estimated polychoric correlations. In contrast, limited-information test statistics assess how well the underlying categorical data are reproduced. Here, the recently introduced C2 statistic of Cai and Monroe (2014) is applied. The second topic concerns how the root mean square error of approximation (RMSEA) fit index can be affected by the number of categories in the outcome variable. This relationship creates challenges for interpreting RMSEA. While the two topics initially appear unrelated, they may conveniently be studied in tandem since RMSEA is based on an overall test statistic, such as C2. The results are illustrated with an empirical application to data from a large-scale educational survey.

  13. An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

    PubMed

    Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

    2002-06-01

    An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and then identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), namely sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV), are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between the results of an NTM and RTM is random, the sum of each complementary pair of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal. Given that combinations of high sensitivity-low specificity or low sensitivity-high specificity (i.e., the sum of the sensitivity and specificity approximately equal to 1) indicate lack of predictive capacity, an NTM having these performance characteristics should be considered no better at predicting toxicity than chance alone.

  14. A shift from significance test to hypothesis test through power analysis in medical research.

    PubMed

    Singh, G

    2006-01-01

    Until recently, medical research literature exhibited a substantial dominance of Fisher's significance test approach to statistical inference, which concentrates on the probability of type I error, over the Neyman-Pearson hypothesis test approach, which considers the probabilities of both type I and type II error. Fisher's approach dichotomises results into significant or non-significant with a P value. The Neyman-Pearson approach talks of acceptance or rejection of the null hypothesis. Based on the same theory, the two approaches address the same objective and conclude in their own ways. The advancement in computing techniques and the availability of statistical software have resulted in the increasing application of power calculations in medical research, and thereby in reporting the results of significance tests in the light of the power of the test as well. The significance test approach, when it incorporates power analysis, contains the essence of the hypothesis test approach. It may be safely argued that the rising application of power analysis in medical research may have initiated a shift from Fisher's significance test to the Neyman-Pearson hypothesis test procedure.

  15. Rolling-Element Fatigue Testing and Data Analysis - A Tutorial

    NASA Technical Reports Server (NTRS)

    Vlcek, Brian L.; Zaretsky, Erwin V.

    2011-01-01

    In order to rank bearing materials, lubricants and other design variables using rolling-element bench-type fatigue testing of bearing components and full-scale rolling-element bearing tests, the investigator needs to be cognizant of the variables that affect rolling-element fatigue life and be able to maintain and control them within an acceptable experimental tolerance. Once these variables are controlled, the number of tests and the test conditions must be specified to assure reasonable statistical certainty of the final results. There is a reasonable correlation between the results from elemental test rigs and those obtained with full-scale bearings. Using the statistical methods of W. Weibull and L. Johnson, the minimum number of tests required can be determined. This paper brings together and discusses the technical aspects of rolling-element fatigue testing and data analysis, as well as making recommendations to assure quality and reliable testing of rolling-element specimens and full-scale rolling-element bearings.
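
    A minimal sketch of the Weibull life analysis referred to above, using maximum-likelihood fitting in place of the Weibull/Johnson median-rank procedure, with hypothetical fatigue lives:

        import numpy as np
        from scipy import stats

        # Hypothetical fatigue lives (millions of stress cycles) from a bench test.
        lives = np.array([12.1, 18.4, 23.0, 27.5, 33.2, 41.8, 55.0, 72.3])

        # Fit a two-parameter Weibull (location fixed at zero, as is customary
        # for fatigue-life data).
        shape, loc, scale = stats.weibull_min.fit(lives, floc=0)

        # L10 life: the life that 90% of specimens are expected to exceed.
        L10 = stats.weibull_min.ppf(0.10, shape, loc=loc, scale=scale)
        print(f"Weibull slope (shape) = {shape:.2f}, characteristic life = {scale:.1f}")
        print(f"L10 life = {L10:.1f} million cycles")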

  16. A general statistical test for correlations in a finite-length time series.

    PubMed

    Hanson, Jeffery A; Yang, Haw

    2008-06-07

    The statistical properties of the autocorrelation function of a time series composed of independently and identically distributed stochastic variables have been studied. Analytical expressions for the autocorrelation function's variance have been derived. It has been found that two common ways of calculating the autocorrelation, moving-average and Fourier transform, exhibit different uncertainty characteristics. For periodic time series, the Fourier transform method is preferred because it gives smaller uncertainties that are uniform through all time lags. Based on these analytical results, a statistically robust method has been proposed to test the existence of correlations in a time series. The statistical test is verified by computer simulations and an application to single-molecule fluorescence spectroscopy is discussed.
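
    The two estimators contrasted here are short enough to write down; the sketch below (synthetic i.i.d. data) computes the autocorrelation both by direct moving-average summation and via the Fourier transform. Zero-padding makes the FFT route reproduce the direct estimator, whereas an unpadded (circular) transform would be the natural choice for a periodic series:

        import numpy as np

        rng = np.random.default_rng(0)
        x = rng.normal(size=1024)           # i.i.d. series: no true correlation
        x = x - x.mean()
        n = len(x)

        # Moving-average (direct) estimator of the autocorrelation.
        acf_direct = np.correlate(x, x, mode="full")[n - 1:] / (x @ x)

        # Fourier-transform estimator via the Wiener-Khinchin relation;
        # zero-padding to 2n avoids wrap-around (drop the padding for a
        # truly periodic series).
        f = np.fft.fft(x, n=2 * n)
        acf_fft = np.fft.ifft(f * np.conj(f)).real[:n] / (x @ x)

        print(np.allclose(acf_direct, acf_fft))   # the two estimators agree here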

  17. Cognitive Stimulation of Elderly Residents in Social Protection Centers in Cartagena, 2014.

    PubMed

    Melguizo Herrera, Estela; Bertel De La Hoz, Anyel; Paternina Osorio, Diego; Felfle Fuentes, Yurani; Porto Osorio, Leidy

    To determine the effectiveness of a program of cognitive stimulation of the elderly residents in Social Protection Centers in Cartagena, 2014. Quasi-experimental study with pre- and post-tests in control and experimental groups. A sample of 37 elderly residents in Social Protection Centers participated: 23 in the experimental group and 14 in the control group. A survey and a mental evaluation test (Pfeiffer) were applied. The experimental group participated in 10 sessions of cognitive stimulation. The paired t-test showed statistically significant differences in the Pfeiffer test, pre- and post-intervention, in the experimental group (P=.0005). The unpaired t-test showed statistically significant differences in Pfeiffer test results between the experimental and control groups (P=.0450). Principal component analysis showed that the most interrelated variables were age, diseases, number of errors and test results, which were grouped around the disease variable with a negative association. The intervention demonstrated a statistically significant improvement in the cognitive functionality of the elderly. Nursing can lead this type of intervention. It should be studied further to strengthen and clarify these results. Copyright © 2016 Asociación Colombiana de Psiquiatría. Published by Elsevier España. All rights reserved.

  18. A basic introduction to statistics for the orthopaedic surgeon.

    PubMed

    Bertrand, Catherine; Van Riet, Roger; Verstreken, Frederik; Michielsen, Jef

    2012-02-01

    Orthopaedic surgeons should review the orthopaedic literature in order to keep pace with the latest insights and practices. A good understanding of basic statistical principles is of crucial importance to the ability to read articles critically, to interpret results and to arrive at correct conclusions. This paper explains some of the key concepts in statistics, including hypothesis testing, Type I and Type II errors, testing of normality, sample size and p values.

  19. Descriptive and inferential statistical methods used in burns research.

    PubMed

    Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars

    2010-05-01

    Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11 (22%) were randomised controlled trials, 18 (35%) were cohort studies, 11 (22%) were case control studies and 11 (22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49 (96%) articles. Data dispersion was calculated by standard deviation in 30 (59%). Standard error of the mean was quoted in 19 (37%). The statistical software product was named in 33 (65%). Of the 49 articles that used inferential statistics, the tests were named in 47 (96%). The 6 most common tests used (Student's t-test (53%), analysis of variance/co-variance (33%), χ² test (27%), Wilcoxon and Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43 (88%) and the exact significance levels were reported in 28 (57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data. Advice should be sought from professionals in the fields of biostatistics and epidemiology when using more advanced statistical techniques. Copyright 2009 Elsevier Ltd and ISBI. All rights reserved.

  20. Why Current Statistics of Complementary Alternative Medicine Clinical Trials is Invalid.

    PubMed

    Pandolfi, Maurizio; Carreras, Giulia

    2018-06-07

    It is not sufficiently known that frequentist statistics cannot provide direct information on the probability that the research hypothesis tested is correct. The error resulting from this misunderstanding is compounded when the hypotheses under scrutiny have precarious scientific bases, as those of complementary alternative medicine (CAM) generally do. In such cases it is mandatory to use inferential statistics that consider the prior probability that the hypothesis tested is true, such as Bayesian statistics. The authors show that, under such circumstances, no real statistical significance can be achieved in CAM clinical trials. In this respect, CAM trials involving human material are also hardly defensible from an ethical viewpoint.

  1. Explanation of Two Anomalous Results in Statistical Mediation Analysis.

    PubMed

    Fritz, Matthew S; Taylor, Aaron B; Mackinnon, David P

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special concern as the bias-corrected bootstrap is often recommended and used due to its higher statistical power compared with other tests. The second result is statistical power reaching an asymptote far below 1.0 and in some conditions even declining slightly as the size of the relationship between X and M, a, increased. Two computer simulations were conducted to examine these findings in greater detail. Results from the first simulation found that the increased Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap are a function of an interaction between the size of the individual paths making up the mediated effect and the sample size, such that elevated Type I error rates occur when the sample size is small and the effect size of the nonzero path is medium or larger. Results from the second simulation found that stagnation and decreases in statistical power as a function of the effect size of the a path occurred primarily when the path between M and Y, b, was small. Two empirical mediation examples are provided using data from a steroid prevention and health promotion program aimed at high school football players (Athletes Training and Learning to Avoid Steroids; Goldberg et al., 1996), one to illustrate a possible Type I error for the bias-corrected bootstrap test and a second to illustrate a loss in power related to the size of a. Implications of these findings are discussed.
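
    A compact sketch of the bias-corrected bootstrap test discussed above, applied to the indirect effect ab in a simple X to M to Y model (synthetic data; bias correction without the acceleration term):

        import numpy as np
        from scipy import stats

        def indirect_effect(X, M, Y):
            """a*b from two OLS fits: M ~ X and Y ~ X + M."""
            a = np.polyfit(X, M, 1)[0]
            Xmat = np.column_stack([np.ones_like(X), X, M])
            b = np.linalg.lstsq(Xmat, Y, rcond=None)[0][2]
            return a * b

        rng = np.random.default_rng(1)
        n = 100
        X = rng.normal(size=n)
        M = 0.4 * X + rng.normal(size=n)      # true a = 0.4
        Y = 0.3 * M + rng.normal(size=n)      # true b = 0.3

        est = indirect_effect(X, M, Y)
        boots = np.empty(2000)
        for i in range(boots.size):
            idx = rng.integers(0, n, n)
            boots[i] = indirect_effect(X[idx], M[idx], Y[idx])

        # Bias correction: z0 measures how far the point estimate sits from
        # the median of the bootstrap distribution.
        z0 = stats.norm.ppf(np.mean(boots < est))
        lo, hi = stats.norm.cdf(2 * z0 + stats.norm.ppf([0.025, 0.975]))
        ci = np.quantile(boots, [lo, hi])
        print(f"ab = {est:.3f}, 95% BC bootstrap CI = [{ci[0]:.3f}, {ci[1]:.3f}]")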

  2. 41 CFR 105-50.202-1 - Copies of statistical or other studies.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 41 Public Contracts and Property Management 3 2011-01-01 2011-01-01 false Copies of statistical or... Services Administration § 105-50.202-1 Copies of statistical or other studies. This material includes a copy of any existing statistical or other studies and compilations, results of technical tests and...

  3. 41 CFR 105-50.202-1 - Copies of statistical or other studies.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 41 Public Contracts and Property Management 3 2010-07-01 2010-07-01 false Copies of statistical or... Services Administration § 105-50.202-1 Copies of statistical or other studies. This material includes a copy of any existing statistical or other studies and compilations, results of technical tests and...

  4. [Clinical research IV. Relevancy of the statistical test chosen].

    PubMed

    Talavera, Juan O; Rivas-Ruiz, Rodolfo

    2011-01-01

    When we look at the difference between two therapies or the association of a risk factor or prognostic indicator with its outcome, we need to evaluate the accuracy of the result. This assessment is based on a judgment that uses information about the study design and the statistical management of the information. This paper specifically addresses the relevance of the statistical test selected. Statistical tests are chosen mainly on two characteristics: the objective of the study and the type of variables. The objective can be divided into three test groups: a) those in which you want to show differences between groups, or within a group before and after a maneuver; b) those that seek to show the relationship (correlation) between variables; and c) those that aim to predict an outcome. The types of variables are divided into two: quantitative (continuous and discontinuous) and qualitative (ordinal and dichotomous). For example, if we seek to demonstrate differences in age (quantitative variable) among patients with systemic lupus erythematosus (SLE) with and without neurological disease (two groups), the appropriate test is the "Student t test for independent samples." But if the comparison is about the frequency of females (binomial variable), then the appropriate statistical test is the χ² test.
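
    The worked example maps directly onto standard library calls; a sketch with made-up numbers, using scipy's ttest_ind and chi2_contingency:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(2)

        # Quantitative outcome, two groups: Student's t-test for independent
        # samples (e.g., age in SLE with vs without neurological disease).
        age_with = rng.normal(42, 8, size=30)
        age_without = rng.normal(38, 8, size=35)
        t, p_t = stats.ttest_ind(age_with, age_without)

        # Dichotomous outcome, two groups: chi-squared test
        # (e.g., frequency of females in the two groups).
        table = np.array([[18, 12],    # females / males, group 1
                          [20, 15]])   # females / males, group 2
        chi2, p_chi2, dof, _ = stats.chi2_contingency(table)

        print(f"t-test: t = {t:.2f}, p = {p_t:.3f}")
        print(f"chi-squared: chi2 = {chi2:.2f}, p = {p_chi2:.3f}")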

  5. Nurse Education, Center of Excellence for Remote and Medically Under-Served Areas (CERMUSA)

    DTIC Science & Technology

    2014-04-01

    Didactic portion: online pre-test and post-test; online survey. Statistical validity: the study investigators enrolled 134 participants... NURSE.SAS, datacut 2014-01-17, generated 2014-01-26. Table 2.09: Summary Statistics, "Does your curriculum teach students how to develop a personal..." ...disasters. Pre-test/post-test results indicated that the delivery of didactic material via an online course management system is an effective

  6. A powerful approach for association analysis incorporating imprinting effects

    PubMed Central

    Xia, Fan; Zhou, Ji-Yuan; Fung, Wing Kam

    2011-01-01

    Motivation: For a diallelic marker locus, the transmission disequilibrium test (TDT) is a simple and powerful design for genetic studies. The TDT was originally proposed for use in families with both parents available (complete nuclear families) and has been extended to the 1-TDT for use in families with only one parent available (incomplete nuclear families). The increasing interest in the influence of parental imprinting on heritability indicates the importance of incorporating imprinting effects into the mapping of association variants. Results: In this article, we extend the TDT-type statistics to incorporate imprinting effects and develop a series of new test statistics in a general two-stage framework for association studies. Our test statistics retain the nature of family-based designs, requiring no assumption of Hardy–Weinberg equilibrium. The proposed methods also accommodate complete and incomplete nuclear families with one or more affected children. In the simulation study, we verify the validity of the proposed test statistics under various scenarios and compare their powers with those of some existing test statistics. It is shown that our methods greatly improve the power for detecting association in the presence of imprinting effects. We further demonstrate the advantage of our methods by applying the proposed test statistics to a rheumatoid arthritis dataset. Contact: wingfung@hku.hk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21798962

  7. Explanation of Two Anomalous Results in Statistical Mediation Analysis

    ERIC Educational Resources Information Center

    Fritz, Matthew S.; Taylor, Aaron B.; MacKinnon, David P.

    2012-01-01

    Previous studies of different methods of testing mediation models have consistently found two anomalous results. The first result is elevated Type I error rates for the bias-corrected and accelerated bias-corrected bootstrap tests not found in nonresampling tests or in resampling tests that did not include a bias correction. This is of special…

  8. Exploring the Replicability of a Study's Results: Bootstrap Statistics for the Multivariate Case.

    ERIC Educational Resources Information Center

    Thompson, Bruce

    Conventional statistical significance tests do not inform the researcher regarding the likelihood that results will replicate. One strategy for evaluating result replication is to use a "bootstrap" resampling of a study's data so that the stability of results across numerous configurations of the subjects can be explored. This paper…

  9. Statistical characterization of the fatigue behavior of composite lamina

    NASA Technical Reports Server (NTRS)

    Yang, J. N.; Jones, D. L.

    1979-01-01

    A theoretical model was developed to statistically predict the effects of constant and variable amplitude fatigue loadings on the residual strength and fatigue life of composite lamina. The parameters in the model were established from the results of a series of static tensile tests and a fatigue scan, and a number of verification tests were performed. Abstracts for two other papers on the effect of load sequence on the statistical fatigue of composites are also presented.

  10. Validation of a modification to Performance-Tested Method 070601: Reveal Listeria Test for detection of Listeria spp. in selected foods and selected environmental samples.

    PubMed

    Alles, Susan; Peng, Linda X; Mozola, Mark A

    2009-01-01

    A modification to Performance-Tested Method (PTM) 070601, Reveal Listeria Test (Reveal), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30°C and environmental samples for 24-48 h at 30°C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there was a statistically significant difference in performance between the Reveal and reference culture [U.S. Food and Drug Administration's Bacteriological Analytical Manual (FDA/BAM) or U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS)] methods for only a single food in one trial (pasteurized crab meat) at the 27 h enrichment time point, with more positive results obtained with the FDA/BAM reference method. No foods showed statistically significant differences in method performance at the 30 h time point. Independent laboratory testing of 3 foods again produced a statistically significant difference in results for crab meat at the 27 h time point; otherwise results of the Reveal and reference methods were statistically equivalent. Overall, considering both internal and independent laboratory trials, sensitivity of the Reveal method relative to the reference culture procedures in testing of foods was 85.9% at 27 h and 97.1% at 30 h. Results from 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the Reveal method was more productive than the reference USDA-FSIS culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. monocytogenes confirmed the effectiveness of the Reveal method at the 24 h time point. Overall, sensitivity of the Reveal method at 24 h relative to that of the USDA-FSIS method was 153%. The Reveal method exhibited extremely high specificity, with only a single false-positive result in all trials combined for overall specificity of 99.5%.

  11. Comparison of hypertabastic survival model with other unimodal hazard rate functions using a goodness-of-fit test.

    PubMed

    Tahir, M Ramzan; Tran, Quang X; Nikulin, Mikhail S

    2017-05-30

    We studied the problem of testing a hypothesized distribution in survival regression models when the data is right censored and survival times are influenced by covariates. A modified chi-squared type test, known as Nikulin-Rao-Robson statistic, is applied for the comparison of accelerated failure time models. This statistic is used to test the goodness-of-fit for hypertabastic survival model and four other unimodal hazard rate functions. The results of simulation study showed that the hypertabastic distribution can be used as an alternative to log-logistic and log-normal distribution. In statistical modeling, because of its flexible shape of hazard functions, this distribution can also be used as a competitor of Birnbaum-Saunders and inverse Gaussian distributions. The results for the real data application are shown. Copyright © 2017 John Wiley & Sons, Ltd.

  12. Fisher's method of combining dependent statistics using generalizations of the gamma distribution with applications to genetic pleiotropic associations.

    PubMed

    Li, Qizhai; Hu, Jiyuan; Ding, Juan; Zheng, Gang

    2014-04-01

    A classical approach to combining independent test statistics is Fisher's combination of p-values, which follows the χ² distribution. When the test statistics are dependent, the gamma distribution (GD) is commonly used for the Fisher's combination test (FCT). We propose to use two generalizations of the GD: the generalized and the exponentiated GDs. We study some properties of mis-using the GD for the FCT to combine dependent statistics when one of the two proposed distributions is true. Our results show that both generalizations have better control of type I error rates than the GD, which tends to have inflated type I error rates at more extreme tails. In practice, common model selection criteria (e.g. Akaike information criterion/Bayesian information criterion) can be used to help select a better distribution to use for the FCT. A simple strategy for applying the two generalizations of the GD in genome-wide association studies is discussed. Applications of the results to genetic pleiotropic associations are described, where multiple traits are tested for association with a single marker.
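
    For reference, a sketch of the FCT itself and of a moment-matched gamma adjustment of the kind used for dependent statistics (the dependence adjustment below is a hypothetical value, not the paper's generalized or exponentiated GDs):

        import numpy as np
        from scipy import stats

        p = np.array([0.012, 0.04, 0.25, 0.08])   # p-values to be combined
        k = len(p)

        # Independent case: Fisher's statistic -2*sum(log p) ~ chi2 with 2k df.
        X = -2 * np.sum(np.log(p))
        p_fisher = stats.chi2.sf(X, df=2 * k)
        stat, p_check = stats.combine_pvalues(p, method="fisher")  # same thing

        # Dependent case (illustrative): match the first two moments of X to a
        # gamma distribution, inflating the variance to reflect dependence.
        extra_var = 3.0                      # hypothetical extra variance
        mean, var = 2 * k, 4 * k + extra_var
        shape_g, scale_g = mean**2 / var, var / mean
        p_gamma = stats.gamma.sf(X, a=shape_g, scale=scale_g)

        print(f"Fisher: X = {X:.2f}, p = {p_fisher:.4f} (check: {p_check:.4f})")
        print(f"Gamma-adjusted p = {p_gamma:.4f}")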

  13. A method for developing design diagrams for ceramic and glass materials using fatigue data

    NASA Technical Reports Server (NTRS)

    Heslin, T. M.; Magida, M. B.; Forrest, K. A.

    1986-01-01

    The service lifetime of glass and ceramic materials can be expressed as a plot of time-to-failure versus applied stress, parametric in percent probability of failure. This type of plot is called a design diagram. Confidence interval estimates for such plots depend on the type of test used to generate the data, on the assumptions made concerning the statistical distribution of the test results, and on the type of analysis used. This report outlines the development of design diagrams for glass and ceramic materials in engineering terms using static or dynamic fatigue tests, assuming either no particular statistical distribution of test results or a Weibull distribution, and using either median value or homologous ratio analysis of the test results.

  14. The chi-square test of independence.

    PubMed

    McHugh, Mary L

    2013-01-01

    The Chi-square statistic is a non-parametric (distribution free) tool designed to analyze group differences when the dependent variable is measured at a nominal level. Like all non-parametric statistics, the Chi-square is robust with respect to the distribution of the data. Specifically, it does not require equality of variances among the study groups or homoscedasticity in the data. It permits evaluation of both dichotomous independent variables, and of multiple group studies. Unlike many other non-parametric and some parametric statistics, the calculations needed to compute the Chi-square provide considerable information about how each of the groups performed in the study. This richness of detail allows the researcher to understand the results and thus to derive more detailed information from this statistic than from many others. The Chi-square is a significance statistic, and should be followed with a strength statistic. Cramer's V is the most common strength statistic used when a significant Chi-square result has been obtained. Advantages of the Chi-square include its robustness with respect to distribution of the data, its ease of computation, the detailed information that can be derived from the test, its use in studies for which parametric assumptions cannot be met, and its flexibility in handling data from both two group and multiple group studies. Limitations include its sample size requirements, difficulty of interpretation when there are large numbers of categories (20 or more) in the independent or dependent variables, and the tendency of Cramer's V to produce relatively low correlation measures, even for highly significant results.
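
    A short sketch of the recommended pairing, a Chi-square test of independence followed by Cramer's V as the strength statistic (hypothetical counts):

        import numpy as np
        from scipy import stats

        # 2 x 3 contingency table of counts (groups x nominal outcome).
        table = np.array([[30, 15, 5],
                          [20, 25, 10]])

        chi2, p, dof, expected = stats.chi2_contingency(table)

        # Cramer's V: strength of association to report alongside the test.
        n = table.sum()
        r, c = table.shape
        v = np.sqrt(chi2 / (n * (min(r, c) - 1)))

        print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.4f}, Cramer's V = {v:.2f}")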

  15. An entropy-based statistic for genomewide association studies.

    PubMed

    Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao

    2005-07-01

    Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard χ² statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard χ² statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard χ² statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard χ² statistic.
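
    Schematically, the nonlinearity comes from the entropy function itself; the sketch below (hypothetical haplotype frequencies, not the published statistic) shows the basic ingredient:

        import numpy as np

        def shannon_entropy(freqs):
            """Shannon entropy of a frequency distribution (natural log)."""
            freqs = np.asarray(freqs, dtype=float)
            freqs = freqs[freqs > 0]
            return -np.sum(freqs * np.log(freqs))

        # Hypothetical haplotype frequencies in cases and controls.
        cases    = np.array([0.50, 0.30, 0.15, 0.05])
        controls = np.array([0.40, 0.35, 0.20, 0.05])

        # Entropy is a nonlinear function of the frequencies, so modest
        # frequency shifts can produce amplified contrasts relative to a
        # linear (chi-square-style) comparison.
        print(shannon_entropy(cases), shannon_entropy(controls))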

  16. Predicting future protection of respirator users: Statistical approaches and practical implications.

    PubMed

    Hu, Chengcheng; Harber, Philip; Su, Jing

    2016-01-01

    The purpose of this article is to describe a statistical approach for predicting a respirator user's fit factor in the future based upon results from initial tests. A statistical prediction model was developed based upon the joint distribution of multiple fit factor measurements over time, obtained from linear mixed effect models. The model accounts for within-subject correlation as well as short-term (within one day) and longer-term variability. As an example of applying this approach, model parameters were estimated from a research study in which volunteers were trained by three different modalities to use one of two types of respirators. They underwent two quantitative fit tests at the initial session and two on the same day approximately six months later. The fitted models demonstrated correlation and gave the estimated distribution of future fit test results conditional on past results for an individual worker. This approach can be applied to establishing a criterion value for passing an initial fit test, to provide reasonable likelihood that a worker will be adequately protected in the future, and to optimizing the repeat fit factor test intervals individually for each user for cost-effective testing.
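
    A sketch of the modeling ingredient described above, fitting a random-intercept linear mixed model to repeated (log) fit factors so that a worker's past tests inform the predicted distribution of future ones; the data, variable names and effect sizes are all hypothetical:

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(3)
        workers = np.repeat(np.arange(40), 4)        # 40 workers, 4 tests each
        session = np.tile([0, 0, 1, 1], 40)          # day 0 and ~6 months later
        worker_effect = np.repeat(rng.normal(0, 0.3, 40), 4)
        log_ff = 2.2 - 0.1 * session + worker_effect + rng.normal(0, 0.2, 160)

        data = pd.DataFrame({"worker": workers, "session": session,
                             "log_ff": log_ff})

        # Random-intercept model: the within-worker correlation is what lets
        # past tests sharpen the prediction of a worker's future fit factor.
        model = smf.mixedlm("log_ff ~ session", data, groups=data["worker"])
        result = model.fit()
        print(result.summary())
        print(list(result.random_effects.items())[0])  # one worker's deviation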

  17. Statistical methods and errors in family medicine articles between 2010 and 2014-Suez Canal University, Egypt: A cross-sectional study

    PubMed Central

    Nour-Eldein, Hebatallah

    2016-01-01

    Background: Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. Objectives: To determine the statistical methods used and to assess the statistical errors in family medicine (FM) research articles published between 2010 and 2014. Methods: This was a cross-sectional study. All 66 FM research articles published over 5 years by FM authors with affiliation to Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of the FM articles. Results: Inferential methods were recorded in 62/66 (93.9%) of FM articles. Advanced analyses were used in 29/66 (43.9%). Contingency tables (38/66, 57.6%), regression (logistic, linear; 26/66, 39.4%), and the t-test (17/66, 25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, the deficiencies were: no prior sample size calculation, 19/60 (31.7%); application of the wrong statistical test, 17/60 (28.3%); incomplete documentation of statistics, 59/60 (98.3%); reporting P values without test statistics, 32/60 (53.3%); no confidence interval reported with effect size measures, 12/60 (20.0%); use of mean (standard deviation) to describe ordinal/non-normal data, 8/60 (13.3%); and errors of interpretation, mainly conclusions unsupported by the study data, 5/60 (8.3%). Conclusion: Inferential statistics were used in the majority of FM articles. Data analysis and reporting of statistics are areas for improvement in FM research articles. PMID:27453839

  18. Testing the statistical compatibility of independent data sets

    NASA Astrophysics Data System (ADS)

    Maltoni, M.; Schwetz, T.

    2003-08-01

    We discuss a goodness-of-fit method which tests the compatibility between statistically independent data sets. The method gives sensible results even in cases where the χ² minima of the individual data sets are very low or when several parameters are fitted to a large number of data points. In particular, it avoids the problem that a possible disagreement between data sets becomes diluted by data points which are insensitive to the crucial parameters. A formal derivation of the probability distribution function for the proposed test statistic is given, based on standard theorems of statistics. The application of the method is illustrated on data from neutrino oscillation experiments, and its complementarity to the standard goodness-of-fit is discussed.
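
    Numerically, the construction reduces to a few lines; a sketch with hypothetical χ² minima and parameter counts, where the statistic is the extra χ² paid for forcing all data sets to share common parameter values:

        from scipy import stats

        # chi-square minima from fitting each data set separately, and from
        # the joint fit of all sets together (hypothetical values).
        chi2_min_individual = [12.3, 20.1]
        chi2_min_joint = 38.9
        P_individual = [3, 3]     # parameters fitted to each set separately
        P_joint = 3               # parameters shared in the joint fit

        # Compatibility statistic: the penalty for forcing both data sets
        # to agree on common parameter values.
        chi2_pg = chi2_min_joint - sum(chi2_min_individual)
        dof = sum(P_individual) - P_joint
        p = stats.chi2.sf(chi2_pg, df=dof)
        print(f"chi2 = {chi2_pg:.1f} on {dof} dof, p = {p:.3f}")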

  19. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics

    PubMed Central

    Chen, Wenan; Larrabee, Beth R.; Ovsyannikova, Inna G.; Kennedy, Richard B.; Haralambieva, Iana H.; Poland, Gregory A.; Schaid, Daniel J.

    2015-01-01

    Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. PMID:25948564

  20. A close examination of double filtering with fold change and t test in microarray analysis

    PubMed Central

    2009-01-01

    Background Many researchers use the double filtering procedure with fold change and t test to identify differentially expressed genes, in the hope that the double filtering will provide extra confidence in the results. Due to its simplicity, the double filtering procedure has been popular with applied researchers despite the development of more sophisticated methods. Results This paper, for the first time to our knowledge, provides theoretical insight on the drawback of the double filtering procedure. We show that fold change assumes all genes to have a common variance while t statistic assumes gene-specific variances. The two statistics are based on contradicting assumptions. Under the assumption that gene variances arise from a mixture of a common variance and gene-specific variances, we develop the theoretically most powerful likelihood ratio test statistic. We further demonstrate that the posterior inference based on a Bayesian mixture model and the widely used significance analysis of microarrays (SAM) statistic are better approximations to the likelihood ratio test than the double filtering procedure. Conclusion We demonstrate through hypothesis testing theory, simulation studies and real data examples, that well constructed shrinkage testing methods, which can be united under the mixture gene variance assumption, can considerably outperform the double filtering procedure. PMID:19995439
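
    For concreteness, a sketch of the double filtering procedure under discussion, on synthetic two-group expression data:

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(4)
        genes, n = 1000, 6
        ctrl = rng.normal(0, 1, size=(genes, n))
        trt = rng.normal(0, 1, size=(genes, n))
        trt[:50] += 1.0                       # 50 truly changed genes

        log_fc = trt.mean(axis=1) - ctrl.mean(axis=1)   # log-scale fold change
        t, p = stats.ttest_ind(trt, ctrl, axis=1)

        # The double filter: require both a fold-change cut-off (common
        # variance view) and a t-test cut-off (gene-specific variance view).
        called = (np.abs(log_fc) > 1.0) & (p < 0.05)
        print(f"{called.sum()} genes pass the double filter, "
              f"{called[:50].sum()} of them truly changed")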

  1. Application of Statistics in Engineering Technology Programs

    ERIC Educational Resources Information Center

    Zhan, Wei; Fink, Rainer; Fang, Alex

    2010-01-01

    Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other fields in the engineering world. Traditionally, however, statistics is not extensively used in undergraduate engineering technology (ET) programs, resulting in a major disconnect from industry…

  2. Comparisons of false negative rates from a trend test alone and from a trend test jointly with a control-high groups pairwise test in the determination of the carcinogenicity of new drugs.

    PubMed

    Lin, Karl K; Rahman, Mohammad A

    2018-05-21

    Interest has been expressed in using a joint test procedure that requires the results of both a trend test and a pairwise comparison test between the control and high groups to be statistically significant simultaneously, at the levels of significance recommended for the separate tests in the FDA 2001 draft guidance for industry document, in order for the drug effect on the development of an individual tumor type to be considered statistically significant. The results of our simulation studies show a serious consequence of using this joint test procedure in the final interpretation of the carcinogenicity potential of a new drug: if the levels of significance recommended for the separate tests are used, large decreases in the false positive rate produce large inflations of the false negative rate. The inflation can be as high as 204.5% of the false negative rate obtained when the trend test alone is used to test whether the effect is statistically significant. To correct the problem, new sets of levels of significance have been developed for those who want to use the joint test in reviews of carcinogenicity studies.

  3. Probability of detection of internal voids in structural ceramics using microfocus radiography

    NASA Technical Reports Server (NTRS)

    Baaklini, G. Y.; Roth, D. J.

    1986-01-01

    The reliability of microfocus X-radiography for detecting subsurface voids in structural ceramic test specimens was statistically evaluated. The microfocus system was operated in the projection mode using low X-ray photon energies (20 keV) and a 10 μm focal spot. The statistics were developed for implanted subsurface voids in green and sintered silicon carbide and silicon nitride test specimens. These statistics were compared with previously obtained statistics for implanted surface voids in similar specimens. Problems associated with void implantation are discussed. Statistical results are given as probability-of-detection curves at a 95 percent confidence level for voids ranging in size from 20 to 528 μm in diameter.

  4. Probability of detection of internal voids in structural ceramics using microfocus radiography

    NASA Technical Reports Server (NTRS)

    Baaklini, G. Y.; Roth, D. J.

    1985-01-01

    The reliability of microfocus x-radiography for detecting subsurface voids in structural ceramic test specimens was statistically evaluated. The microfocus system was operated in the projection mode using low X-ray photon energies (20 keV) and a 10 μm focal spot. The statistics were developed for implanted subsurface voids in green and sintered silicon carbide and silicon nitride test specimens. These statistics were compared with previously-obtained statistics for implanted surface voids in similar specimens. Problems associated with void implantation are discussed. Statistical results are given as probability-of-detection curves at a 95 percent confidence level for voids ranging in size from 20 to 528 μm in diameter.

  5. Admixture, Population Structure, and F-Statistics.

    PubMed

    Peter, Benjamin M

    2016-04-01

    Many questions about human genetic history can be addressed by examining the patterns of shared genetic variation between sets of populations. A useful methodological framework for this purpose is F-statistics, which measure shared genetic drift between sets of two, three, and four populations and can be used to test simple and complex hypotheses about admixture between populations. This article provides context from phylogenetic and population genetic theory. I review how F-statistics can be interpreted as branch lengths or paths and derive new interpretations using coalescent theory. I further show that the admixture tests can be interpreted as testing general properties of phylogenies, allowing some of these applications to be extended to arbitrary phylogenetic trees. The new results are used to investigate the behavior of the statistics under different models of population structure and show how population substructure complicates inference. The results lead to simplified estimators in many cases, and I recommend replacing F3 with the average number of pairwise differences for estimating population divergence. Copyright © 2016 by the Genetics Society of America.

  6. Evidential Value That Exercise Improves BMI z-Score in Overweight and Obese Children and Adolescents

    PubMed Central

    Kelley, George A.; Kelley, Kristi S.

    2015-01-01

    Background. Given the cardiovascular disease (CVD) related importance of understanding the true effects of exercise on adiposity in overweight and obese children and adolescents, this study examined whether there is evidential value to rule out excessive and inappropriate reporting of statistically significant results, a major problem in the published literature, with respect to exercise-induced improvements in BMI z-score among overweight and obese children and adolescents. Methods. Using data from a previous meta-analysis of 10 published studies that included 835 overweight and obese children and adolescents, a novel, recently developed approach (p-curve) was used to test for evidential value and rule out selective reporting of findings. Chi-squared tests (χ²) were used to test for statistical significance with alpha (p) values <0.05 considered statistically significant. Results. Six of 10 findings (60%) were statistically significant. Statistically significant right-skew to rule out selective reporting was found (χ² = 38.8, p = 0.0001). Conversely, studies neither lacked evidential value (χ² = 6.8, p = 0.87) nor lacked evidential value and were intensely p-hacked (χ² = 4.3, p = 0.98). Conclusion. Evidential value results confirm that exercise reduces BMI z-score in overweight and obese children and adolescents, an important therapeutic strategy for treating and preventing CVD. PMID:26509145

  7. Evidential Value That Exercise Improves BMI z-Score in Overweight and Obese Children and Adolescents.

    PubMed

    Kelley, George A; Kelley, Kristi S

    2015-01-01

    Background. Given the cardiovascular disease (CVD) related importance of understanding the true effects of exercise on adiposity in overweight and obese children and adolescents, this study examined whether there is evidential value to rule out excessive and inappropriate reporting of statistically significant results, a major problem in the published literature, with respect to exercise-induced improvements in BMI z-score among overweight and obese children and adolescents. Methods. Using data from a previous meta-analysis of 10 published studies that included 835 overweight and obese children and adolescents, a novel, recently developed approach (p-curve) was used to test for evidential value and rule out selective reporting of findings. Chi-squared tests (χ²) were used to test for statistical significance with alpha (p) values <0.05 considered statistically significant. Results. Six of 10 findings (60%) were statistically significant. Statistically significant right-skew to rule out selective reporting was found (χ² = 38.8, p = 0.0001). Conversely, studies neither lacked evidential value (χ² = 6.8, p = 0.87) nor lacked evidential value and were intensely p-hacked (χ² = 4.3, p = 0.98). Conclusion. Evidential value results confirm that exercise reduces BMI z-score in overweight and obese children and adolescents, an important therapeutic strategy for treating and preventing CVD.
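
    The right-skew test at the heart of p-curve can be sketched with Fisher's method on the significant p-values (hypothetical values below): under a true null, p-values significant at 0.05 are uniform on (0, 0.05), so pp = p/0.05 is uniform on (0, 1), and right skew shows up as a large combined χ²:

        import numpy as np
        from scipy import stats

        # Hypothetical significant p-values from the studies in a meta-analysis.
        p_sig = np.array([0.001, 0.004, 0.012, 0.021, 0.033, 0.045])

        pp = p_sig / 0.05                      # uniform on (0,1) under the null
        chi2 = -2 * np.sum(np.log(pp))         # Fisher's method on the pp-values
        p_skew = stats.chi2.sf(chi2, df=2 * len(pp))
        print(f"chi2 = {chi2:.1f}, right-skew p = {p_skew:.4f}")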

  8. Ranking metrics in gene set enrichment analysis: do they matter?

    PubMed

    Zyla, Joanna; Marczyk, Michal; Weiner, January; Polanska, Joanna

    2017-05-12

    There exist many methods for describing the complex relation between changes of gene expression in molecular pathways or gene ontologies under different experimental conditions. Among them, Gene Set Enrichment Analysis seems to be one of the most commonly used (over 10,000 citations). An important parameter, which could affect the final result, is the choice of a metric for the ranking of genes. Applying a default ranking metric may lead to poor results. In this work 28 benchmark data sets were used to evaluate the sensitivity and false positive rate of gene set analysis for 16 different ranking metrics including new proposals. Furthermore, the robustness of the chosen methods to sample size was tested. Using k-means clustering algorithm a group of four metrics with the highest performance in terms of overall sensitivity, overall false positive rate and computational load was established i.e. absolute value of Moderated Welch Test statistic, Minimum Significant Difference, absolute value of Signal-To-Noise ratio and Baumgartner-Weiss-Schindler test statistic. In case of false positive rate estimation, all selected ranking metrics were robust with respect to sample size. In case of sensitivity, the absolute value of Moderated Welch Test statistic and absolute value of Signal-To-Noise ratio gave stable results, while Baumgartner-Weiss-Schindler and Minimum Significant Difference showed better results for larger sample size. Finally, the Gene Set Enrichment Analysis method with all tested ranking metrics was parallelised and implemented in MATLAB, and is available at https://github.com/ZAEDPolSl/MrGSEA . Choosing a ranking metric in Gene Set Enrichment Analysis has critical impact on results of pathway enrichment analysis. The absolute value of Moderated Welch Test has the best overall sensitivity and Minimum Significant Difference has the best overall specificity of gene set analysis. When the number of non-normally distributed genes is high, using Baumgartner-Weiss-Schindler test statistic gives better outcomes. Also, it finds more enriched pathways than other tested metrics, which may induce new biological discoveries.
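
    Two of the simpler ranking metrics compared in the study can be sketched directly (synthetic data; plain Welch t rather than the moderated variant):

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(5)
        expr_a = rng.normal(0, 1, size=(200, 8))   # 200 genes x 8 samples, group A
        expr_b = rng.normal(0, 1, size=(200, 8))   # group B

        # Absolute signal-to-noise ratio...
        s2n = np.abs(expr_a.mean(1) - expr_b.mean(1)) / (
            expr_a.std(1, ddof=1) + expr_b.std(1, ddof=1))

        # ...and the absolute Welch t-statistic.
        t, _ = stats.ttest_ind(expr_a, expr_b, axis=1, equal_var=False)
        welch = np.abs(t)

        # Different metrics can order the genes differently, which is the
        # point of the comparison.
        print(np.argsort(-s2n)[:5], np.argsort(-welch)[:5])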

  9. [A retrospective study on the assessment of dysphagia after partial laryngectomy].

    PubMed

    Su, T T; Sun, Z F

    2017-11-07

    Objective: To retrospectively investigate the long-term swallowing function of patients with laryngeal carcinoma who underwent partial laryngectomy, discuss the effectiveness and reliability of the Kubota drinking test in the assessment of dysphagia after partial laryngectomy, and analyze the influence of different modes of operation on swallowing function. Methods: Clinical data were retrospectively analyzed for 83 patients with laryngeal carcinoma who underwent partial laryngectomy between September 2012 and August 2015. A questionnaire survey, the Kubota drinking test and a video fluoroscopic swallowing study (VFSS) were conducted for patients during a scheduled interview. Patients were grouped in two ways: whether the epiglottis was retained, and whether one or both arytenoids were reserved. The influence of the different surgical techniques on swallowing function was analyzed according to the results of the Kubota drinking test. The agreement and reliability of the Kubota drinking test were statistically analyzed with VFSS treated as the gold standard. SPSS 23.0 software was used to analyze the data. Results: Questionnaire results revealed that among the 83 patients who underwent partial laryngectomy, 32.53% suffered from eating disorder and 43.37% experienced painful swallowing. The incidence of dysphagia was 40.96% according to the results of the Kubota drinking test. There was a statistical difference between the group with the epiglottis retained and that with the epiglottis removed in terms of the absence and severity of dysphagia; the statistical values for normal, moderate and severe dysphagia were 18.160, 7.229 and 12.344 (P<0.05). Statistical differences also existed between the groups with one and with both arytenoids reserved in terms of the absence of dysphagia and dysphagia of intermediate severity; the statistical values were 4.790 and 9.110 (P<0.05). A certain degree of agreement and reliability was present between the results of the Kubota drinking test and VFSS (Kappa=0.551, r=0.810). Conclusions: Reserving the epiglottis and arytenoids is of considerable significance for the retention of swallowing function in patients after partial laryngectomy. There is a certain degree of agreement and reliability between the results of the Kubota drinking test and VFSS; the test, therefore, could be used as a tool for screening patients suffering from dysphagia after partial laryngectomy.

  10. Design of durability test protocol for vehicular fuel cell systems operated in power-follow mode based on statistical results of on-road data

    NASA Astrophysics Data System (ADS)

    Xu, Liangfei; Reimer, Uwe; Li, Jianqiu; Huang, Haiyan; Hu, Zunyan; Jiang, Hongliang; Janßen, Holger; Ouyang, Minggao; Lehnert, Werner

    2018-02-01

    City buses using polymer electrolyte membrane (PEM) fuel cells are considered to be the most likely fuel cell vehicles to be commercialized in China. The technical specifications of the fuel cell systems (FCSs) these buses are equipped with will differ based on the powertrain configurations and vehicle control strategies, but can generally be classified into the power-follow and soft-run modes. Each mode imposes different levels of electrochemical stress on the fuel cells. Evaluating the aging behavior of fuel cell stacks under the conditions encountered in fuel cell buses requires new durability test protocols based on statistical results obtained during actual driving tests. In this study, we propose a systematic design method for fuel cell durability test protocols that correspond to the power-follow mode based on three parameters for different fuel cell load ranges. The powertrain configurations and control strategy are described herein, followed by a presentation of the statistical data for the duty cycles of FCSs in one city bus in the demonstration project. Assessment protocols are presented based on the statistical results using mathematical optimization methods, and are compared to existing protocols with respect to common factors, such as time at open circuit voltage and root-mean-square power.

  11. Statistical Power in Evaluations That Investigate Effects on Multiple Outcomes: A Guide for Researchers

    ERIC Educational Resources Information Center

    Porter, Kristin E.

    2016-01-01

    In education research and in many other fields, researchers are often interested in testing the effectiveness of an intervention on multiple outcomes, for multiple subgroups, at multiple points in time, or across multiple treatment groups. The resulting multiplicity of statistical hypothesis tests can lead to spurious findings of effects. Multiple…

  12. Value Added Productivity Indicators: A Statistical Comparison of the Pre-Test/Post-Test Model and Gain Model.

    ERIC Educational Resources Information Center

    Weerasinghe, Dash; Orsak, Timothy; Mendro, Robert

    In an age of student accountability, public school systems must find procedures for identifying effective schools, classrooms, and teachers that help students continue to learn academically. As a result, researchers have been modeling schools and classrooms to calculate productivity indicators that will withstand not only statistical review but…

  13. Student Achievement in Undergraduate Statistics: The Potential Value of Allowing Failure

    ERIC Educational Resources Information Center

    Ferrandino, Joseph A.

    2016-01-01

    This article details what resulted when I re-designed my undergraduate statistics course to allow failure as a learning strategy and focused on achievement rather than performance. A variety of within- and between-sample t-tests are utilized to determine the impact of unlimited test and quiz opportunities on student learning on both quizzes and…

  14. Combination of statistical and physically based methods to assess shallow slide susceptibility at the basin scale

    NASA Astrophysics Data System (ADS)

    Oliveira, Sérgio C.; Zêzere, José L.; Lajas, Sara; Melo, Raquel

    2017-07-01

    Approaches used to assess shallow slide susceptibility at the basin scale are conceptually different depending on the use of statistical or physically based methods. The former are based on the assumption that the same causes are more likely to produce the same effects, whereas the latter are based on the comparison between forces which tend to promote movement along the slope and the counteracting forces that are resistant to motion. Within this general framework, this work tests two hypotheses: (i) although conceptually and methodologically distinct, the statistical and deterministic methods generate similar shallow slide susceptibility results regarding the model's predictive capacity and spatial agreement; and (ii) the combination of shallow slide susceptibility maps obtained with statistical and physically based methods, for the same study area, generate a more reliable susceptibility model for shallow slide occurrence. These hypotheses were tested at a small test site (13.9 km2) located north of Lisbon (Portugal), using a statistical method (the information value method, IV) and a physically based method (the infinite slope method, IS). The landslide susceptibility maps produced with the statistical and deterministic methods were combined into a new landslide susceptibility map. The latter was based on a set of integration rules defined by the cross tabulation of the susceptibility classes of both maps and analysis of the corresponding contingency tables. The results demonstrate a higher predictive capacity of the new shallow slide susceptibility map, which combines the independent results obtained with statistical and physically based models. Moreover, the combination of the two models allowed the identification of areas where the results of the information value and the infinite slope methods are contradictory. Thus, these areas were classified as uncertain and deserve additional investigation at a more detailed scale.
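
    A minimal sketch of the information value (IV) method named above, with hypothetical terrain-class areas; classes with positive scores are more landslide-prone than the study-area average:

        import numpy as np

        # Hypothetical terrain classes: total area of each class (km^2) and
        # the shallow-slide-affected area within it (km^2).
        class_area = np.array([4.0, 6.0, 2.5, 1.4])
        slide_area = np.array([0.10, 0.45, 0.30, 0.02])

        # Information value of class i: log of the landslide density in the
        # class relative to the density over the whole study area.
        overall_density = slide_area.sum() / class_area.sum()
        iv = np.log((slide_area / class_area) / overall_density)
        print(np.round(iv, 2))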

  15. Explorations in Statistics: Permutation Methods

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2012-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This eighth installment of "Explorations in Statistics" explores permutation methods, empiric procedures we can use to assess an experimental result--to test a null hypothesis--when we are reluctant to trust statistical…
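
    The core of a permutation test is short enough to sketch: shuffle the group labels, recompute the statistic, and take the proportion of shuffles at least as extreme as the observed value as the empiric p-value. An illustrative two-sample version (hypothetical data):

        import numpy as np

        def permutation_test(x, y, n_perm=10000, seed=1):
            """Two-sided permutation test for a difference in group means."""
            rng = np.random.default_rng(seed)
            pooled = np.concatenate([x, y])
            observed = x.mean() - y.mean()
            hits = 0
            for _ in range(n_perm):
                rng.shuffle(pooled)
                diff = pooled[:len(x)].mean() - pooled[len(x):].mean()
                hits += abs(diff) >= abs(observed)
            return hits / n_perm

        x = np.array([4.2, 5.1, 6.3, 5.8, 4.9])
        y = np.array([3.1, 4.0, 3.7, 4.4, 3.5])
        print(permutation_test(x, y))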

  16. Differences in Performance Among Test Statistics for Assessing Phylogenomic Model Adequacy.

    PubMed

    Duchêne, David A; Duchêne, Sebastian; Ho, Simon Y W

    2018-05-18

    Statistical phylogenetic analyses of genomic data depend on models of nucleotide or amino acid substitution. The adequacy of these substitution models can be assessed using a number of test statistics, allowing the model to be rejected when it is found to provide a poor description of the evolutionary process. A potentially valuable use of model-adequacy test statistics is to identify when data sets are likely to produce unreliable phylogenetic estimates, but their differences in performance are rarely explored. We performed a comprehensive simulation study to identify test statistics that are sensitive to some of the most commonly cited sources of phylogenetic estimation error. Our results show that, for many test statistics, traditional thresholds for assessing model adequacy can fail to reject the model when the phylogenetic inferences are inaccurate and imprecise. This is particularly problematic when analysing loci that have few variable informative sites. We propose new thresholds for assessing substitution model adequacy and demonstrate their effectiveness in analyses of three phylogenomic data sets. These thresholds lead to frequent rejection of the model for loci that yield topological inferences that are imprecise and are likely to be inaccurate. We also propose the use of a summary statistic that provides a practical assessment of overall model adequacy. Our approach offers a promising means of enhancing model choice in genome-scale data sets, potentially leading to improvements in the reliability of phylogenomic inference.

  17. Effects of Long-Term Thermal Exposure on Commercially Pure Titanium Grade 2 Elevated-Temperature Tensile Properties

    NASA Technical Reports Server (NTRS)

    Ellis, David L.

    2012-01-01

    Elevated-temperature tensile testing of commercially pure titanium (CP Ti) Grade 2 was conducted for as-received commercially produced sheet and following thermal exposure at 550 and 650 K (531 and 711 F) for times up to 5000 h. The tensile testing revealed some statistical differences between the 11 thermal treatments, but most thermal treatments were statistically equivalent. Previous data from room temperature tensile testing was combined with the new data to allow regression and development of mathematical models relating tensile properties to temperature and thermal exposure. The results indicate that thermal exposure temperature has a very small effect, whereas the thermal exposure duration has no statistically significant effects on the tensile properties. These results indicate that CP Ti Grade 2 will be thermally stable and suitable for long-duration space missions.

  18. Upward Flame Propagation and Wire Insulation Flammability: 2006 Round Robin Data Analysis

    NASA Technical Reports Server (NTRS)

    Hirsch, David B.

    2007-01-01

This viewgraph document reviews test results from tests of different materials used for wire insulation with respect to flame propagation and flammability. The presentation focused on investigating data variability both within and between laboratories. It evaluated between-laboratory consistency through the consistency statistic h, which indicates how one laboratory's cell average compares with the averages from other labs, and within-laboratory consistency through the consistency statistic k, an indicator of how one laboratory's within-laboratory variability compares with the combined variability of the other labs. Extreme results were tested to determine whether they arose by chance or from nonrandom causes (human error, instrument calibration shift, non-adherence to procedures, etc.).
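
    The h and k statistics described here (Mandel's consistency statistics, as standardized in ASTM E691) are simple to compute from per-laboratory replicates. A sketch with hypothetical lab data:

        import numpy as np

        def mandel_h_k(lab_results):
            """Mandel's between-lab (h) and within-lab (k) consistency
            statistics for a list of per-laboratory replicate arrays."""
            means = np.array([np.mean(r) for r in lab_results])
            sds = np.array([np.std(r, ddof=1) for r in lab_results])
            h = (means - means.mean()) / means.std(ddof=1)
            k = sds / np.sqrt(np.mean(sds ** 2))  # pooled repeatability SD
            return h, k

        labs = [np.array([10.1, 10.3, 9.9]),   # hypothetical flammability results
                np.array([11.0, 11.2, 10.8]),
                np.array([9.5, 9.4, 9.8])]
        h, k = mandel_h_k(labs)
        print("h:", h)
        print("k:", k)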

  19. Detecting Genomic Clustering of Risk Variants from Sequence Data: Cases vs. Controls

    PubMed Central

    Schaid, Daniel J.; Sinnwell, Jason P.; McDonnell, Shannon K.; Thibodeau, Stephen N.

    2013-01-01

As the ability to measure dense genetic markers approaches the limit of the DNA sequence itself, taking advantage of possible clustering of genetic variants in, and around, a gene would benefit genetic association analyses, and likely provide biological insights. The greatest benefit might be realized when multiple rare variants cluster in a functional region. Several statistical tests have been developed, one of which is based on the popular Kulldorff scan statistic for spatial clustering of disease. We extended another popular spatial clustering method – Tango's statistic – to genomic sequence data. An advantage of Tango's method is that it is rapid to compute, and when a single test statistic is computed, its distribution is well approximated by a scaled chi-square distribution, making computation of p-values very rapid. We compared the Type-I error rates and power of several clustering statistics, as well as the omnibus sequence kernel association test (SKAT). Although our version of Tango's statistic, which we call the “Kernel Distance” statistic, took approximately half the time to compute relative to the Kulldorff scan statistic, it had slightly less power than the scan statistic. Our results showed that the Ionita-Laza version of Kulldorff's scan statistic had the greatest power over a range of clustering scenarios. PMID:23842950

  20. Statistical procedures for evaluating daily and monthly hydrologic model predictions

    USGS Publications Warehouse

    Coffey, M.E.; Workman, S.R.; Taraba, J.L.; Fogle, A.W.

    2004-01-01

The overall study objective was to evaluate the applicability of different qualitative and quantitative methods for comparing daily and monthly SWAT computer model hydrologic streamflow predictions to observed data, and to recommend statistical methods for use in future model evaluations. Statistical methods were tested using daily streamflows and monthly equivalent runoff depths. The statistical techniques included linear regression, Nash-Sutcliffe efficiency, nonparametric tests, t-test, objective functions, autocorrelation, and cross-correlation. None of the methods specifically applied to the non-normal distribution and dependence between data points for the daily predicted and observed data. Of the tested methods, median objective functions, sign test, autocorrelation, and cross-correlation were most applicable for the daily data. The robust coefficient of determination (CD*) and robust modeling efficiency (EF*) objective functions were the preferred methods for daily model results due to the ease of comparing these values with a fixed ideal reference value of one. Predicted and observed monthly totals were more normally distributed, and there was less dependence between individual monthly totals than was observed for the corresponding predicted and observed daily values. More statistical methods were available for comparing SWAT model-predicted and observed monthly totals. The 1995 monthly SWAT model predictions and observed data had a regression R² of 0.70, a Nash-Sutcliffe efficiency of 0.41, and the t-test failed to reject the equal data means hypothesis. The Nash-Sutcliffe coefficient and the R² coefficient were the preferred methods for monthly results due to the ability to compare these coefficients to a set ideal value of one.
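
    The Nash-Sutcliffe efficiency preferred above compares the residual variance of the predictions to the variance of the observations, so a perfect model scores one. A minimal sketch (hypothetical flows):

        import numpy as np

        def nash_sutcliffe(observed, predicted):
            """NSE = 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)."""
            obs, pred = np.asarray(observed), np.asarray(predicted)
            return 1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2)

        obs = np.array([1.2, 3.4, 2.2, 5.1, 4.0])  # hypothetical monthly runoff depths
        sim = np.array([1.0, 3.1, 2.5, 4.6, 4.2])
        print(round(nash_sutcliffe(obs, sim), 3))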

  1. Using the Bootstrap Method to Evaluate the Critical Range of Misfit for Polytomous Rasch Fit Statistics.

    PubMed

    Seol, Hyunsoo

    2016-06-01

    The purpose of this study was to apply the bootstrap procedure to evaluate how the bootstrapped confidence intervals (CIs) for polytomous Rasch fit statistics might differ according to sample sizes and test lengths in comparison with the rule-of-thumb critical value of misfit. A total of 25 simulated data sets were generated to fit the Rasch measurement and then a total of 1,000 replications were conducted to compute the bootstrapped CIs under each of 25 testing conditions. The results showed that rule-of-thumb critical values for assessing the magnitude of misfit were not applicable because the infit and outfit mean square error statistics showed different magnitudes of variability over testing conditions and the standardized fit statistics did not exactly follow the standard normal distribution. Further, they also do not share the same critical range for the item and person misfit. Based on the results of the study, the bootstrapped CIs can be used to identify misfitting items or persons as they offer a reasonable alternative solution, especially when the distributions of the infit and outfit statistics are not well known and depend on sample size. © The Author(s) 2016.
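
    The percentile-bootstrap idea underlying such intervals is generic: resample the data with replacement, recompute the fit statistic for each replicate, and read the interval off the empirical percentiles. A sketch for an arbitrary statistic (not the author's Rasch-specific code):

        import numpy as np

        def bootstrap_ci(data, statistic, n_boot=1000, level=0.95, seed=7):
            """Percentile bootstrap confidence interval for a 1-D sample."""
            rng = np.random.default_rng(seed)
            reps = np.array([statistic(rng.choice(data, size=len(data)))
                             for _ in range(n_boot)])
            tail = (1.0 - level) / 2.0 * 100.0
            return tuple(np.percentile(reps, [tail, 100.0 - tail]))

        # Stand-in sample playing the role of item fit statistics
        sample = np.random.default_rng(3).normal(1.0, 0.2, size=200)
        print(bootstrap_ci(sample, np.mean))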

  2. Fine Mapping Causal Variants with an Approximate Bayesian Method Using Marginal Test Statistics.

    PubMed

    Chen, Wenan; Larrabee, Beth R; Ovsyannikova, Inna G; Kennedy, Richard B; Haralambieva, Iana H; Poland, Gregory A; Schaid, Daniel J

    2015-07-01

    Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf. Copyright © 2015 by the Genetics Society of America.

  3. Variability-aware compact modeling and statistical circuit validation on SRAM test array

    NASA Astrophysics Data System (ADS)

    Qiao, Ying; Spanos, Costas J.

    2016-03-01

Variability modeling at the compact transistor model level can enable statistically optimized designs in view of limitations imposed by the fabrication technology. In this work we propose a variability-aware compact model characterization methodology based on stepwise parameter selection. Transistor I-V measurements are obtained from a bit-transistor-accessible SRAM test array fabricated using a collaborating foundry's 28 nm FDSOI technology. Our in-house customized Monte Carlo simulation bench can incorporate these statistical compact models, and simulation results on SRAM writability performance closely match measurements in distribution estimation. Our proposed statistical compact model parameter extraction methodology also has the potential of predicting non-Gaussian behavior in statistical circuit performances through mixtures of Gaussian distributions.

  4. Evaluation of EMIT and RIA high volume test procedures for THC metabolites in urine utilizing GC/MS confirmation.

    PubMed

    Abercrombie, M L; Jewell, J S

    1986-01-01

    Results of EMIT, Abuscreen RIA, and GC/MS tests for THC metabolites in a high volume random urinalysis program are compared. Samples were field tested by non-laboratory personnel with an EMIT system using a 100 ng/mL cutoff. Samples were then sent to the Army Forensic Toxicology Drug Testing Laboratory (WRAMC) at Fort Meade, Maryland, where they were tested by RIA (Abuscreen) using a statistical 100 ng/mL cutoff. Confirmations of all RIA positives were accomplished using a GC/MS procedure. EMIT and RIA results agreed for 91% of samples. Data indicated a 4% false positive rate and a 10% false negative rate for EMIT field testing. In a related study, results for samples which tested positive by RIA for THC metabolites using a statistical 100 ng/mL cutoff were compared with results by GC/MS utilizing a 20 ng/mL cutoff for the THCA metabolite. Presence of THCA metabolite was detected in 99.7% of RIA positive samples. No relationship between quantitations determined by the two tests was found.

  5. Langley Wind Tunnel Data Quality Assurance-Check Standard Results

    NASA Technical Reports Server (NTRS)

    Hemsch, Michael J.; Grubb, John P.; Krieger, William B.; Cler, Daniel L.

    2000-01-01

A framework for statistical evaluation, control, and improvement of wind tunnel measurement processes is presented. The methodology is adapted from elements of the Measurement Assurance Plans developed by the National Bureau of Standards (now the National Institute of Standards and Technology) for standards and calibration laboratories. The present methodology is based on the notions of statistical quality control (SQC) together with check standard testing and a small number of customer repeat-run sets. The results of check standard and customer repeat-run sets are analyzed using the statistical control chart methods of Walter A. Shewhart, long familiar to the SQC community. Control chart results are presented for various measurement processes in five facilities at Langley Research Center. The processes include test section calibration, force and moment measurements with a balance, and instrument calibration.
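
    The Shewhart individuals chart at the heart of this approach flags a measurement process as out of statistical control when a check-standard result falls outside three-sigma limits, with sigma usually estimated from the average moving range. An illustrative sketch (hypothetical repeat-run data, not the Langley implementation):

        import numpy as np

        def shewhart_limits(x):
            """Center line and 3-sigma limits for an individuals chart;
            sigma is estimated as (mean moving range) / d2, d2 = 1.128."""
            x = np.asarray(x, dtype=float)
            sigma = np.mean(np.abs(np.diff(x))) / 1.128
            center = x.mean()
            return center - 3 * sigma, center, center + 3 * sigma

        # Hypothetical check-standard drag coefficients from repeat runs
        runs = [0.0251, 0.0249, 0.0252, 0.0250, 0.0248, 0.0253, 0.0250]
        print(shewhart_limits(runs))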

  6. mvp - an open-source preprocessor for cleaning duplicate records and missing values in mass spectrometry data.

    PubMed

    Lee, Geunho; Lee, Hyun Beom; Jung, Byung Hwa; Nam, Hojung

    2017-07-01

    Mass spectrometry (MS) data are used to analyze biological phenomena based on chemical species. However, these data often contain unexpected duplicate records and missing values due to technical or biological factors. These 'dirty data' problems increase the difficulty of performing MS analyses because they lead to performance degradation when statistical or machine-learning tests are applied to the data. Thus, we have developed missing values preprocessor (mvp), an open-source software for preprocessing data that might include duplicate records and missing values. mvp uses the property of MS data in which identical chemical species present the same or similar values for key identifiers, such as the mass-to-charge ratio and intensity signal, and forms cliques via graph theory to process dirty data. We evaluated the validity of the mvp process via quantitative and qualitative analyses and compared the results from a statistical test that analyzed the original and mvp-applied data. This analysis showed that using mvp reduces problems associated with duplicate records and missing values. We also examined the effects of using unprocessed data in statistical tests and examined the improved statistical test results obtained with data preprocessed using mvp.

  7. Trends in statistical methods in articles published in Archives of Plastic Surgery between 2012 and 2017.

    PubMed

    Han, Kyunghwa; Jung, Inkyung

    2018-05-01

This review article presents an assessment of trends in statistical methods and an evaluation of their appropriateness in articles published in the Archives of Plastic Surgery (APS) from 2012 to 2017. We reviewed 388 original articles published in APS between 2012 and 2017. We categorized the articles that used statistical methods according to the type of statistical method, the number of statistical methods, and the type of statistical software used. We checked whether there were errors in the description of statistical methods and results. A total of 230 articles (59.3%) published in APS between 2012 and 2017 used one or more statistical methods. Within these articles, there were 261 applications of statistical methods with continuous or ordinal outcomes, and 139 applications of statistical methods with categorical outcomes. The Pearson chi-square test (17.4%) and the Mann-Whitney U test (14.4%) were the most frequently used methods. Errors in describing statistical methods and results were found in 133 of the 230 articles (57.8%). Inadequate description of P-values was the most common error (39.1%). Among the 230 articles that used statistical methods, 71.7% provided details about the statistical software programs used for the analyses. SPSS was predominantly used in the articles that presented statistical analyses. We found that the use of statistical methods in APS has increased over the last 6 years. It seems that researchers have been paying more attention to the proper use of statistics in recent years. It is expected that these positive trends will continue in APS.

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Labby, Z.

Physicists are often expected to have a solid grounding in experimental design and statistical analysis, sometimes filling in when biostatisticians or other experts are not available for consultation. Unfortunately, graduate education on these topics is seldom emphasized and few opportunities for continuing education exist. Clinical physicists incorporate new technology and methods into their practice based on published literature. A poor understanding of experimental design and analysis could result in inappropriate use of new techniques. Clinical physicists also improve current practice through quality initiatives that require sound experimental design and analysis. Academic physicists with a poor understanding of design and analysis may produce ambiguous (or misleading) results. This can result in unnecessary rewrites, publication rejection, and experimental redesign (wasting time, money, and effort). This symposium will provide a practical review of error and uncertainty, common study designs, and statistical tests. Instruction will primarily focus on practical implementation through examples and answer questions such as: where would you typically apply the test/design and where is the test/design typically misapplied (i.e., common pitfalls)? An analysis of error and uncertainty will also be explored using biological studies and associated modeling as a specific use case. Learning Objectives: (1) understand common experimental testing and clinical trial designs, what questions they can answer, and how to interpret the results; (2) determine where specific statistical tests are appropriate and identify common pitfalls; (3) understand how uncertainty and error are addressed in biological testing and associated biological modeling.

  9. First Results from the Light and Spectroscopy Concept Inventory

    ERIC Educational Resources Information Center

    Bardar, Erin M.

    2008-01-01

    This article presents results from a two-semester field test of the Light and Spectroscopy Concept Inventory (LSCI). Statistical analysis indicates that the LSCI has the sensitivity to measure statistically significant changes in students' understanding of light-related topics due to instruction in introductory astronomy courses and to distinguish…

  10. Computer Administering of the Psychological Investigations: Set-Relational Representation

    NASA Astrophysics Data System (ADS)

    Yordzhev, Krasimir

    Computer administering of a psychological investigation is the computer representation of the entire procedure of psychological assessments - test construction, test implementation, results evaluation, storage and maintenance of the developed database, its statistical processing, analysis and interpretation. A mathematical description of psychological assessment with the aid of personality tests is discussed in this article. The set theory and the relational algebra are used in this description. A relational model of data, needed to design a computer system for automation of certain psychological assessments is given. Some finite sets and relation on them, which are necessary for creating a personality psychological test, are described. The described model could be used to develop real software for computer administering of any psychological test and there is full automation of the whole process: test construction, test implementation, result evaluation, storage of the developed database, statistical implementation, analysis and interpretation. A software project for computer administering personality psychological tests is suggested.

  11. The use of imputed sibling genotypes in sibship-based association analysis: on modeling alternatives, power and model misspecification.

    PubMed

    Minică, Camelia C; Dolan, Conor V; Hottenga, Jouke-Jan; Willemsen, Gonneke; Vink, Jacqueline M; Boomsma, Dorret I

    2013-05-01

When phenotypic, but no genotypic data are available for relatives of participants in genetic association studies, previous research has shown that family-based imputed genotypes can boost the statistical power when included in such studies. Here, using simulations, we compared the performance of two statistical approaches suitable to model imputed genotype data: the mixture approach, which involves the full distribution of the imputed genotypes, and the dosage approach, where the mean of the conditional distribution features as the imputed genotype. Simulations were run by varying sibship size, size of the phenotypic correlations among siblings, imputation accuracy and minor allele frequency of the causal SNP. Furthermore, as imputing sibling data and extending the model to include sibships of size two or greater requires modeling the familial covariance matrix, we inquired whether model misspecification affects power. Finally, the results obtained via simulations were empirically verified in two datasets with continuous phenotype data (height) and with a dichotomous phenotype (smoking initiation). Across the settings considered, the mixture and the dosage approach are equally powerful and both produce unbiased parameter estimates. In addition, the likelihood-ratio test in the linear mixed model appears to be robust to the considered misspecification in the background covariance structure, given low to moderate phenotypic correlations among siblings. Empirical results show that the inclusion in association analysis of imputed sibling genotypes does not always result in a larger test statistic. The actual test statistic may drop in value due to small effect sizes. That is, if the power benefit is small, i.e., if the shift in the distribution of the test statistic under the alternative is relatively small, the probability of obtaining a smaller test statistic is greater. As the genetic effects are typically hypothesized to be small, in practice, the decision on whether family-based imputation could be used as a means to increase power should be informed by prior power calculations and by the consideration of the background correlation.

  12. A Statistical Approach to Establishing Subsystem Environmental Test Specifications

    NASA Technical Reports Server (NTRS)

    Keegan, W. B.

    1974-01-01

    Results are presented of a research task to evaluate structural responses at various subsystem mounting locations during spacecraft level test exposures to the environments of mechanical shock, acoustic noise, and random vibration. This statistical evaluation is presented in the form of recommended subsystem test specifications for these three environments as normalized to a reference set of spacecraft test levels and are thus suitable for extrapolation to a set of different spacecraft test levels. The recommendations are dependent upon a subsystem's mounting location in a spacecraft, and information is presented on how to determine this mounting zone for a given subsystem.

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ade, P. A. R.; Aghanim, N.; Akrami, Y.

In this paper, we test the statistical isotropy and Gaussianity of the cosmic microwave background (CMB) anisotropies using observations made by the Planck satellite. Our results are based mainly on the full Planck mission for temperature, but also include some polarization measurements. In particular, we consider the CMB anisotropy maps derived from the multi-frequency Planck data by several component-separation methods. For the temperature anisotropies, we find excellent agreement between results based on these sky maps over both a very large fraction of the sky and a broad range of angular scales, establishing that potential foreground residuals do not affect our studies. Tests of skewness, kurtosis, multi-normality, N-point functions, and Minkowski functionals indicate consistency with Gaussianity, while a power deficit at large angular scales is manifested in several ways, for example low map variance. The results of a peak statistics analysis are consistent with the expectations of a Gaussian random field. The “Cold Spot” is detected with several methods, including map kurtosis, peak statistics, and mean temperature profile. We thoroughly probe the large-scale dipolar power asymmetry, detecting it with several independent tests, and address the subject of a posteriori correction. Tests of directionality suggest the presence of angular clustering from large to small scales, but at a significance that is dependent on the details of the approach. We perform the first examination of polarization data, finding the morphology of stacked peaks to be consistent with the expectations of statistically isotropic simulations. Finally, where they overlap, these results are consistent with the Planck 2013 analysis based on the nominal mission data and provide our most thorough view of the statistics of the CMB fluctuations to date.

  14. Statistical test for ΔρDCCA cross-correlation coefficient

    NASA Astrophysics Data System (ADS)

    Guedes, E. F.; Brito, A. A.; Oliveira Filho, F. M.; Fernandez, B. F.; de Castro, A. P. N.; da Silva Filho, A. M.; Zebende, G. F.

    2018-07-01

In this paper we propose a new statistical test for ΔρDCCA, the Detrended Cross-Correlation Coefficient Difference, a tool to measure contagion/interdependence effects in time series of size N at different time scales n. To support this proposal we analyzed simulated and real time series. The results showed that the statistical significance of ΔρDCCA depends on the size N and the time scale n, and we can define critical values for this dependency at the 90%, 95%, and 99% confidence levels, as will be shown in this paper.

  15. Cosmic shear measurements with Dark Energy Survey Science Verification data

    DOE PAGES

    Becker, M. R.

    2016-07-06

Here, we present measurements of weak gravitational lensing cosmic shear two-point statistics using Dark Energy Survey Science Verification data. We demonstrate that our results are robust to the choice of shear measurement pipeline, either ngmix or im3shape, and robust to the choice of two-point statistic, including both real and Fourier-space statistics. Our results pass a suite of null tests including tests for B-mode contamination and direct tests for any dependence of the two-point functions on a set of 16 observing conditions and galaxy properties, such as seeing, airmass, galaxy color, galaxy magnitude, etc. We use a large suite of simulations to compute the covariance matrix of the cosmic shear measurements and assign statistical significance to our null tests. We find that our covariance matrix is consistent with the halo model prediction, indicating that it has the appropriate level of halo sample variance. We also compare the same jackknife procedure applied to the data and the simulations in order to search for additional sources of noise not captured by the simulations. We find no statistically significant extra sources of noise in the data. The overall detection significance with tomography for our highest source density catalog is 9.7σ. Cosmological constraints from the measurements in this work are presented in a companion paper.

  16. Statistics in three biomedical journals.

    PubMed

    Pilcík, T

    2003-01-01

In this paper we analyze the use of statistics, and the problems associated with it, in three Czech biological journals in the year 2000. We investigated 23 articles in Folia Biologica, 60 articles in Folia Microbiologica, and 88 articles in Physiological Research. Among publications with statistical content, descriptive statistics and the t-test were used most frequently. The most common mistakes were the absence of any reference to the statistical software used and insufficient description of the data. We compared our results with the results of similar studies of other medical journals. The use of the important statistical methods is comparable with that in most medical journals, and the proportion of articles in which the applied method is described insufficiently is moderately low.

  17. Detection of nonlinear transfer functions by the use of Gaussian statistics

    NASA Technical Reports Server (NTRS)

    Sheppard, J. G.

    1972-01-01

    The possibility of using on-line signal statistics to detect electronic equipment nonlinearities is discussed. The results of an investigation using Gaussian statistics are presented, and a nonlinearity test that uses ratios of the moments of a Gaussian random variable is developed and discussed. An outline for further investigation is presented.
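
    The underlying idea can be illustrated directly: for a zero-mean Gaussian input, the moment ratio m4/m2² is fixed at 3, so a departure at a system's output suggests a nonlinear transfer function. A toy sketch (illustrative only, not the report's procedure):

        import numpy as np

        def moment_ratio(signal):
            """Kurtosis ratio m4 / m2^2; approximately 3 for Gaussian data."""
            s = np.asarray(signal) - np.mean(signal)
            return np.mean(s ** 4) / np.mean(s ** 2) ** 2

        rng = np.random.default_rng(5)
        gauss_in = rng.normal(size=100_000)
        linear_out = 2.0 * gauss_in                  # linear gain: ratio stays near 3
        clipped_out = np.clip(gauss_in, -1.0, 1.0)   # saturation: ratio drops below 3
        print(moment_ratio(linear_out), moment_ratio(clipped_out))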

  18. Predicting Slag Generation in Sub-Scale Test Motors Using a Neural Network

    NASA Technical Reports Server (NTRS)

    Wiesenberg, Brent

    1999-01-01

    Generation of slag (aluminum oxide) is an important issue for the Reusable Solid Rocket Motor (RSRM). Thiokol performed testing to quantify the relationship between raw material variations and slag generation in solid propellants by testing sub-scale motors cast with propellant containing various combinations of aluminum fuel and ammonium perchlorate (AP) oxidizer particle sizes. The test data were analyzed using statistical methods and an artificial neural network. This paper primarily addresses the neural network results with some comparisons to the statistical results. The neural network showed that the particle sizes of both the aluminum and unground AP have a measurable effect on slag generation. The neural network analysis showed that aluminum particle size is the dominant driver in slag generation, about 40% more influential than AP. The network predictions of the amount of slag produced during firing of sub-scale motors were 16% better than the predictions of a statistically derived empirical equation. Another neural network successfully characterized the slag generated during full-scale motor tests. The success is attributable to the ability of neural networks to characterize multiple complex factors including interactions that affect slag generation.

  19. An evaluation of grease type ball bearing lubricants operating in various environments

    NASA Technical Reports Server (NTRS)

    Mcmurtrey, E. L.

    1979-01-01

Grease type lubricants in bearings were evaluated in five different adverse environments for a one year period. Four repetitions of each test were made to provide statistical samples. These tests were then used to select four lubricants for five year tests in selected environments with five repetitions of each test for statistical samples. Three five year tests were started in (1) continuous operation; (2) start-stop operation, with both in vacuum at ambient temperatures; and (3) continuous operation at 93.3 °C. To date, in the one year tests, the best results in all environments were obtained with a high viscosity index perfluoroalkylpolyether grease.

  20. Mysid (Mysidopsis bahia) life-cycle test: Design comparisons and assessment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lussier, S.M.; Champlin, D.; Kuhn, A.

    1996-12-31

This study examines ASTM Standard E1191-90, "Standard Guide for Conducting Life-cycle Toxicity Tests with Saltwater Mysids" (1990), using Mysidopsis bahia, by comparing several test designs to assess growth, reproduction, and survival. The primary objective was to determine the most labor efficient and statistically powerful test design for the measurement of statistically detectable effects on biologically sensitive endpoints. Five different test designs were evaluated, varying compartment size, number of organisms per compartment, and sex ratio. Results showed that while paired organisms in the ASTM design had the highest rate of reproduction among designs tested, no individual design had greater statistical power to detect differences in reproductive effects. Reproduction was not statistically different between organisms paired in the ASTM design and those with randomized sex ratios using larger test compartments. These treatments had numerically higher reproductive success and lower within-tank replicate variance than treatments using smaller compartments where organisms were randomized, or had a specific sex ratio. In this study, survival and growth were not statistically different among designs tested. Within-tank replicate variability can be reduced by using many exposure compartments with pairs, or few compartments with many organisms in each. While this improves variance within replicate chambers, it does not strengthen the power of detection among treatments in the test. An increase in the number of true replicates (exposure chambers) to eight will have the effect of reducing the percent detectable difference by a factor of two.

  1. General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies

    PubMed Central

    Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong

    2013-01-01

    We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
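
    For a single variant, the fixed-effects aggregation of score statistics is compact: sum the per-study scores and their variances, then refer the squared ratio to chi-square with one degree of freedom. A toy sketch of that step (the gene-level tests in the paper additionally combine the between-variant LD matrices):

        import numpy as np
        from scipy import stats

        def meta_score_test(scores, variances):
            """Combine per-study score statistics U_k with variances V_k:
            chi2 = (sum U_k)^2 / (sum V_k), 1 df under the null."""
            u, v = np.sum(scores), np.sum(variances)
            chi2 = u ** 2 / v
            return chi2, stats.chi2.sf(chi2, df=1)

        # Hypothetical score statistics for one variant in three studies
        print(meta_score_test([2.1, 1.4, 3.0], [4.0, 2.5, 5.2]))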

  2. Suggestions for presenting the results of data analyses

    USGS Publications Warehouse

    Anderson, David R.; Link, William A.; Johnson, Douglas H.; Burnham, Kenneth P.

    2001-01-01

We give suggestions for the presentation of research results from frequentist, information-theoretic, and Bayesian analysis paradigms, followed by several general suggestions. The information-theoretic and Bayesian methods offer alternative approaches to data analysis and inference compared to traditionally used methods. Guidance is lacking on the presentation of results under these alternative procedures and on nontesting aspects of classical frequentist methods of statistical analysis. Null hypothesis testing has come under intense criticism. We recommend less reporting of the results of statistical tests of null hypotheses in cases where the null is surely false anyway, or where the null hypothesis is of little interest to science or management.

  3. Semantically enabled and statistically supported biological hypothesis testing with tissue microarray databases

    PubMed Central

    2011-01-01

Background Although many biological databases are applying semantic web technologies, meaningful biological hypothesis testing cannot be easily achieved. Database-driven high throughput genomic hypothesis testing requires both the capability of obtaining semantically relevant experimental data and that of performing relevant statistical testing on the retrieved data. Tissue Microarray (TMA) data are semantically rich and contain many biologically important hypotheses waiting for high throughput conclusions. Methods An application-specific ontology was developed for managing TMA and DNA microarray databases by semantic web technologies. Data were represented as Resource Description Framework (RDF) according to the framework of the ontology. Applications for hypothesis testing (Xperanto-RDF) for TMA data were designed and implemented by (1) formulating the syntactic and semantic structures of the hypotheses derived from TMA experiments, (2) formulating SPARQLs to reflect the semantic structures of the hypotheses, and (3) performing statistical tests with the result sets returned by the SPARQLs. Results When a user designs a hypothesis in Xperanto-RDF and submits it, the hypothesis can be tested against TMA experimental data stored in Xperanto-RDF. When we evaluated four previously validated hypotheses as an illustration, all the hypotheses were supported by Xperanto-RDF. Conclusions We demonstrated the utility of high throughput biological hypothesis testing. We believe that preliminary investigation before performing a highly controlled experiment can benefit from this approach. PMID:21342584

  4. Evaluation of cognitive characteristics of patients developing manifestations of parkinsonism secondary to long-term ephedrone use.

    PubMed

    Koksal, Ayhan; Keskinkılıc, Cahit; Sozmen, Mehmet Vedat; Dirican, Ayten Ceyhan; Aysal, Fikret; Altunkaynak, Yavuz; Baybas, Sevim

    2014-01-01

In this study, cognitive functions of 9 patients developing parkinsonism due to chronic manganese intoxication by intravenous methcathinone solution were investigated using detailed neuropsychometric tests. Attention deficit, verbal and nonverbal memory, visuospatial function, constructive ability, language, and executive (frontal) functions of 9 patients who were admitted to our clinic with manifestations of chronic manganese intoxication and 9 control subjects were assessed using neuropsychometric tests. Two years later, detailed repeat neuropsychometric tests were performed in the patient group. The results were evaluated using the χ² test, Fisher's exact probability test, Student's t test and the Mann-Whitney U test. While there was no statistically significant difference between the two groups in language functions, visuospatial functions and constructive ability, a statistically significant difference was noted between the groups regarding attention (p = 0.032), calculation (p = 0.004), recall and recognition domains of verbal memory, nonverbal memory (p = 0.021) and some domains of frontal functions (Stroop-5 and spontaneous recovery) (p = 0.022 and 0.012). Repeat neuropsychometric testing of the patients 2 years later showed no statistically significant changes. It has been observed that cognitive dysfunction seen in parkinsonism secondary to chronic manganese intoxication may be long-lasting and may not recover as observed in motor dysfunction. © 2014 S. Karger AG, Basel.

  5. IMPLEMENTATION AND VALIDATION OF STATISTICAL TESTS IN RESEARCH'S SOFTWARE HELPING DATA COLLECTION AND PROTOCOLS ANALYSIS IN SURGERY.

    PubMed

    Kuretzki, Carlos Henrique; Campos, Antônio Carlos Ligocki; Malafaia, Osvaldo; Soares, Sandramara Scandelari Kusano de Paula; Tenório, Sérgio Bernardo; Timi, Jorge Rufino Ribas

    2016-03-01

The use of information technology is often applied in healthcare. With regard to scientific research, SINPE(c) - Integrated Electronic Protocols - was created as a tool to support researchers, offering clinical data standardization. Until then, SINPE(c) lacked statistical tests obtained by automatic analysis. The objective was to add to SINPE(c) features for the automatic execution of the main statistical methods used in medicine. The study was divided into four topics: checking users' interest in the implementation of the tests; surveying the frequency of their use in healthcare; carrying out the implementation; and validating the results with researchers and their protocols. It was applied to a group of users of this software working on their theses in stricto sensu master's and doctoral degrees in one postgraduate program in surgery. To assess the reliability of the statistics, the data obtained automatically by SINPE(c) were compared with those computed manually by a professional statistician experienced with this type of study. There was interest in the use of automatic statistical tests, with good acceptance. The chi-square, Mann-Whitney, Fisher exact, and Student's t tests were considered the tests most frequently used by participants in medical studies. These methods were implemented and thereafter validated as expected. The automatic statistical analysis incorporated into SINPE(c) proved to be reliable and equivalent to the analysis done manually, validating its use as a tool for medical research.
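
    All four tests named above are available in scipy, which makes a manual cross-check of an automated implementation straightforward. A sketch with hypothetical data (not the SINPE(c) code):

        import numpy as np
        from scipy import stats

        group_a = np.array([5.1, 6.2, 5.8, 6.0, 5.5])
        group_b = np.array([4.8, 5.0, 5.2, 4.9, 5.3])
        table = np.array([[12, 5], [7, 14]])  # hypothetical 2 x 2 outcome table

        print(stats.ttest_ind(group_a, group_b))     # Student's t-test
        print(stats.mannwhitneyu(group_a, group_b))  # Mann-Whitney U test
        chi2, p, dof, expected = stats.chi2_contingency(table)
        print(chi2, p)                               # chi-square test
        print(stats.fisher_exact(table))             # Fisher's exact test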

  6. Texture and haptic cues in slant discrimination: reliability-based cue weighting without statistically optimal cue combination

    NASA Astrophysics Data System (ADS)

    Rosas, Pedro; Wagemans, Johan; Ernst, Marc O.; Wichmann, Felix A.

    2005-05-01

    A number of models of depth-cue combination suggest that the final depth percept results from a weighted average of independent depth estimates based on the different cues available. The weight of each cue in such an average is thought to depend on the reliability of each cue. In principle, such a depth estimation could be statistically optimal in the sense of producing the minimum-variance unbiased estimator that can be constructed from the available information. Here we test such models by using visual and haptic depth information. Different texture types produce differences in slant-discrimination performance, thus providing a means for testing a reliability-sensitive cue-combination model with texture as one of the cues to slant. Our results show that the weights for the cues were generally sensitive to their reliability but fell short of statistically optimal combination - we find reliability-based reweighting but not statistically optimal cue combination.
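
    The statistically optimal combination against which the data are compared weights each unbiased cue estimate by its inverse variance, which minimizes the variance of the combined estimate. A minimal sketch (hypothetical slant estimates):

        import numpy as np

        def optimal_combination(estimates, variances):
            """Minimum-variance combination of unbiased cue estimates:
            w_i proportional to 1/var_i; var_combined = 1/sum(1/var_i)."""
            inv = 1.0 / np.asarray(variances, dtype=float)
            w = inv / inv.sum()
            return float(np.dot(w, estimates)), w, 1.0 / inv.sum()

        # Hypothetical slant estimates (deg) from texture and haptic cues
        slant, weights, var = optimal_combination([32.0, 28.0], [4.0, 9.0])
        print(slant, weights, var)  # the more reliable texture cue dominates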

  7. Comparison of Piezosurgery and Conventional Rotary Instruments for Removal of Impacted Mandibular Third Molars: A Randomized Controlled Clinical and Radiographic Trial

    PubMed Central

    Shokry, Mohamed; Aboelsaad, Nayer

    2016-01-01

The purpose of this study was to test the effect of the surgical removal of impacted mandibular third molars using piezosurgery versus the conventional surgical technique on postoperative sequelae and bone healing. Material and Methods. This study was carried out as a randomized controlled clinical trial with a split-mouth design. Twenty patients with bilateral mandibular third molar mesioangular impaction class II position B indicated for surgical extraction were treated randomly using either the piezosurgery or the conventional bur technique on each site. Duration of the procedure, postoperative edema, trismus, pain, healing, and bone density and quantity were evaluated up to 6 months postoperatively. Results. Test and control sites were compared using a paired t-test. Pain and swelling were significantly reduced at the test sites, whereas the duration of the procedure was significantly longer at the test sites. For bone quantity and quality, a statistically significant difference was found, with the test sites showing better results. Conclusion. The piezosurgery technique improves the quality of patients' lives in the form of decreased postoperative pain, trismus, and swelling. Furthermore, it enhances bone quality within the extraction socket and bone quantity along the distal aspect of the mandibular second molar. PMID:27597866

  8. Does Training in Table Creation Enhance Table Interpretation? A Quasi-Experimental Study with Follow-Up

    ERIC Educational Resources Information Center

    Karazsia, Bryan T.; Wong, Kendal

    2016-01-01

    Quantitative and statistical literacy are core domains in the undergraduate psychology curriculum. An important component of such literacy includes interpretation of visual aids, such as tables containing results from statistical analyses. This article presents results of a quasi-experimental study with longitudinal follow-up that tested the…

  9. Using the Descriptive Bootstrap to Evaluate Result Replicability (Because Statistical Significance Doesn't)

    ERIC Educational Resources Information Center

    Spinella, Sarah

    2011-01-01

    As result replicability is essential to science and difficult to achieve through external replicability, the present paper notes the insufficiency of null hypothesis statistical significance testing (NHSST) and explains the bootstrap as a plausible alternative, with a heuristic example to illustrate the bootstrap method. The bootstrap relies on…

  10. Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity

    PubMed Central

    Beasley, T. Mark

    2013-01-01

Increasing the correlation between the independent variable and the mediator (a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard error of product tests increases. The variance inflation due to increases in a at some point outweighs the increase of the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
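
    The mechanism can be made concrete with the first-order (Sobel) standard error of ab, while letting the standard error of b inflate with collinearity as a grows. A toy sketch (hypothetical numbers):

        import math

        def sobel_z(a, b, se_a, se_b):
            """Sobel test statistic for the mediated effect ab, using
            se(ab) ~ sqrt(b^2 * se_a^2 + a^2 * se_b^2) to first order."""
            return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

        # As a grows, collinearity inflates var(b-hat) by roughly 1/(1 - a^2),
        # so z first rises with the effect size ab, then declines.
        for a in (0.3, 0.6, 0.9):
            se_b = 0.10 / math.sqrt(1.0 - a ** 2)  # VIF-inflated SE (toy values)
            print(a, round(sobel_z(a, 0.3, 0.10, se_b), 2))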

  11. Comparison of skid resistance testing to stopping distance testing.

    DOT National Transportation Integrated Search

    1998-01-01

    This report is intended to statistically summarize the results of a side-by-side test of the skid resistance testing trailer utilized by the Oregon Department of Transportation (ODOT), and the stopping distance car utilized by the Oregon State Police...

  12. 49 CFR 199.117 - Recordkeeping.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... ADMINISTRATION, DEPARTMENT OF TRANSPORTATION (CONTINUED) PIPELINE SAFETY DRUG AND ALCOHOL TESTING Drug Testing... provided by DOT Procedures. Statistical data related to drug testing and rehabilitation that is not name... employee drug test that indicate a verified positive result, records that demonstrate compliance with the...

  13. 49 CFR 199.117 - Recordkeeping.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... ADMINISTRATION, DEPARTMENT OF TRANSPORTATION (CONTINUED) PIPELINE SAFETY DRUG AND ALCOHOL TESTING Drug Testing... provided by DOT Procedures. Statistical data related to drug testing and rehabilitation that is not name... employee drug test that indicate a verified positive result, records that demonstrate compliance with the...

  14. 49 CFR 199.117 - Recordkeeping.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... ADMINISTRATION, DEPARTMENT OF TRANSPORTATION (CONTINUED) PIPELINE SAFETY DRUG AND ALCOHOL TESTING Drug Testing... provided by DOT Procedures. Statistical data related to drug testing and rehabilitation that is not name... employee drug test that indicate a verified positive result, records that demonstrate compliance with the...

  15. 49 CFR 199.117 - Recordkeeping.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... ADMINISTRATION, DEPARTMENT OF TRANSPORTATION (CONTINUED) PIPELINE SAFETY DRUG AND ALCOHOL TESTING Drug Testing... provided by DOT Procedures. Statistical data related to drug testing and rehabilitation that is not name... employee drug test that indicate a verified positive result, records that demonstrate compliance with the...

  16. 49 CFR 199.117 - Recordkeeping.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... ADMINISTRATION, DEPARTMENT OF TRANSPORTATION (CONTINUED) PIPELINE SAFETY DRUG AND ALCOHOL TESTING Drug Testing... provided by DOT Procedures. Statistical data related to drug testing and rehabilitation that is not name... employee drug test that indicate a verified positive result, records that demonstrate compliance with the...

  17. A study of environmental characterization of conventional and advanced aluminum alloys for selection and design. Phase 2: The breaking load test method

    NASA Technical Reports Server (NTRS)

    Sprowls, D. O.; Bucci, R. J.; Ponchel, B. M.; Brazill, R. L.; Bretz, P. E.

    1984-01-01

A technique is demonstrated for accelerated stress corrosion testing of high strength aluminum alloys. The method offers better precision and shorter exposure times than traditional pass/fail procedures. The approach uses data from tension tests performed on replicate groups of smooth specimens after various lengths of exposure to static stress. The breaking strength measures degradation in the test specimen's load-carrying ability due to the environmental attack. Analysis of breaking load data by extreme value statistics enables the calculation of survival probabilities and a statistically defined threshold stress applicable to the specific test conditions. A fracture mechanics model is given which quantifies depth of attack in the stress corroded specimen by an effective flaw size calculated from the breaking stress and the material strength and fracture toughness properties. Comparisons are made with experimental results from three tempers of 7075 alloy plate tested by the breaking load method and by traditional tests of statically loaded smooth tension bars and conventional precracked specimens.
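
    The statistical step can be sketched with a Weibull fit to the breaking stresses of one exposure group; the survival probability at a given applied stress then follows from the fitted survival function (hypothetical data, not the report's actual procedure):

        import numpy as np
        from scipy import stats

        # Hypothetical breaking stresses (MPa) of replicate smooth specimens
        # after one exposure period
        breaking = np.array([412.0, 398.0, 405.0, 388.0, 421.0, 395.0, 402.0, 390.0])

        # Two-parameter Weibull fit (location fixed at zero)
        shape, loc, scale = stats.weibull_min.fit(breaking, floc=0)

        # Estimated probability that a specimen survives a 380 MPa stress
        print(stats.weibull_min.sf(380.0, shape, loc=loc, scale=scale))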

  18. Combining Shapley value and statistics to the analysis of gene expression data in children exposed to air pollution

    PubMed Central

    Moretti, Stefano; van Leeuwen, Danitsja; Gmuender, Hans; Bonassi, Stefano; van Delft, Joost; Kleinjans, Jos; Patrone, Fioravante; Merlo, Domenico Franco

    2008-01-01

Background In gene expression analysis, statistical tests for differential gene expression provide lists of candidate genes having, individually, a sufficiently low p-value. However, the interpretation of each single p-value within complex systems involving several interacting genes is problematic. In parallel, in the last sixty years, game theory has been applied to political and social problems to assess the power of interacting agents in forcing a decision and, more recently, to represent the relevance of genes in response to certain conditions. Results In this paper we introduce a bootstrap procedure to test the null hypothesis that each gene has the same relevance between two conditions, where the relevance is represented by the Shapley value of a particular coalitional game defined on a microarray data-set. This method, called Comparative Analysis of Shapley value (CASh for short), is applied to data concerning the gene expression in children differentially exposed to air pollution. The results provided by CASh are compared with the results from a parametric statistical test for testing differential gene expression. Both lists of genes provided by CASh and the t-test are informative enough to discriminate exposed subjects on the basis of their gene expression profiles. While many genes are selected in common by CASh and the parametric test, it turns out that the biological interpretation of the differences between these two selections is more interesting, suggesting a different interpretation of the main biological pathways in gene expression regulation for exposed individuals. A simulation study suggests that CASh offers more power than the t-test for the detection of differential gene expression variability. Conclusion CASh is successfully applied to gene expression analysis of a data-set where the joint expression behavior of genes may be critical to characterize the expression response to air pollution. We demonstrate a synergistic effect between coalitional games and statistics that resulted in a selection of genes with a potential impact in the regulation of complex pathways. PMID:18764936

  19. Comparing Simulated and Theoretical Sampling Distributions of the U3 Person-Fit Statistic.

    ERIC Educational Resources Information Center

    Emons, Wilco H. M.; Meijer, Rob R.; Sijtsma, Klaas

    2002-01-01

    Studied whether the theoretical sampling distribution of the U3 person-fit statistic is in agreement with the simulated sampling distribution under different item response theory models and varying item and test characteristics. Simulation results suggest that the use of standard normal deviates for the standardized version of the U3 statistic may…

  20. Uncertainty Analysis of Inertial Model Attitude Sensor Calibration and Application with a Recommended New Calibration Method

    NASA Technical Reports Server (NTRS)

    Tripp, John S.; Tcheng, Ping

    1999-01-01

    Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.

  1. Testing homogeneity of proportion ratios for stratified correlated bilateral data in two-arm randomized clinical trials.

    PubMed

    Pei, Yanbo; Tian, Guo-Liang; Tang, Man-Lai

    2014-11-10

Stratified data analysis is an important research topic in many biomedical studies and clinical trials. In this article, we develop five test statistics for testing the homogeneity of proportion ratios for stratified correlated bilateral binary data based on an equal correlation model assumption. Bootstrap procedures based on these test statistics are also considered. To evaluate the performance of these statistics and procedures, we conduct Monte Carlo simulations to study their empirical sizes and powers under various scenarios. Our results suggest that the procedure based on the score statistic generally performs well and is highly recommended. When the sample size is large, procedures based on the commonly used weighted least square estimate and logarithmic transformation with Mantel-Haenszel estimate are recommended as they do not involve any computation of maximum likelihood estimates requiring iterative algorithms. We also derive approximate sample size formulas based on the recommended test procedures. Finally, we apply the proposed methods to analyze a multi-center randomized clinical trial for scleroderma patients. Copyright © 2014 John Wiley & Sons, Ltd.

  2. The Sport Students’ Ability of Literacy and Statistical Reasoning

    NASA Astrophysics Data System (ADS)

    Hidayah, N.

    2017-03-01

    Literacy and statistical reasoning are very important abilities for students at sport education colleges, because statistics learning materials can be drawn from their many activities, such as sport competitions, test and measurement results, predicting achievement based on training, and finding connections among variables. This research describes sport education college students' literacy and statistical reasoning abilities related to identifying data types, probability, table interpretation, description and explanation using bar or pie charts, explanation of variability, and the calculation and interpretation of mean, median, and mode, as measured with an instrument. The instrument was administered to 50 college students majoring in sport; only 26% of the students scored above 30%, while the others scored below 30%. Across all subjects, 56% of the students could identify and classify data, 49% could read, display, and interpret tables through graphics, 27% succeeded on probability, 33% could describe variability, and 16.32% could read, calculate, and describe mean, median, and mode. The results show that the sport students' literacy and statistical reasoning abilities are not yet adequate, and that their statistics study has not reached the level of conceptual understanding or of training in literacy and statistical reasoning, so it is critical to strengthen these abilities.

  3. Efficient statistical tests to compare Youden index: accounting for contingency correlation.

    PubMed

    Chen, Fangyao; Xue, Yuqiang; Tan, Ming T; Chen, Pingyan

    2015-04-30

    Youden index is widely utilized in studies evaluating the accuracy of diagnostic tests and the performance of predictive, prognostic, or risk models. However, both the one-sample and the two-independent-sample tests on the Youden index have been derived ignoring the dependence (association) between sensitivity and specificity, resulting in potentially misleading findings; moreover, a paired-sample test on the Youden index has been unavailable. This article develops efficient statistical inference procedures for one-sample, independent-sample, and paired-sample tests on the Youden index by accounting for contingency correlation, namely the associations between sensitivity and specificity and between paired samples typically represented in contingency tables. For the one- and two-independent-sample tests, the variances are estimated by the delta method, and the statistical inference is based on the central limit theorem; these results are then verified by bootstrap estimates. For the paired-sample test, we show that the estimated covariance of the two sensitivities and specificities can be represented as a function of the kappa statistic, so the test can be readily carried out. We then show the remarkable accuracy of the estimated variance using a constrained optimization approach. Simulation is performed to evaluate the statistical properties of the derived tests. The proposed approaches yield more stable type I errors at the nominal level and substantially higher power (efficiency) than the original Youden's approach. Therefore, the simple explicit large-sample solution performs very well. Because the asymptotic and exact bootstrap computations can readily be implemented with common software like R, the method is broadly applicable to the evaluation of diagnostic tests and model performance. Copyright © 2015 John Wiley & Sons, Ltd.
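
    For concreteness, the Youden index itself and the naive large-sample variance that the paper improves upon (the form that ignores any association between sensitivity and specificity) can be sketched in a few lines of Python; the bootstrap check mirrors the paper's verification strategy. This illustrates the quantities involved, not the authors' correlation-adjusted procedure.

```python
import numpy as np

def youden_index(tp, fn, tn, fp):
    """Youden index J = sensitivity + specificity - 1 from a 2x2 table."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    return se + sp - 1.0

def naive_var(tp, fn, tn, fp):
    """Large-sample variance assuming Se and Sp are independent
    (the simplification the paper moves beyond): Var(J) = Var(Se) + Var(Sp)."""
    se = tp / (tp + fn); sp = tn / (tn + fp)
    return se*(1-se)/(tp+fn) + sp*(1-sp)/(tn+fp)

def bootstrap_var(tp, fn, tn, fp, n_boot=5000, seed=0):
    """Bootstrap check: resample diseased and non-diseased groups separately."""
    rng = np.random.default_rng(seed)
    n1, n0 = tp + fn, tn + fp
    se_b = rng.binomial(n1, tp / n1, n_boot) / n1
    sp_b = rng.binomial(n0, tn / n0, n_boot) / n0
    return np.var(se_b + sp_b - 1.0, ddof=1)

j = youden_index(90, 10, 80, 20)   # J = 0.9 + 0.8 - 1 = 0.7
```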

  4. Evaluation of setting time and flow properties of self-synthesize alginate impressions

    NASA Astrophysics Data System (ADS)

    Halim, Calista; Cahyanto, Arief; Sriwidodo, Harsatiningsih, Zulia

    2018-02-01

    Alginate is an elastic hydrocolloid dental impression material used to obtain a negative reproduction of the oral mucosa, such as recording soft-tissue and occlusal relationships. The aim of the present study was to synthesize alginate and to determine its setting time and flow properties. There were five groups of alginate, comprising fifty samples of self-synthesized alginate and a commercial alginate impression product. The fifty samples were divided between two tests, twenty-five each for the setting time and flow tests. Setting time was recorded in seconds; flow was recorded in mm². The fastest setting time was in group three (148.8 s) and the slowest in group four. The highest flow result was in group three (69.70 mm²) and the lowest in group one (58.34 mm²). Results were analyzed statistically by one-way ANOVA (α = 0.05), which showed a statistically significant difference in setting time but no statistically significant difference in flow properties between the self-synthesized alginates and the commercial alginate impression product. In conclusion, alginate impression material was successfully self-synthesized, and variations in composition influence setting time and flow properties. The setting time most resembling the control group was that of group three; the flow most resembling the control group was that of group four.

  5. Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test.

    PubMed

    Swanson, David M; Blacker, Deborah; Alchawa, Taofik; Ludwig, Kerstin U; Mangold, Elisabeth; Lange, Christoph

    2013-11-07

    The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective and those methods currently implemented are often permutation-based. One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to "filter" redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate. We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disorder (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known.
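
    The summary-statistic test requires only marker Z-statistics and an estimated marker correlation matrix. One standard way to combine correlated Z-statistics into a region-level test — shown here as a plausible sketch, not the authors' exact formulation — is the quadratic form z′R⁻¹z, which is chi-square with m degrees of freedom when R is correctly specified; the ridge term illustrates one crude guard against a misspecified R.

```python
import numpy as np
from scipy import stats

def region_test(z, R, ridge=0.0):
    """Combine marker-level Z-statistics into one region-level p-value.

    z: length-m vector of Z-statistics from univariate marker regressions.
    R: m x m estimated correlation matrix of the markers (e.g., LD-based).
    ridge: shrinkage toward the identity; a simple protection when R is
           estimated from external data (e.g., HapMap) and may be off.
    """
    m = len(z)
    R_shrunk = (1 - ridge) * np.asarray(R) + ridge * np.eye(m)
    stat = z @ np.linalg.solve(R_shrunk, z)   # z' R^{-1} z
    return stat, stats.chi2.sf(stat, df=m)   # chi2_m under the null
```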

  6. The extended statistical analysis of toxicity tests using standardised effect sizes (SESs): a comparison of nine published papers.

    PubMed

    Festing, Michael F W

    2014-01-01

    The safety of chemicals, drugs, novel foods and genetically modified crops is often tested using repeat-dose sub-acute toxicity tests in rats or mice. It is important to avoid misinterpretations of the results, as these tests are used to help determine safe exposure levels in humans. Treated and control groups are compared for a range of haematological, biochemical and other biomarkers which may indicate tissue damage or other adverse effects. However, the statistical analysis and presentation of such data pose problems due to the large number of statistical tests involved. Often, it is not clear whether a "statistically significant" effect is real or a false positive (type I error) due to sampling variation. The authors' conclusions appear to be reached somewhat subjectively from the pattern of statistical significances, discounting those which they judge to be type I errors and ignoring any biomarker where the p-value is greater than p = 0.05. However, by using standardised effect sizes (SESs), a range of graphical methods and an overall assessment of the mean absolute response can be made. The approach is an extension, not a replacement, of existing methods. It is intended to assist toxicologists and regulators in the interpretation of the results. Here, the SES analysis has been applied to data from nine published sub-acute toxicity tests in order to compare the findings with those of the original authors. Line plots, box plots and bar plots show the pattern of response. Dose-response relationships are easily seen. A "bootstrap" test compares the mean absolute differences across dose groups. In four out of seven papers where the no observed adverse effect level (NOAEL) was estimated by the authors, it was set too high according to the bootstrap test, suggesting that possible toxicity is under-estimated.
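
    A minimal sketch of the core quantity, assuming SES is taken as the usual standardized mean difference per biomarker; the paper's bootstrap then compares the mean absolute SES across dose groups by resampling animals within groups.

```python
import numpy as np

def ses(treated, control):
    """Standardised effect size for one biomarker:
    (treated mean - control mean) / pooled standard deviation."""
    n1, n2 = len(treated), len(control)
    pooled_sd = np.sqrt(((n1 - 1) * np.var(treated, ddof=1) +
                         (n2 - 1) * np.var(control, ddof=1)) / (n1 + n2 - 2))
    return (np.mean(treated) - np.mean(control)) / pooled_sd

def mean_abs_ses(dose_group, control):
    """Mean |SES| across biomarkers; dose_group and control map
    biomarker name -> array of per-animal values for that group."""
    return np.mean([abs(ses(dose_group[b], control[b])) for b in dose_group])
```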

  7. Drug Testing: A National Controversy.

    ERIC Educational Resources Information Center

    Stone, Karen; Thompson, Judith R.

    1989-01-01

    Addresses some of the controversies and illustrates the historical background of drug testing and what the different drug tests are. Also outlines some national statistics and opinions on drug testing and the results of a survey taken of a Louisiana population dealing with this issue. (NB)

  8. Introducing 3D U-statistic method for separating anomaly from background in exploration geochemical data with associated software development

    NASA Astrophysics Data System (ADS)

    Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir

    2016-03-01

    The U-statistic method is one of the most important structural methods for separating anomaly from background. It considers the locations of samples and carries out the statistical analysis of the data without judging from a geochemical point of view, attempting to separate subpopulations and determine anomalous areas. In the present study, to apply the U-statistic method under three-dimensional (3D) conditions, it is applied to the grades of two ideal test examples while taking the samples' Z values (elevation) into account; so far, this is the first time the method has been applied in 3D. To evaluate the performance of the 3D U-statistic method, and to compare it with a non-structural method, threshold assessment based on the median and standard deviation (MSD method) is applied to the same two test examples. Results show that the samples indicated as anomalous by the U-statistic method are more regular and less dispersed than those indicated by the MSD method, so that, based on the locations of the anomalous samples, their denser areas can be delineated as promising zones. Moreover, results show that at a threshold of U = 0, the total misclassification error of the U-statistic method is much smaller than that of the x̄ + n×s criterion. Finally, a 3D model of the two test examples, separating anomaly from background with the 3D U-statistic method, is provided. The source code for a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all geochemical varieties and can be used in similar exploration projects.
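
    The non-structural comparison criterion quoted in the abstract, x̄ + n×s, is easy to sketch; note that, unlike the U-statistic method, it ignores the sample coordinates entirely. The data below are synthetic placeholders, not the paper's test examples.

```python
import numpy as np

def threshold_anomalies(grades, n=2.0):
    """Non-structural thresholding: flag samples whose grade exceeds
    x̄ + n·s (the comparison criterion quoted in the abstract)."""
    threshold = grades.mean() + n * grades.std(ddof=1)
    return grades > threshold

# Hypothetical usage: grades measured at 3D sample locations (x, y, z);
# the U-statistic method additionally uses those coordinates, which this
# simple thresholding ignores.
rng = np.random.default_rng(0)
grades = np.concatenate([rng.lognormal(0.0, 0.5, 500),   # background
                         rng.lognormal(1.5, 0.4, 20)])   # anomalous zone
mask = threshold_anomalies(grades, n=2.0)
```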

  9. Asymptotic formulae for likelihood-based tests of new physics

    NASA Astrophysics Data System (ADS)

    Cowan, Glen; Cranmer, Kyle; Gross, Eilam; Vitells, Ofer

    2011-02-01

    We describe likelihood-based statistical tests for use in high energy physics for the discovery of new phenomena and for construction of confidence intervals on model parameters. We focus on the properties of the test procedures that allow one to account for systematic uncertainties. Explicit formulae for the asymptotic distributions of test statistics are derived using results of Wilks and Wald. We motivate and justify the use of a representative data set, called the "Asimov data set", which provides a simple method to obtain the median experimental sensitivity of a search or measurement as well as fluctuations about this expectation.
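
    For the simplest case covered by these formulae — a counting experiment with expected signal s on top of a known background b — evaluating the profile-likelihood discovery test on the Asimov data set gives the closed-form median significance below; for s ≪ b it approaches the familiar s/√b.

```python
import math

def asimov_discovery_significance(s, b):
    """Median significance for discovering signal s over background b,
    from the Asimov evaluation of the profile-likelihood ratio
    (counting experiment, background known)."""
    return math.sqrt(2.0 * ((s + b) * math.log(1.0 + s / b) - s))

print(asimov_discovery_significance(10, 100))   # ~0.98
print(10 / math.sqrt(100))                      # 1.00, the s/sqrt(b) limit
```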

  10. Statistics Clinic

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Foy, Millennia; Ploutz-Snyder, Robert; Fiedler, James

    2014-01-01

    Do you have elevated p-values? Is the data analysis process getting you down? Do you experience anxiety when you need to respond to criticism of statistical methods in your manuscript? You may be suffering from Insufficient Statistical Support Syndrome (ISSS). For symptomatic relief of ISSS, come for a free consultation with JSC biostatisticians at our help desk during the poster sessions at the HRP Investigators Workshop. Get answers to common questions about sample size, missing data, multiple testing, when to trust the results of your analyses and more. Side effects may include sudden loss of statistics anxiety, improved interpretation of your data, and increased confidence in your results.

  11. Improving Non-Destructive Concrete Strength Tests Using Support Vector Machines

    PubMed Central

    Shih, Yi-Fan; Wang, Yu-Ren; Lin, Kuo-Liang; Chen, Chin-Wen

    2015-01-01

    Non-destructive testing (NDT) methods are important alternatives when destructive tests are not feasible to examine the in situ concrete properties without damaging the structure. The rebound hammer test and the ultrasonic pulse velocity test are two popular NDT methods to examine the properties of concrete. The rebound of the hammer depends on the hardness of the test specimen and ultrasonic pulse travelling speed is related to density, uniformity, and homogeneity of the specimen. Both of these two methods have been adopted to estimate the concrete compressive strength. Statistical analysis has been implemented to establish the relationship between hammer rebound values/ultrasonic pulse velocities and concrete compressive strength. However, the estimated results can be unreliable. As a result, this research proposes an Artificial Intelligence model using support vector machines (SVMs) for the estimation. Data from 95 cylinder concrete samples are collected to develop and validate the model. The results show that combined NDT methods (also known as SonReb method) yield better estimations than single NDT methods. The results also show that the SVMs model is more accurate than the statistical regression model. PMID:28793627
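
    A minimal sketch of the combined (SonReb-style) estimation with support vector regression, using scikit-learn rather than whatever toolchain the authors used; the features, hyperparameters, and data values here are placeholders.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# X: rebound number and ultrasonic pulse velocity (km/s) per cylinder;
# y: measured compressive strength (MPa). Values are illustrative only.
X = np.array([[30.0, 4.1], [35.5, 4.4], [28.2, 3.9], [40.1, 4.6]])
y = np.array([24.0, 31.5, 21.0, 38.2])

model = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=10.0, epsilon=0.5))
scores = cross_val_score(model, X, y, cv=2, scoring='neg_mean_absolute_error')
model.fit(X, y)
pred = model.predict([[33.0, 4.2]])   # estimate strength for a new specimen
```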

  12. Nonparametric predictive inference for combining diagnostic tests with parametric copula

    NASA Astrophysics Data System (ADS)

    Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.

    2017-09-01

    Measuring the accuracy of diagnostic tests is crucial in many application areas, including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests, and the area under the ROC curve (AUC) is often used as a measure of a test's overall performance. In this paper, we are interested in developing strategies for combining test results in order to increase diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results, taking the dependence structure into account through a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations; it uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula is a well-known statistical concept for modelling the dependence of random variables: a joint distribution function whose marginals are all uniformly distributed, which can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely the maximum likelihood estimator (MLE). We investigate the performance of the proposed method on data sets from the literature and discuss the results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
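
    As an illustration of the parametric-copula step only — assuming the Gaussian family, which may or may not be among those the authors compare — the copula parameter can be fitted by maximum likelihood on rank-based pseudo-observations:

```python
import numpy as np
from scipy import stats, optimize

def gaussian_copula_loglik(rho, u, v):
    """Log-likelihood of a bivariate Gaussian copula at correlation rho,
    evaluated at pseudo-observations u, v in (0, 1)."""
    x, y = stats.norm.ppf(u), stats.norm.ppf(v)
    r2 = rho * rho
    return np.sum(-0.5 * np.log(1 - r2)
                  + (2 * rho * x * y - r2 * (x * x + y * y)) / (2 * (1 - r2)))

def fit_gaussian_copula(scores1, scores2):
    """MLE of the copula parameter from two diagnostic test scores;
    pseudo-observations are empirical ranks scaled into (0, 1)."""
    n = len(scores1)
    u = stats.rankdata(scores1) / (n + 1)
    v = stats.rankdata(scores2) / (n + 1)
    res = optimize.minimize_scalar(lambda r: -gaussian_copula_loglik(r, u, v),
                                   bounds=(-0.99, 0.99), method='bounded')
    return res.x
```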

  13. High-resolution image reconstruction technique applied to the optical testing of ground-based astronomical telescopes

    NASA Astrophysics Data System (ADS)

    Jin, Zhenyu; Lin, Jing; Liu, Zhong

    2008-07-01

    By studying the classical techniques (such as the Shack-Hartmann wave-front sensor) adopted for testing the aberrations of ground-based astronomical optical telescopes, we put forward two testing methods founded on high-resolution image reconstruction technology. One is based on the averaged short-exposure OTF and the other on the speckle interferometric OTF (SITF) introduced by Antoine Labeyrie. Research by J. Ohtsubo, F. Roddier, Richard Barakat and J.-Y. Zhang indicated that the SITF statistics are affected by the telescope's optical aberrations, meaning that the SITF statistics are a function of the optical system aberration and the atmospheric Fried parameter (seeing). Telescope diffraction-limited information can be obtained through two statistical treatments of abundant speckle images: with the first method, we can extract low-frequency information, such as the full width at half maximum (FWHM) of the telescope PSF, to estimate the optical quality; with the second, we can obtain a more precise description of the telescope PSF including high-frequency information. We will apply the two testing methods to the 2.4 m optical telescope of the GMG Observatory in China to validate their repeatability and correctness, and compare the testing results with those obtained by the Shack-Hartmann wave-front sensor. This part is described in detail in our paper.

  14. Factors related to student performance in statistics courses in Lebanon

    NASA Astrophysics Data System (ADS)

    Naccache, Hiba Salim

    The purpose of the present study was to identify factors that may contribute to business students in Lebanese universities having difficulty in introductory and advanced statistics courses. Two statistics courses are required for business majors at Lebanese universities, and students are not required to take any math courses before enrolling in them. Drawing on recent educational research, this dissertation attempted to identify the relationships between (1) students' scores on Lebanese university math admissions tests; (2) students' scores on a test of very basic mathematical concepts; (3) students' scores on the Survey of Attitudes Toward Statistics (SATS); (4) course performance as measured by students' final scores in the course; and (5) their scores on the final exam. Data were collected from 561 students enrolled in multiple sections of two courses: 307 students in the introductory statistics course and 260 in the advanced statistics course, across seven campuses in Lebanon over one semester. The multiple regression results revealed four significant relationships at the introductory level: between students' scores on the math quiz and (1) their final exam scores and (2) their final averages, and between the Cognitive subscale of the SATS and (3) their final exam scores and (4) their final averages. These four significant relationships were also found at the advanced level. In addition, two more significant relationships were found between students' final averages and the Effort (5) and Affect (6) subscales. No relationship was found between students' scores on the Lebanese admissions math tests and either their final exam scores or their final averages, in both the introductory and advanced courses. Although these results were consistent across course formats and instructors, they may encourage Lebanese universities to assess the effectiveness of prerequisite math courses. Moreover, these findings may lead the Lebanese Ministry of Education to make changes to the admissions exams, course prerequisites, and course content. Finally, to enhance students' attitudes, new learning techniques, such as group work during class meetings, can be helpful, and future research should aim to test the effectiveness of these pedagogical techniques on students' attitudes toward statistics.

  15. Sequential CFAR detectors using a dead-zone limiter

    NASA Astrophysics Data System (ADS)

    Tantaratana, Sawasd

    1990-09-01

    The performances of some proposed sequential constant-false-alarm-rate (CFAR) detectors are evaluated. The observations are passed through a dead-zone limiter, the output of which is -1, 0, or +1, depending on whether the input is less than -c, between -c and c, or greater than c, where c is a constant. The test statistic is the sum of the limiter outputs; equivalently, the test operates on the reduced set of data (observations with absolute value larger than c), with the statistic being the sum of their signs. Both constant and linear boundaries are considered. Numerical results show a significant reduction in the average number of observations needed to achieve the same false-alarm and detection probabilities as a fixed-sample-size CFAR detector using the same kind of test statistic.
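
    A minimal sketch of the detector with constant boundaries (the paper also analyzes linear boundaries); the boundary values a < 0 < b and the clipping level c are design parameters chosen here for illustration, not values from the paper.

```python
import numpy as np

def dead_zone_sequential_test(x, c, a, b):
    """Sequential CFAR-style detector with a dead-zone limiter.

    Each observation maps to -1, 0, or +1 according to whether it lies
    below -c, inside [-c, c], or above c; the running sum of these
    outputs is the test statistic. Decide H1 (signal present) when the
    sum reaches b, H0 (noise only) when it falls to a; otherwise keep
    observing."""
    s = 0
    for n, xi in enumerate(x, start=1):
        s += int(xi > c) - int(xi < -c)   # dead-zone limiter output
        if s >= b:
            return 'H1', n                # detection after n samples
        if s <= a:
            return 'H0', n                # dismissal after n samples
    return 'undecided', len(x)

# Example: noise-only observations rarely reach the upper boundary.
rng = np.random.default_rng(0)
print(dead_zone_sequential_test(rng.normal(0, 1, 200), c=0.5, a=-8, b=8))
```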

  16. Testing of Hypothesis in Equivalence and Non Inferiority Trials-A Concept.

    PubMed

    Juneja, Atul; Aggarwal, Abha R; Adhikari, Tulsi; Pandey, Arvind

    2016-04-01

    Establishing the appropriate hypothesis is one of the important steps in carrying out statistical tests and analyses, and understanding it is important for interpreting the results of statistical analysis. The current communication presents the concept of hypothesis testing in non-inferiority and equivalence trials, where the null hypothesis is the reverse of that set up for conventional superiority trials. As in superiority trials, it is the rejection of the null hypothesis that establishes the fact the researcher intends to prove. It is important to note that equivalence or non-inferiority cannot be proved by accepting a null hypothesis of no difference. Hence, establishing the appropriate statistical hypothesis is extremely important for arriving at meaningful conclusions for the stated research objectives.

  17. Evaluation of PDA Technical Report No 33. Statistical Testing Recommendations for a Rapid Microbiological Method Case Study.

    PubMed

    Murphy, Thomas; Schwedock, Julie; Nguyen, Kham; Mills, Anna; Jones, David

    2015-01-01

    New recommendations for the validation of rapid microbiological methods have been included in the revised Technical Report 33 release from the PDA. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This case study applies those statistical methods to accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological methods system being evaluated for water bioburden testing. Results presented demonstrate that the statistical methods described in the PDA Technical Report 33 chapter can all be successfully applied to the rapid microbiological method data sets and gave the same interpretation for equivalence to the standard method. The rapid microbiological method was in general able to pass the requirements of PDA Technical Report 33, though the study shows that there can be occasional outlying results and that caution should be used when applying statistical methods to low average colony-forming unit values. Prior to use in a quality-controlled environment, any new method or technology has to be shown to work as designed by the manufacturer for the purpose required. For new rapid microbiological methods that detect and enumerate contaminating microorganisms, additional recommendations have been provided in the revised PDA Technical Report No. 33. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This paper applies those statistical methods to analyze accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological method system being validated for water bioburden testing. The case study demonstrates that the statistical methods described in the PDA Technical Report No. 33 chapter can be successfully applied to rapid microbiological method data sets and give the same comparability results for similarity or difference as the standard method. © PDA, Inc. 2015.

  18. Statistical methods for the quality control of steam cured concrete : final report.

    DOT National Transportation Integrated Search

    1971-01-01

    Concrete strength test results from three prestressing plants utilizing steam curing were evaluated statistically in terms of the concrete as received and the effectiveness of the plants' steaming procedures. Control charts were prepared to show tren...

  19. The longevity of statistical learning: When infant memory decays, isolated words come to the rescue.

    PubMed

    Karaman, Ferhat; Hay, Jessica F

    2018-02-01

    Research over the past 2 decades has demonstrated that infants are equipped with remarkable computational abilities that allow them to find words in continuous speech. Infants can encode information about the transitional probability (TP) between syllables to segment words from artificial and natural languages. As previous research has tested infants immediately after familiarization, infants' ability to retain sequential statistics beyond the immediate familiarization context remains unknown. Here, we examine infants' memory for statistically defined words 10 min after familiarization with an Italian corpus. Eight-month-old English-learning infants were familiarized with Italian sentences that contained 4 embedded target words-2 words had high internal TP (HTP, TP = 1.0) and 2 had low TP (LTP, TP = .33)-and were tested on their ability to discriminate HTP from LTP words using the Headturn Preference Procedure. When tested after a 10-min delay, infants failed to discriminate HTP from LTP words, suggesting that memory for statistical information likely decays over even short delays (Experiment 1). Experiments 2-4 were designed to test whether experience with isolated words selectively reinforces memory for statistically defined (i.e., HTP) words. When 8-month-olds were given additional experience with isolated tokens of both HTP and LTP words immediately after familiarization, they looked significantly longer on HTP than LTP test trials 10 min later. Although initial representations of statistically defined words may be fragile, our results suggest that experience with isolated words may reinforce the output of statistical learning by helping infants create more robust memories for words with strong versus weak co-occurrence statistics. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
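
    The transitional probabilities at the heart of this paradigm are simple to compute: TP(x→y) = frequency(xy) / frequency(x). A toy sketch with made-up syllables (not the study's Italian stimuli):

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(x -> y) = frequency(xy) / frequency(x) over a syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])
    return {(x, y): c / first_counts[x] for (x, y), c in pair_counts.items()}

# 'fu ga' is a consistent word (internal TP = 1.0), while the transitions
# out of 'ga' vary, yielding lower TPs at the word boundary.
stream = ['fu', 'ga', 'me', 'lo', 'fu', 'ga', 'bi', 'du', 'fu', 'ga']
tps = transitional_probabilities(stream)
print(tps[('fu', 'ga')])   # 1.0 -> high-TP, word-internal transition
print(tps[('ga', 'me')])   # 0.5 -> lower-TP, word-boundary transition
```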

  20. Potential errors and misuse of statistics in studies on leakage in endodontics.

    PubMed

    Lucena, C; Lopez, J M; Pulgar, R; Abalos, C; Valderrama, M J

    2013-04-01

    To assess the quality of the statistical methodology used in studies of leakage in Endodontics, and to compare the results found using appropriate versus inappropriate inferential statistical methods. The search strategy used the descriptors 'root filling' 'microleakage', 'dye penetration', 'dye leakage', 'polymicrobial leakage' and 'fluid filtration' for the time interval 2001-2010 in journals within the categories 'Dentistry, Oral Surgery and Medicine' and 'Materials Science, Biomaterials' of the Journal Citation Report. All retrieved articles were reviewed to find potential pitfalls in statistical methodology that may be encountered during study design, data management or data analysis. The database included 209 papers. In all the studies reviewed, the statistical methods used were appropriate for the category attributed to the outcome variable, but in 41% of the cases, the chi-square test or parametric methods were inappropriately selected subsequently. In 2% of the papers, no statistical test was used. In 99% of cases, a statistically 'significant' or 'not significant' effect was reported as a main finding, whilst only 1% also presented an estimation of the magnitude of the effect. When the appropriate statistical methods were applied in the studies with originally inappropriate data analysis, the conclusions changed in 19% of the cases. Statistical deficiencies in leakage studies may affect their results and interpretation and might be one of the reasons for the poor agreement amongst the reported findings. Therefore, more effort should be made to standardize statistical methodology. © 2012 International Endodontic Journal.

  1. A weighted generalized score statistic for comparison of predictive values of diagnostic tests.

    PubMed

    Kosinski, Andrzej S

    2013-03-15

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations that are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we presented, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic that incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, always reduces to the score statistic in the independent samples situation, and preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe that the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the WGS test statistic in a general GEE setting. Copyright © 2012 John Wiley & Sons, Ltd.

  2. A weighted generalized score statistic for comparison of predictive values of diagnostic tests

    PubMed Central

    Kosinski, Andrzej S.

    2013-01-01

    Positive and negative predictive values are important measures of a medical diagnostic test performance. We consider testing equality of two positive or two negative predictive values within a paired design in which all patients receive two diagnostic tests. The existing statistical tests for testing equality of predictive values are either Wald tests based on the multinomial distribution or the empirical Wald and generalized score tests within the generalized estimating equations (GEE) framework. As presented in the literature, these test statistics have considerably complex formulas without clear intuitive insight. We propose their re-formulations which are mathematically equivalent but algebraically simple and intuitive. As is clearly seen with a new re-formulation we present, the generalized score statistic does not always reduce to the commonly used score statistic in the independent samples case. To alleviate this, we introduce a weighted generalized score (WGS) test statistic which incorporates empirical covariance matrix with newly proposed weights. This statistic is simple to compute, it always reduces to the score statistic in the independent samples situation, and it preserves type I error better than the other statistics as demonstrated by simulations. Thus, we believe the proposed WGS statistic is the preferred statistic for testing equality of two predictive values and for corresponding sample size computations. The new formulas of the Wald statistics may be useful for easy computation of confidence intervals for difference of predictive values. The introduced concepts have potential to lead to development of the weighted generalized score test statistic in a general GEE setting. PMID:22912343

  3. Results of the Intelligence Test for Visually Impaired Children (ITVIC).

    ERIC Educational Resources Information Center

    Dekker, R.; And Others

    1991-01-01

    Statistical analyses of scores on subtests of the Intelligence Test for Visually Impaired Children were done for two groups of children, either with or without usable vision. Results suggest that the battery has differential factorial and predictive validity. (Author/DB)

  4. Measuring an Effect Size from Dichotomized Data: Contrasted Results Whether Using a Correlation or an Odds Ratio

    ERIC Educational Resources Information Center

    Rousson, Valentin

    2014-01-01

    It is well known that dichotomizing continuous data has the effect to decrease statistical power when the goal is to test for a statistical association between two variables. Modern researchers however are focusing not only on statistical significance but also on an estimation of the "effect size" (i.e., the strength of association…

  5. 'Chain pooling' model selection as developed for the statistical analysis of a rotor burst protection experiment

    NASA Technical Reports Server (NTRS)

    Holms, A. G.

    1977-01-01

    A statistical decision procedure called chain pooling had been developed for model selection in fitting the results of a two-level fixed-effects full or fractional factorial experiment not having replication. The basic strategy included the use of one nominal level of significance for a preliminary test and a second nominal level of significance for the final test. The subject has been reexamined from the point of view of using as many as three successive statistical model deletion procedures in fitting the results of a single experiment. The investigation consisted of random number studies intended to simulate the results of a proposed aircraft turbine-engine rotor-burst-protection experiment. As a conservative approach, population model coefficients were chosen to represent a saturated 2⁴ experiment with a distribution of parameter values unfavorable to the decision procedures. Three model selection strategies were developed.

  6. A study of correlations between crude oil spot and futures markets: A rolling sample test

    NASA Astrophysics Data System (ADS)

    Liu, Li; Wan, Jieqiu

    2011-10-01

    In this article, we investigate the asymmetries of exceedance correlations and cross-correlations between West Texas Intermediate (WTI) spot and futures markets. First, employing the test statistic proposed by Hong et al. [Asymmetries in stock returns: statistical tests and economic evaluation, Review of Financial Studies 20 (2007) 1547-1581], we find that the exceedance correlations were overall symmetric. However, results from rolling windows show that some occasional events could induce significant asymmetries in the exceedance correlations. Second, employing the test statistic proposed by Podobnik et al. [Quantifying cross-correlations using local and global detrending approaches, European Physical Journal B 71 (2009) 243-250], we find that the cross-correlations were significant even for large lag orders. Using the detrended cross-correlation analysis proposed by Podobnik and Stanley [Detrended cross-correlation analysis: a new method for analyzing two nonstationary time series, Physical Review Letters 100 (2008) 084102], we find that the cross-correlations were weakly persistent and were stronger between the spot and futures contracts with larger maturity. Our results from the rolling sample test also show the apparent effects of exogenous events. Additionally, we offer some relevant discussion of the obtained evidence.
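
    As a sketch of the detrended cross-correlation analysis (DCCA) step, the fluctuation function F_DCCA(n) can be computed as below; for brevity this uses non-overlapping windows, whereas the original method slides a window across the profiles. The slope of log F_DCCA versus log n estimates the cross-correlation scaling exponent, with a slope near 0.5 indicating weak persistence.

```python
import numpy as np

def dcca_fluctuation(x, y, n):
    """Detrended cross-correlation fluctuation F_DCCA(n) at window size n.
    Profiles are built by cumulative summation; a linear trend is removed
    in each (non-overlapping) window and the detrended covariances are
    averaged across windows."""
    X = np.cumsum(x - np.mean(x))
    Y = np.cumsum(y - np.mean(y))
    t = np.arange(n)
    covs = []
    for i in range(len(X) // n):
        s = slice(i * n, (i + 1) * n)
        Xd = X[s] - np.polyval(np.polyfit(t, X[s], 1), t)
        Yd = Y[s] - np.polyval(np.polyfit(t, Y[s], 1), t)
        covs.append(np.mean(Xd * Yd))
    return np.sqrt(np.abs(np.mean(covs)))

# Scaling exponent: fit log F_DCCA(n) against log n over a range of n.
```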

  7. Meta-analysis of diagnostic test data: a bivariate Bayesian modeling approach.

    PubMed

    Verde, Pablo E

    2010-12-30

    In the last decades, the amount of published results on clinical diagnostic tests has expanded very rapidly. The counterpart to this development has been the formal evaluation and synthesis of diagnostic results. However, published results present substantial heterogeneity and they can be regarded as so far removed from the classical domain of meta-analysis, that they can provide a rather severe test of classical statistical methods. Recently, bivariate random effects meta-analytic methods, which model the pairs of sensitivities and specificities, have been presented from the classical point of view. In this work a bivariate Bayesian modeling approach is presented. This approach substantially extends the scope of classical bivariate methods by allowing the structural distribution of the random effects to depend on multiple sources of variability. Meta-analysis is summarized by the predictive posterior distributions for sensitivity and specificity. This new approach allows, also, to perform substantial model checking, model diagnostic and model selection. Statistical computations are implemented in the public domain statistical software (WinBUGS and R) and illustrated with real data examples. Copyright © 2010 John Wiley & Sons, Ltd.

  8. A comparison of likelihood ratio tests and Rao's score test for three separable covariance matrix structures.

    PubMed

    Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha

    2017-01-01

    The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Biostatistics Series Module 2: Overview of Hypothesis Testing.

    PubMed

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Hypothesis testing (or statistical inference) is one of the major applications of biostatistics. Much of medical research begins with a research question that can be framed as a hypothesis. Inferential statistics begins with a null hypothesis that reflects the conservative position of no change or no difference in comparison to baseline or between groups. Usually, the researcher has reason to believe that there is some effect or some difference which is the alternative hypothesis. The researcher therefore proceeds to study samples and measure outcomes in the hope of generating evidence strong enough for the statistician to be able to reject the null hypothesis. The concept of the P value is almost universally used in hypothesis testing. It denotes the probability of obtaining by chance a result at least as extreme as that observed, even when the null hypothesis is true and no real difference exists. Usually, if P is < 0.05 the null hypothesis is rejected and sample results are deemed statistically significant. With the increasing availability of computers and access to specialized statistical software, the drudgery involved in statistical calculations is now a thing of the past, once the learning curve of the software has been traversed. The life sciences researcher is therefore free to devote oneself to optimally designing the study, carefully selecting the hypothesis tests to be applied, and taking care in conducting the study well. Unfortunately, selecting the right test seems difficult initially. Thinking of the research hypothesis as addressing one of five generic research questions helps in selection of the right hypothesis test. In addition, it is important to be clear about the nature of the variables (e.g., numerical vs. categorical; parametric vs. nonparametric) and the number of groups or data sets being compared (e.g., two or more than two) at a time. The same research question may be explored by more than one type of hypothesis test. While this may be of utility in highlighting different aspects of the problem, merely reapplying different tests to the same issue in the hope of finding a P < 0.05 is a wrong use of statistics. Finally, it is becoming the norm that an estimate of the size of any effect, expressed with its 95% confidence interval, is required for meaningful interpretation of results. A large study is likely to have a small (and therefore "statistically significant") P value, but a "real" estimate of the effect would be provided by the 95% confidence interval. If the intervals overlap between two interventions, then the difference between them is not so clear-cut even if P < 0.05. The two approaches are now considered complementary to one another.

  10. Biostatistics Series Module 2: Overview of Hypothesis Testing

    PubMed Central

    Hazra, Avijit; Gogtay, Nithya

    2016-01-01

    Hypothesis testing (or statistical inference) is one of the major applications of biostatistics. Much of medical research begins with a research question that can be framed as a hypothesis. Inferential statistics begins with a null hypothesis that reflects the conservative position of no change or no difference in comparison to baseline or between groups. Usually, the researcher has reason to believe that there is some effect or some difference which is the alternative hypothesis. The researcher therefore proceeds to study samples and measure outcomes in the hope of generating evidence strong enough for the statistician to be able to reject the null hypothesis. The concept of the P value is almost universally used in hypothesis testing. It denotes the probability of obtaining by chance a result at least as extreme as that observed, even when the null hypothesis is true and no real difference exists. Usually, if P is < 0.05 the null hypothesis is rejected and sample results are deemed statistically significant. With the increasing availability of computers and access to specialized statistical software, the drudgery involved in statistical calculations is now a thing of the past, once the learning curve of the software has been traversed. The life sciences researcher is therefore free to devote oneself to optimally designing the study, carefully selecting the hypothesis tests to be applied, and taking care in conducting the study well. Unfortunately, selecting the right test seems difficult initially. Thinking of the research hypothesis as addressing one of five generic research questions helps in selection of the right hypothesis test. In addition, it is important to be clear about the nature of the variables (e.g., numerical vs. categorical; parametric vs. nonparametric) and the number of groups or data sets being compared (e.g., two or more than two) at a time. The same research question may be explored by more than one type of hypothesis test. While this may be of utility in highlighting different aspects of the problem, merely reapplying different tests to the same issue in the hope of finding a P < 0.05 is a wrong use of statistics. Finally, it is becoming the norm that an estimate of the size of any effect, expressed with its 95% confidence interval, is required for meaningful interpretation of results. A large study is likely to have a small (and therefore “statistically significant”) P value, but a “real” estimate of the effect would be provided by the 95% confidence interval. If the intervals overlap between two interventions, then the difference between them is not so clear-cut even if P < 0.05. The two approaches are now considered complementary to one another. PMID:27057011
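
    The module's closing point — that a P value and a 95% confidence interval are complementary — is easy to make concrete. A small Python sketch with simulated data (all values are placeholders):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
a = rng.normal(10.0, 2.0, 40)   # e.g., outcome under treatment A
b = rng.normal(11.0, 2.0, 40)   # e.g., outcome under treatment B

t, p = stats.ttest_ind(a, b)    # the P value alone says nothing about size

# 95% confidence interval for the mean difference (equal-variance form):
diff = b.mean() - a.mean()
n1, n2 = len(a), len(b)
sp = np.sqrt(((n1-1)*a.var(ddof=1) + (n2-1)*b.var(ddof=1)) / (n1 + n2 - 2))
se = sp * np.sqrt(1/n1 + 1/n2)
half = stats.t.ppf(0.975, n1 + n2 - 2) * se
print(f"p = {p:.4f}, difference = {diff:.2f} "
      f"(95% CI {diff - half:.2f} to {diff + half:.2f})")
```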

  11. Initial experience with non-invasive prenatal testing of cell-free DNA for major chromosomal anomalies in a clinical setting.

    PubMed

    Comas, Carmina; Echevarria, Mónica; Rodríguez, M Angeles; Prats, Pilar; Rodríguez, Ignacio; Serra, Bernat

    2015-07-01

    To evaluate non-invasive prenatal testing (NIPT) of cell-free DNA (cfDNA) as a screening method for major chromosomal anomalies (CA) in a clinical setting. From January to December 2013, Panorama™ test or Harmony™ prenatal test were offered as advanced NIPT, in addition to first-trimester combined screening in singleton pregnancies. The cohort included 333 pregnant women with a mean maternal age (MA) of 37 years who underwent testing at a mean gestational age of 14.6 weeks. Eighty-four percent were low-risk pregnancies. Results were provided in 97.3% of patients at a mean reporting time of 12.9 calendar days. Repeat sampling was performed in six cases and results were obtained in five of them. No results were provided in four cases. Four cases of Down syndrome were detected and there was one discordant result of Turner syndrome. We found no statistical differences between commercial tests except in reporting time, fetal fraction and MA. The cfDNA fraction was statistically associated with test type, maternal weight, BMI and log βhCG levels. NIPT has the potential to be a highly effective screening method for major CA in a clinical setting.

  12. Using Linguistic Knowledge in Statistical Machine Translation

    DTIC Science & Technology

    2010-09-01

    [Abstract not extracted: the record text consists of list-of-tables fragments, including Arabic-to-English MT results for Arabic morphological segmentation measured on newswire and web test data, recombination results (percentage of sentences with mis-combined words), scores for syntactic reordering in the spoken language domain, and normalized likelihoods of test-set alignments with and without decision trees.]

  13. Accelerated test program

    NASA Technical Reports Server (NTRS)

    Ford, F. E.; Harkness, J. M.

    1977-01-01

    A brief discussion of the accelerated testing of batteries is given. The statistical analysis, various aspects of the modeling that was done, and the results obtained from the model are also briefly discussed.

  14. Statistical analysis of Skylab 3. [endocrine/metabolic studies of astronauts

    NASA Technical Reports Server (NTRS)

    Johnston, D. A.

    1974-01-01

    The results of endocrine/metabolic studies of astronauts on Skylab 3 are reported. One-way analysis of variance, contrasts, two-way unbalanced analysis of variance, and analysis of periodic changes in flight are included. Results for blood and urine tests are presented.

  15. Adhesive properties and adhesive joints strength of graphite/epoxy composites

    NASA Astrophysics Data System (ADS)

    Rudawska, Anna; Stančeková, Dana; Cubonova, Nadezda; Vitenko, Tetiana; Müller, Miroslav; Valášek, Petr

    2017-05-01

    The article presents the results of experimental research on the strength of adhesive joints of graphite/epoxy composites, together with measurements of the surface free energy of the composite surfaces. Two types of graphite/epoxy composites of different thicknesses, used in aircraft structures, were tested. Single-lap adhesive joints of the epoxy composites were considered. Adhesive properties were described by the surface free energy, determined with the Owens-Wendt method. A two-component epoxy adhesive was used to prepare the adhesive joints, and a Zwick/Roell 100 testing machine was used to determine the shear strength of the joints. The strength tests showed that the highest value was obtained for adhesive joints of the graphite/epoxy composite with the smaller material thickness (0.48 mm). Statistical analysis showed statistically significant differences between the strength values at the 0.95 confidence level, but no statistically significant differences between the average values of surface free energy (0.95 confidence level). It was noted that in each of the results the dispersive component of the surface free energy was much greater than the polar component.

  16. Data-driven inference for the spatial scan statistic.

    PubMed

    Almeida, Alexandre C L; Duarte, Anderson R; Duczmal, Luiz H; Oliveira, Fernando L P; Takahashi, Ricardo H C

    2011-08-02

    Kulldorff's spatial scan statistic for aggregated area maps searches for clusters of cases without specifying their size (number of areas) or geographic location in advance. Their statistical significance is tested while adjusting for the multiple testing inherent in such a procedure. However, as is shown in this work, this adjustment is not done in an even manner for all possible cluster sizes. A modification is proposed to the usual inference test of the spatial scan statistic, incorporating additional information about the size of the most likely cluster found. A new interpretation of the results of the spatial scan statistic is proposed, posing a modified inference question: what is the probability that the null hypothesis is rejected for the original observed cases map with a most likely cluster of size k, taking into account only those most likely clusters of size k found under the null hypothesis for comparison? This question is especially important when the p-value computed by the usual inference process is near the alpha significance level, regarding the correctness of the decision based on this inference. A practical procedure is provided to make more accurate inferences about the most likely cluster found by the spatial scan statistic.
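
    A rough sketch of the modified inference, assuming Kulldorff's Poisson likelihood-ratio statistic and candidate zones supplied as index sets; the size-conditional Monte Carlo p-value below is one plausible reading of the paper's proposal, not its exact procedure.

```python
import numpy as np

def most_likely_cluster(cases, expected, zones):
    """Return (best LLR, size of best zone) over candidate zones (each a
    proper-subset index array). `expected` is scaled so that
    expected.sum() == cases.sum()."""
    C = cases.sum()
    best, size = 0.0, 0
    for z in zones:
        c, e = cases[z].sum(), expected[z].sum()
        if c > e and 0 < c < C:            # Kulldorff Poisson LLR
            llr = c * np.log(c / e) + (C - c) * np.log((C - c) / (C - e))
            if llr > best:
                best, size = llr, len(z)
    return best, size

def size_conditional_pvalue(cases, expected, zones, n_rep=999, seed=0):
    """Monte Carlo p-value comparing the observed most likely cluster only
    against replications whose most likely cluster has the same size k."""
    rng = np.random.default_rng(seed)
    obs_llr, obs_k = most_likely_cluster(cases, expected, zones)
    C = cases.sum()
    p = expected / expected.sum()
    same_k, exceed = 0, 0
    for _ in range(n_rep):
        sim = rng.multinomial(C, p)        # allocate cases under H0
        llr, k = most_likely_cluster(sim, expected, zones)
        if k == obs_k:
            same_k += 1
            exceed += llr >= obs_llr
    return (exceed + 1) / (same_k + 1)
```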

  17. Reproducibility-optimized test statistic for ranking genes in microarray studies.

    PubMed

    Elo, Laura L; Filén, Sanna; Lahesmaa, Riitta; Aittokallio, Tero

    2008-01-01

    A principal goal of microarray studies is to identify the genes showing differential expression under distinct conditions. In such studies, the selection of an optimal test statistic is a crucial challenge, which depends on the type and amount of data under analysis. While previous studies on simulated or spike-in datasets do not provide practical guidance on how to choose the best method for a given real dataset, we introduce an enhanced reproducibility-optimization procedure, which enables the selection of a suitable gene-ranking statistic directly from the data. In comparison with existing ranking methods, the reproducibility-optimized statistic shows good performance consistently under various simulated conditions and on the Affymetrix spike-in dataset. Further, the feasibility of the novel statistic is confirmed in a practical research setting using data from an in-house cDNA microarray study of asthma-related gene expression changes. These results suggest that the procedure facilitates the selection of an appropriate test statistic for a given dataset without relying on a priori assumptions, which may bias the findings and their interpretation. Moreover, the general reproducibility-optimization procedure is not limited to detecting differential expression only but could be extended to a wide range of other applications as well.

  18. North American Contact Dermatitis Group patch test results: 2009 to 2010.

    PubMed

    Warshaw, Erin M; Belsito, Donald V; Taylor, James S; Sasseville, Denis; DeKoven, Joel G; Zirwas, Matthew J; Fransway, Anthony F; Mathias, C G Toby; Zug, Kathryn A; DeLeo, Vincent A; Fowler, Joseph F; Marks, James G; Pratt, Melanie D; Storrs, Frances J; Maibach, Howard I

    2013-01-01

    Patch testing is an important diagnostic tool for determination of substances responsible for allergic contact dermatitis. This study reports the North American Contact Dermatitis Group (NACDG) patch testing results from January 1, 2009, to December 31, 2010. At 12 centers in North America, patients were tested in a standardized manner with a screening series of 70 allergens. Data were manually verified and entered into a central database. Descriptive frequencies were calculated, and trends were analyzed using χ² statistics. A total of 4308 patients were tested. Of these, 2614 (60.7%) had at least 1 positive reaction, and 2284 (46.3%) were ultimately determined to have a primary diagnosis of allergic contact dermatitis. Four hundred twenty-seven (9.9%) patients had occupationally related skin disease. There were 6855 positive allergic reactions. As compared with the previous reporting period (2007-2008), the positive reaction rates statistically decreased for 20 allergens (nickel, neomycin, Myroxylon pereirae, cobalt, formaldehyde, quaternium 15, methyldibromoglutaronitrile/phenoxyethanol, methylchloroisothiazolinone/methylisothiazolinone, potassium dichromate, diazolidinyl urea, propolis, dimethylol dimethylhydantoin, 2-bromo-2-nitro-1,3-propanediol, methyl methacrylate, ethyl acrylate, glyceryl thioglycolate, dibucaine, amidoamine, clobetasol, and dimethyloldihydroxyethyleneurea; P < 0.05) and statistically increased for 4 allergens (fragrance mix II, iodopropynyl butylcarbamate, propylene glycol, and benzocaine; P < 0.05). Approximately one quarter of tested patients had at least 1 relevant allergic reaction to a non-NACDG allergen. Hypothetically, approximately one quarter of reactions detected by NACDG allergens would have been missed by TRUE TEST (SmartPractice Denmark, Hillerød, Denmark). These results affirm the value of patch testing with many allergens.

  19. Cognitive Complaints After Breast Cancer Treatments: Examining the Relationship With Neuropsychological Test Performance

    PubMed Central

    2013-01-01

    Background Cognitive complaints are reported frequently after breast cancer treatments. Their association with neuropsychological (NP) test performance is not well-established. Methods Early-stage, posttreatment breast cancer patients were enrolled in a prospective, longitudinal, cohort study prior to starting endocrine therapy. Evaluation included an NP test battery and self-report questionnaires assessing symptoms, including cognitive complaints. Multivariable regression models assessed associations among cognitive complaints, mood, treatment exposures, and NP test performance. Results One hundred eighty-nine breast cancer patients, aged 21–65 years, completed the evaluation; 23.3% endorsed higher memory complaints and 19.0% reported higher executive function complaints (>1 SD above the mean for healthy control sample). Regression modeling demonstrated a statistically significant association of higher memory complaints with combined chemotherapy and radiation treatments (P = .01), poorer NP verbal memory performance (P = .02), and higher depressive symptoms (P < .001), controlling for age and IQ. For executive functioning complaints, multivariable modeling controlling for age, IQ, and other confounds demonstrated statistically significant associations with better NP visual memory performance (P = .03) and higher depressive symptoms (P < .001), whereas combined chemotherapy and radiation treatment (P = .05) approached statistical significance. Conclusions About one in five post–adjuvant treatment breast cancer patients had elevated memory and/or executive function complaints that were statistically significantly associated with domain-specific NP test performances and depressive symptoms; combined chemotherapy and radiation treatment was also statistically significantly associated with memory complaints. These results and other emerging studies suggest that subjective cognitive complaints in part reflect objective NP performance, although their etiology and biology appear to be multifactorial, motivating further transdisciplinary research. PMID:23606729

  20. Radiographic comparison of different concentrations of recombinant human bone morphogenetic protein with allogenic bone compared with the use of 100% mineralized cancellous bone allograft in maxillary sinus grafting.

    PubMed

    Froum, Stuart J; Wallace, Stephen; Cho, Sang-Choon; Khouly, Ismael; Rosenberg, Edwin; Corby, Patricia; Froum, Scott; Mascarenhas, Patrick; Tarnow, Dennis P

    2014-01-01

    The purpose of this study was to radiographically evaluate, then analyze, bone height, volume, and density with reference to percentage of vital bone after maxillary sinuses were grafted using two different doses of recombinant human bone morphogenetic protein 2/acellular collagen sponge (rhBMP-2/ACS) combined with mineralized cancellous bone allograft (MCBA) and a control sinus grafted with MCBA only. A total of 18 patients (36 sinuses) were used for analysis of height and volume measurements, having two of three graft combinations (one in each sinus): (1) control, MCBA only; (2) test 1, MCBA + 5.6 mL of rhBMP-2/ACS (containing 8.4 mg of rhBMP-2); and (3) test 2, MCBA + 2.8 mL of rhBMP-2/ACS (containing 4.2 mg of rhBMP-2). The study was completed with 16 patients who also had bilateral cores removed 6 to 9 months following sinus augmentation. A computer software system was used to evaluate 36 computed tomography scans. Two time points were selected for measurements of height. The results indicated that the height of the grafted sinus was significantly greater in the treatment groups compared with the control. However, by the second time point, there were no statistically significant differences. Three weeks post-surgery, bone volume measurements showed similar statistically significant differences between the test groups and the control. However, prior to core removal, test group 1, with the greater dose of rhBMP-2, showed a statistically significant greater increase compared with test group 2 and the control. There was no statistically significant difference between the latter two groups. All three groups had similar volume and shrinkage. Density measurements varied from the above results, with the control showing statistically significant greater density at both time points. By contrast, the density increase over time in both rhBMP groups was similar and statistically higher than in the control group. There were strong associations between height and volume in all groups, and between volume and new vital bone only in the control group. There were no statistically significant relationships observed between height and bone density or between volume and bone density for any parameter measured. More cases and monitoring of the future survival of implants placed in these augmented sinuses are needed to verify these results.

  1. Sound texture perception via statistics of the auditory periphery: Evidence from sound synthesis

    PubMed Central

    McDermott, Josh H.; Simoncelli, Eero P.

    2014-01-01

    Rainstorms, insect swarms, and galloping horses produce “sound textures” – the collective result of many similar acoustic events. Sound textures are distinguished by temporal homogeneity, suggesting they could be recognized with time-averaged statistics. To test this hypothesis, we processed real-world textures with an auditory model containing filters tuned for sound frequencies and their modulations, and measured statistics of the resulting decomposition. We then assessed the realism and recognizability of novel sounds synthesized to have matching statistics. Statistics of individual frequency channels, capturing spectral power and sparsity, generally failed to produce compelling synthetic textures. However, combining them with correlations between channels produced identifiable and natural-sounding textures. Synthesis quality declined if statistics were computed from biologically implausible auditory models. The results suggest that sound texture perception is mediated by relatively simple statistics of early auditory representations, presumably computed by downstream neural populations. The synthesis methodology offers a powerful tool for their further investigation. PMID:21903084

  2. The breaking load method - Results and statistical modification from the ASTM interlaboratory test program

    NASA Technical Reports Server (NTRS)

    Colvin, E. L.; Emptage, M. R.

    1992-01-01

    The breaking load test provides quantitative stress corrosion cracking data by determining the residual strength of tension specimens that have been exposed to corrosive environments. Eight laboratories have participated in a cooperative test program under the auspices of ASTM Committee G-1 to evaluate the new test method. All eight laboratories were able to distinguish between three tempers of aluminum alloy 7075. The statistical analysis procedures that were used in the test program do not work well in all situations. An alternative procedure using Box-Cox transformations shows a great deal of promise. An ASTM standard method has been drafted which incorporates the Box-Cox procedure.
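
    As a loose illustration of the Box-Cox step mentioned above, the sketch below transforms invented residual-strength data for two hypothetical tempers and then applies an ordinary two-sample test; the values, variable names, and the follow-up Welch test are illustrative assumptions, not the ASTM procedure itself.

      # A minimal Box-Cox sketch (Python/SciPy); all data are invented.
      import numpy as np
      from scipy import stats

      temper_a = np.array([62.1, 60.8, 63.4, 61.9, 62.7, 60.2, 63.0, 61.5])
      temper_b = np.array([48.3, 52.9, 41.7, 55.2, 38.9, 50.1, 44.6, 47.8])

      # Estimate a single lambda from the pooled (positive) data by maximum
      # likelihood, then apply the same transformation to both groups.
      _, lam = stats.boxcox(np.concatenate([temper_a, temper_b]))
      a_bc = stats.boxcox(temper_a, lmbda=lam)
      b_bc = stats.boxcox(temper_b, lmbda=lam)

      # A standard two-sample test is more defensible on the transformed scale.
      t_stat, p = stats.ttest_ind(a_bc, b_bc, equal_var=False)
      print(f"lambda = {lam:.2f}, Welch t = {t_stat:.2f}, p = {p:.4f}")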

  3. Application of modified profile analysis to function testing of the motion/no-motion issue in an aircraft ground-handling simulation. [statistical analysis procedure for man machine systems flight simulation

    NASA Technical Reports Server (NTRS)

    Parrish, R. V.; Mckissick, B. T.; Steinmetz, G. G.

    1979-01-01

    A recent modification of the methodology of profile analysis, which allows testing for differences between two functions as a whole with a single test rather than point by point with multiple tests, is discussed. The modification is applied to the examination of the issue of motion/no-motion conditions as shown by the lateral deviation curve as a function of engine cut speed of a piloted 737-100 simulator. The results of this application are presented along with those of more conventional statistical test procedures on the same simulator data.

  4. EXTENDING MULTIVARIATE DISTANCE MATRIX REGRESSION WITH AN EFFECT SIZE MEASURE AND THE ASYMPTOTIC NULL DISTRIBUTION OF THE TEST STATISTIC

    PubMed Central

    McArtor, Daniel B.; Lubke, Gitta H.; Bergeman, C. S.

    2017-01-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains. PMID:27738957

  5. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    PubMed

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.

  6. Comparison of drug-induced sleep endoscopy and Müller's maneuver in diagnosing obstructive sleep apnea using the VOTE classification system.

    PubMed

    Yeğin, Yakup; Çelik, Mustafa; Kaya, Kamil Hakan; Koç, Arzu Karaman; Kayhan, Fatma Tülin

    Knowledge of the site of obstruction and the pattern of airway collapse is essential for determining correct surgical and medical management of patients with Obstructive Sleep Apnea Syndrome (OSAS). To this end, several diagnostic tests and procedures have been developed. To determine whether drug-induced sleep endoscopy (DISE) or Müller's maneuver (MM) would be more successful at identifying the site of obstruction and the pattern of upper airway collapse in patients with OSAS. The study included 63 patients (52 male and 11 female) who were diagnosed with OSAS at our clinic. Ages ranged from 30 to 66 years, and the average age was 48.5 years. All patients underwent DISE and MM, and the results of these examinations were characterized according to the region/degree of obstruction as well as the VOTE classification. The results of each test were analyzed per upper airway level and compared using statistical analysis (Cohen's kappa statistic). There was statistically significant concordance between the results from DISE and MM for procedures involving the anteroposterior (73%), lateral (92.1%), and concentric (74.6%) configurations of the velum. Results from the lateral part of the oropharynx were also in concordance between the tests (58.7%), as were results from the lateral configuration of the epiglottis (87.3%). There was no statistically significant concordance between the two examinations for procedures involving the anteroposterior configuration of the tongue (23.8%) and epiglottis (42.9%). We suggest that DISE has several advantages, including safety, ease of use, and reliability, that make it superior to MM for diagnosing sites of obstruction and the pattern of upper airway collapse, although MM can still provide some knowledge of the pattern of pharyngeal collapse. Furthermore, we also recommend using the VOTE classification in combination with DISE. Copyright © 2016 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.

  7. Congruence analysis of geodetic networks - hypothesis tests versus model selection by information criteria

    NASA Astrophysics Data System (ADS)

    Lehmann, Rüdiger; Lösler, Michael

    2017-12-01

    Geodetic deformation analysis can be interpreted as a model selection problem. The null model indicates that no deformation has occurred. It is opposed to a number of alternative models, which stipulate different deformation patterns. A common way to select the right model is to use a statistical hypothesis test. However, since we have to test a series of deformation patterns, this must be a multiple test. As an alternative solution for the test problem, we propose the p-value approach. Another approach arises from information theory. Here, the Akaike information criterion (AIC) or some alternative is used to select an appropriate model for a given set of observations. Both approaches are discussed and applied to two test scenarios: a synthetic levelling network and the Delft test data set. It is demonstrated that they work but behave differently, sometimes even producing different results. Hypothesis tests are well established in geodesy, but may suffer from an unfavourable choice of the decision error rates. The multiple test also suffers from statistical dependencies between the test statistics, which are neglected. Both problems are overcome by applying information criteria such as the AIC.
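
    For readers unfamiliar with the information-theoretic route, here is a minimal sketch of AIC-based selection between a "no deformation" null model and a one-parameter shift alternative, on invented epoch-difference data; the models, parameter counts, and numbers are illustrative assumptions rather than the paper's scenarios.

      # AIC model selection sketch (Python/NumPy); data are invented.
      import numpy as np

      rng = np.random.default_rng(1)
      diffs = rng.normal(loc=2.0, scale=1.0, size=12)  # epoch 2 - epoch 1 (mm)

      def gauss_aic(residuals, k_params):
          """AIC = 2k - 2 ln L for a Gaussian model with ML variance."""
          n = len(residuals)
          sigma2 = np.mean(residuals**2)
          loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
          return 2 * k_params - 2 * loglik

      aic_null = gauss_aic(diffs, k_params=1)                # variance only
      aic_alt = gauss_aic(diffs - diffs.mean(), k_params=2)  # shift + variance
      print(f"AIC null: {aic_null:.1f}, AIC alternative: {aic_alt:.1f}")
      # The model with the smaller AIC is preferred; no decision error
      # rate has to be chosen, unlike in the hypothesis-test approach.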

  8. Explorations in Statistics: Hypothesis Tests and P Values

    ERIC Educational Resources Information Center

    Curran-Everett, Douglas

    2009-01-01

    Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This second installment of "Explorations in Statistics" delves into test statistics and P values, two concepts fundamental to the test of a scientific null hypothesis. The essence of a test statistic is that it compares what…

  9. A statistical assessment of differences and equivalences between genetically modified and reference plant varieties

    PubMed Central

    2011-01-01

    Background: Safety assessment of genetically modified organisms is currently often performed by comparative evaluation. However, natural variation of plant characteristics between commercial varieties is usually not considered explicitly in the statistical computations underlying the assessment. Results: Statistical methods are described for the assessment of the difference between a genetically modified (GM) plant variety and a conventional non-GM counterpart, and for the assessment of the equivalence between the GM variety and a group of reference plant varieties which have a history of safe use. It is proposed to present the results of both difference and equivalence testing for all relevant plant characteristics simultaneously in one or a few graphs, as an aid for further interpretation in safety assessment. A procedure is suggested to derive equivalence limits from the observed results for the reference plant varieties using a specific implementation of the linear mixed model. Three different equivalence tests are defined to classify any result in one of four equivalence classes. The performance of the proposed methods is investigated by a simulation study, and the methods are illustrated on compositional data from a field study on maize grain. Conclusions: A clear distinction of practical relevance is shown between difference and equivalence testing. The proposed tests are shown to have appropriate performance characteristics by simulation, and the proposed simultaneous graphical representation of results was found to be helpful for the interpretation of results from a practical field trial data set. PMID:21324199
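
    The distinction between difference and equivalence testing can be made concrete with a two-one-sided-tests (TOST) sketch on invented compositional values and assumed equivalence limits; the paper derives its limits from the reference varieties via a linear mixed model, which is not reproduced here.

      # Difference vs equivalence (TOST) sketch; all numbers are invented.
      import numpy as np
      from scipy import stats

      gm = np.array([9.8, 10.1, 9.6, 10.4, 9.9, 10.2])       # GM variety
      conv = np.array([10.6, 10.9, 10.3, 11.0, 10.7, 10.5])  # counterpart
      low, high = -1.5, 1.5  # assumed equivalence limits

      # Difference test: ordinary two-sample t test.
      t_diff, p_diff = stats.ttest_ind(gm, conv)

      # Equivalence test: two one-sided t tests against each limit.
      n1, n2 = len(gm), len(conv)
      df = n1 + n2 - 2
      sp2 = ((n1 - 1) * gm.var(ddof=1) + (n2 - 1) * conv.var(ddof=1)) / df
      se = np.sqrt(sp2 * (1 / n1 + 1 / n2))
      diff = gm.mean() - conv.mean()
      p_tost = max(stats.t.sf((diff - low) / se, df),    # H0: diff <= low
                   stats.t.cdf((diff - high) / se, df))  # H0: diff >= high
      print(f"difference p = {p_diff:.4f}, equivalence (TOST) p = {p_tost:.4f}")
      # Here the varieties differ significantly yet remain equivalent within
      # the assumed limits, the practical distinction the paper emphasizes.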

  10. [Therapy of organic brain syndrome with nicergoline given once a day].

    PubMed

    Ladurner, G; Erhart, P; Erhart, C; Scheiber, V

    1991-01-01

    In a double-blind, active-controlled study, 30 patients with mild to moderate multi-infarct dementia diagnosed according to the DSM-III definition were treated with either 20 mg nicergoline or 4.5 mg co-dergocrine mesilate once daily for eight weeks. Therapeutic effects on symptoms of the organic brain syndrome were quantitatively measured by standardized psychological and psychometric methods evaluating cognitive and thymopsychic functions. Main criteria, which were tested by inferential analysis, were the SCAG total score (Sandoz Clinical Assessment Geriatric Scale), the SCAG overall impression and the AD Test (alphabetischer Durchstreichtest). Other results were assessed by descriptive statistics. Both treatments resulted in a statistically significant improvement in most of the tested functions. The effects of 4.5 mg co-dergocrine mesilate once daily were in accordance with published results. Although differing slightly with respect to individual results, 20 mg of nicergoline once daily showed the same overall efficacy.

  11. OSPAR standard method and software for statistical analysis of beach litter data.

    PubMed

    Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit

    2017-09-15

    The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software tool for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
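
    The two trend tools named above are available in SciPy, with the caveat that the Mann-Kendall test is approximated here through Kendall's tau against time; the annual counts below are invented for illustration, not OSPAR survey data.

      # Trend-test sketch on an invented beach-litter series.
      import numpy as np
      from scipy import stats

      years = np.arange(2009, 2015)
      counts = np.array([182, 170, 155, 161, 140, 128])  # items per survey

      # Mann-Kendall-style monotonic trend test via Kendall's tau vs time.
      tau, p_mk = stats.kendalltau(years, counts)

      # Theil-Sen slope: median of pairwise slopes, robust to outliers.
      slope, intercept, lo, hi = stats.theilslopes(counts, years, alpha=0.95)
      print(f"tau = {tau:.2f} (p = {p_mk:.3f}); slope = {slope:.1f} items/yr "
            f"[95% CI {lo:.1f}, {hi:.1f}]")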

  12. Harnessing Multivariate Statistics for Ellipsoidal Data in Structural Geology

    NASA Astrophysics Data System (ADS)

    Roberts, N.; Davis, J. R.; Titus, S.; Tikoff, B.

    2015-12-01

    Most structural geology articles do not state significance levels, report confidence intervals, or perform regressions to find trends. This is, in part, because structural data tend to include directions, orientations, ellipsoids, and tensors, which are not treatable by elementary statistics. We describe a full procedural methodology for the statistical treatment of ellipsoidal data. We use a reconstructed dataset of deformed ooids in Maryland from Cloos (1947) to illustrate the process. Normalized ellipsoids have five degrees of freedom and can be represented by a second order tensor. This tensor can be permuted into a five dimensional vector that belongs to a vector space and can be treated with standard multivariate statistics. Cloos made several claims about the distribution of deformation in the South Mountain fold, Maryland, and we reexamine two particular claims using hypothesis testing: 1) octahedral shear strain increases towards the axial plane of the fold; 2) finite strain orientation varies systematically along the trend of the axial trace as it bends with the Appalachian orogen. We then test the null hypothesis that the southern segment of South Mountain is the same as the northern segment. This test illustrates the application of ellipsoidal statistics, which combine both orientation and shape. We report confidence intervals for each test, and graphically display our results with novel plots. This poster illustrates the importance of statistics in structural geology, especially when working with noisy or small datasets.

  13. Sample Size and Statistical Conclusions from Tests of Fit to the Rasch Model According to the Rasch Unidimensional Measurement Model (Rumm) Program in Health Outcome Measurement.

    PubMed

    Hagell, Peter; Westergren, Albert

    Sample size is a major factor in statistical null hypothesis testing, which is the basis for many approaches to testing Rasch model fit. Few sample size recommendations for testing fit to the Rasch model concern the Rasch Unidimensional Measurement Models (RUMM) software, which features chi-square and ANOVA/F-ratio based fit statistics, including Bonferroni and algebraic sample size adjustments. This paper explores the occurrence of Type I errors with RUMM fit statistics, and the effects of algebraic sample size adjustments. Simulated Rasch-model-fitting data for 25-item dichotomous scales, with sample sizes ranging from N = 50 to N = 2500, were analysed with and without algebraically adjusted sample sizes. Results suggest the occurrence of Type I errors with N ≤ 500, and that Bonferroni correction as well as downward algebraic sample size adjustment are useful to avoid such errors, whereas upward adjustment of smaller samples falsely signals misfit. Our observations suggest that sample sizes around N = 250 to N = 500 may provide a good balance for the statistical interpretation of the RUMM fit statistics studied here with respect to Type I errors and under the assumption of Rasch model fit within the examined frame of reference (i.e., about 25 item parameters well targeted to the sample).

  14. Statistical analysis of Thematic Mapper Simulator data for the geobotanical discrimination of rock types in southwest Oregon

    NASA Technical Reports Server (NTRS)

    Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.

    1984-01-01

    An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.

  15. STATISTICAL ANALYSIS OF SNAP 10A THERMOELECTRIC CONVERTER ELEMENT PROCESS DEVELOPMENT VARIABLES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fitch, S.H.; Morris, J.W.

    1962-12-15

    Statistical analysis, primarily analysis of variance, was applied to evaluate several factors involved in the development of suitable fabrication and processing techniques for the production of lead telluride thermoelectric elements for the SNAP 10A energy conversion system. The analysis methods are described as to their application for determining the effects of various processing steps, establishing the value of individual operations, and evaluating the significance of test results. The elimination of unnecessary or detrimental processing steps was accomplished and the number of required tests was substantially reduced by application of these statistical methods to the SNAP 10A production development effort. (auth)

  16. Preparing for the first meeting with a statistician.

    PubMed

    De Muth, James E

    2008-12-15

    Practical statistical issues that should be considered when performing data collection and analysis are reviewed. The meeting with a statistician should take place early in the research development before any study data are collected. The process of statistical analysis involves establishing the research question, formulating a hypothesis, selecting an appropriate test, sampling correctly, collecting data, performing tests, and making decisions. Once the objectives are established, the researcher can determine the characteristics or demographics of the individuals required for the study, how to recruit volunteers, what type of data are needed to answer the research question(s), and the best methods for collecting the required information. There are two general types of statistics: descriptive and inferential. Presenting data in a more palatable format for the reader is called descriptive statistics. Inferential statistics involve making an inference or decision about a population based on results obtained from a sample of that population. In order for the results of a statistical test to be valid, the sample should be representative of the population from which it is drawn. When collecting information about volunteers, researchers should only collect information that is directly related to the study objectives. Important information that a statistician will require first is an understanding of the type of variables involved in the study and which variables can be controlled by researchers and which are beyond their control. Data can be presented in one of four different measurement scales: nominal, ordinal, interval, or ratio. Hypothesis testing involves two mutually exclusive and exhaustive statements related to the research question. Statisticians should not be replaced by computer software, and they should be consulted before any research data are collected. When preparing to meet with a statistician, the pharmacist researcher should be familiar with the steps of statistical analysis and consider several questions related to the study to be conducted.

  17. Distinguishing Positive Selection From Neutral Evolution: Boosting the Performance of Summary Statistics

    PubMed Central

    Lin, Kao; Li, Haipeng; Schlötterer, Christian; Futschik, Andreas

    2011-01-01

    Summary statistics are widely used in population genetics, but they suffer from the drawback that no simple sufficient summary statistic exists which captures all information required to distinguish different evolutionary hypotheses. Here, we apply boosting, a recent statistical method that combines simple classification rules to maximize their joint predictive performance. We show that our implementation of boosting has a high power to detect selective sweeps. Demographic events, such as bottlenecks, do not result in a large excess of false positives. A comparison shows that our boosting implementation performs well relative to other neutrality tests. Furthermore, we evaluated the relative contribution of different summary statistics to the identification of selection and found that for recent sweeps integrated haplotype homozygosity is very informative, whereas older sweeps are better detected by Tajima's π. Overall, Watterson's θ was found to contribute the most information for distinguishing between bottlenecks and selection. PMID:21041556

  18. Evaluation of Two Statistical Methods Provides Insights into the Complex Patterns of Alternative Polyadenylation Site Switching

    PubMed Central

    Li, Jie; Li, Rui; You, Leiming; Xu, Anlong; Fu, Yonggui; Huang, Shengfeng

    2015-01-01

    Switching between different alternative polyadenylation (APA) sites plays an important role in the fine tuning of gene expression. New technologies for the execution of 3’-end enriched RNA-seq allow genome-wide detection of the genes that exhibit significant APA site switching between different samples. Here, we show that the independence test gives better results than the linear trend test in detecting APA site-switching events. Further examination suggests that the discrepancy between these two statistical methods arises from complex APA site-switching events that cannot be represented by a simple change of average 3’-UTR length. In theory, the linear trend test is only effective in detecting these simple changes. We classify the switching events into four switching patterns: two simple patterns (3’-UTR shortening and lengthening) and two complex patterns. By comparing the results of the two statistical methods, we show that complex patterns account for 1/4 of all observed switching events that happen between normal and cancerous human breast cell lines. Because simple and complex switching patterns may convey different biological meanings, they merit separate study. We therefore propose to combine both the independence test and the linear trend test in practice. First, the independence test should be used to detect APA site switching; second, the linear trend test should be invoked to identify simple switching events; and third, those complex switching events that pass independence testing but fail linear trend testing can be identified. PMID:25875641
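
    A hedged sketch of why the two tests can disagree: on the invented 2 × 3 read-count table below (two samples by three poly(A) sites, proximal to distal), reads shift toward both ends at once, so the independence test fires while a simple linear trend statistic stays quiet. The counts and the correlation-based trend statistic are illustrative assumptions, not the paper's implementation.

      # Independence test vs linear trend test on a 2 x 3 count table.
      import numpy as np
      from scipy import stats

      table = np.array([[120, 260, 110],    # sample 1 read counts per site
                        [190, 100, 200]])   # sample 2, invented counts

      # Independence test: sensitive to any change in site usage.
      chi2, p_ind, dof, _ = stats.chi2_contingency(table)

      # Linear trend: correlate site rank with sample label on the
      # expanded counts (a simple Cochran-Armitage-style statistic).
      site_rank = np.repeat([0, 1, 2], table.sum(axis=0))
      labels = np.concatenate([np.repeat([0, 1], table[:, j]) for j in range(3)])
      r, p_trend = stats.pearsonr(site_rank, labels)

      print(f"independence p = {p_ind:.2e}; linear trend p = {p_trend:.2f}")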

  19. Non-Immunogenic Structurally and Biologically Intact Tissue Matrix Grafts for the Immediate Repair of Ballistic-Induced Vascular and Nerve Tissue Injury in Combat Casualty Care

    DTIC Science & Technology

    2005-07-01

    as an access graft is addressed using statistical methods below. Graft consistency can be defined statistically as the variance associated with the sample of grafts tested in...measured using a refractometer (Brix % method). The equilibration data are shown in Graph 1. The results suggest the following equilibration scheme: 40% v/v

  20. Deep learning for media analysis in defense scenarios: an evaluation of an open source framework for object detection in intelligence-related image sets

    DTIC Science & Technology

    2017-06-01

    Table 2.1: Training time statistics from Jones’ thesis. Table 2.2: Evaluation runtime statistics from Camp’s thesis for a single image. Table 2.3: Training and evaluation runtime statistics from Sharpe’s thesis. Table 2.4: Sharpe’s screenshot detector results for combinations of... training resources available and time required for each algorithm Jones [15] tested.

  1. Informal Statistics Help Desk

    NASA Technical Reports Server (NTRS)

    Young, M.; Koslovsky, M.; Schaefer, Caroline M.; Feiveson, A. H.

    2017-01-01

    Back by popular demand, the JSC Biostatistics Laboratory and LSAH statisticians are offering an opportunity to discuss your statistical challenges and needs. Take the opportunity to meet the individuals offering expert statistical support to the JSC community. Join us for an informal conversation about any questions you may have encountered with issues of experimental design, analysis, or data visualization. Get answers to common questions about sample size, repeated measures, statistical assumptions, missing data, multiple testing, time-to-event data, and when to trust the results of your analyses.

  2. EVALUATION OF A NEW MEAN SCALED AND MOMENT ADJUSTED TEST STATISTIC FOR SEM.

    PubMed

    Tong, Xiaoxiao; Bentler, Peter M

    2013-01-01

    Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and two well-known robust test statistics. A modification to the Satorra-Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the four test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies seven sample sizes and three distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ(2) test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra-Bentler scaled test statistic performed best overall, while the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.

  3. Indirect zirconia-reinforced lithium silicate ceramic CAD/CAM restorations: Preliminary clinical results after 12 months.

    PubMed

    Zimmermann, Moritz; Koller, Christina; Mehl, Albert; Hickel, Reinhard

    2017-01-01

    No clinical data are available for the new computer-aided design/computer-assisted manufacture (CAD/CAM) material zirconia-reinforced lithium silicate (ZLS) ceramic. This study describes preliminary clinical results for indirect ZLS CAD/CAM restorations after 12 months. Indirect restorations were fabricated using the CEREC method and intraoral scanning (CEREC Omnicam, CEREC MCXL). Sixty-seven restorations were seated adhesively (baseline). Sixty restorations were evaluated after 12 months (follow-up), using modified FDI criteria. Two groups were established according to the ZLS restorations' post-processing procedure prior to adhesive seating: group I (three-step polishing, n = 32) and group II (fire glazing, n = 28). Statistical analysis was performed with the Mann-Whitney U test and Wilcoxon test (P < .05). The success rate of indirect ZLS CAD/CAM restorations after 12 months was 96.7%. Two restorations clinically failed as a result of bulk fracture (failure rate 3.3%). No statistically significant differences were found between baseline and follow-up criteria (Wilcoxon test, P > .05). Statistically significant differences in the surface gloss criterion were found between group I and group II (Mann-Whitney U test, P < .05). This study demonstrates that ZLS CAD/CAM restorations have a high clinical success rate after 12 months. A longer clinical evaluation period is necessary to draw further conclusions.

  4. Evaluating Equating Results in the Non-Equivalent Groups with Anchor Test Design Using Equipercentile and Equity Criteria

    ERIC Educational Resources Information Center

    Duong, Minh Quang

    2011-01-01

    Testing programs often use multiple test forms of the same test to control item exposure and to ensure test security. Although test forms are constructed to be as similar as possible, they often differ. Test equating techniques are those statistical methods used to adjust scores obtained on different test forms of the same test so that they are…

  5. Test Population Selection from Weibull-Based, Monte Carlo Simulations of Fatigue Life

    NASA Technical Reports Server (NTRS)

    Vlcek, Brian L.; Zaretsky, Erwin V.; Hendricks, Robert C.

    2008-01-01

    Fatigue life is probabilistic and not deterministic. Experimentally establishing the fatigue life of materials, components, and systems is both time consuming and costly. As a result, conclusions regarding fatigue life are often inferred from a statistically insufficient number of physical tests. A proposed methodology for comparing life results as a function of variability due to Weibull parameters, variability between successive trials, and variability due to size of the experimental population is presented. Using Monte Carlo simulation of randomly selected lives from a large Weibull distribution, the variation in the L10 fatigue life of aluminum alloy AL6061 rotating rod fatigue tests was determined as a function of population size. These results were compared to the L10 fatigue lives of small (10 each) populations from AL2024, AL7075 and AL6061. For aluminum alloy AL6061, a simple algebraic relationship was established for the upper and lower L10 fatigue life limits as a function of the number of specimens failed. For most engineering applications where less than 30 percent variability can be tolerated in the maximum and minimum values, at least 30 to 35 test samples are necessary. The variability of test results based on small sample sizes can be greater than any actual differences that exist between materials and can result in erroneous conclusions. The fatigue life of AL2024 is statistically longer than that of AL6061 and AL7075. However, there is no statistical difference between the fatigue lives of AL6061 and AL7075, even though AL7075 had a fatigue life 30 percent greater than AL6061.
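
    The resampling idea can be sketched in a few lines: draw many small samples from a two-parameter Weibull life distribution and watch the estimated L10 life spread as the population shrinks. The shape and scale below are invented, not the fitted AL6061 parameters.

      # Monte Carlo L10 variability sketch; Weibull parameters are invented.
      import numpy as np

      rng = np.random.default_rng(42)
      shape, scale = 1.5, 1.0e6  # Weibull slope and characteristic life (cycles)

      def l10_estimates(n_specimens, n_trials=2000):
          """Empirical 10th-percentile life from repeated samples of size n."""
          lives = scale * rng.weibull(shape, size=(n_trials, n_specimens))
          return np.percentile(lives, 10, axis=1)

      for n in (10, 35, 100):
          l10 = l10_estimates(n)
          spread = (l10.max() - l10.min()) / np.median(l10)
          print(f"n = {n:3d}: median L10 = {np.median(l10):.3e} cycles, "
                f"relative spread = {spread:.1f}")
      # The trial-to-trial spread shrinks markedly as n grows, echoing the
      # 30-to-35-specimen recommendation above in a purely qualitative way.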

  6. Test Population Selection from Weibull-Based, Monte Carlo Simulations of Fatigue Life

    NASA Technical Reports Server (NTRS)

    Vlcek, Brian L.; Zaretsky, Erwin V.; Hendricks, Robert C.

    2012-01-01

    Fatigue life is probabilistic and not deterministic. Experimentally establishing the fatigue life of materials, components, and systems is both time consuming and costly. As a result, conclusions regarding fatigue life are often inferred from a statistically insufficient number of physical tests. A proposed methodology for comparing life results as a function of variability due to Weibull parameters, variability between successive trials, and variability due to size of the experimental population is presented. Using Monte Carlo simulation of randomly selected lives from a large Weibull distribution, the variation in the L10 fatigue life of aluminum alloy AL6061 rotating rod fatigue tests was determined as a function of population size. These results were compared to the L10 fatigue lives of small (10 each) populations from AL2024, AL7075 and AL6061. For aluminum alloy AL6061, a simple algebraic relationship was established for the upper and lower L10 fatigue life limits as a function of the number of specimens failed. For most engineering applications where less than 30 percent variability can be tolerated in the maximum and minimum values, at least 30 to 35 test samples are necessary. The variability of test results based on small sample sizes can be greater than any actual differences that exist between materials and can result in erroneous conclusions. The fatigue life of AL2024 is statistically longer than that of AL6061 and AL7075. However, there is no statistical difference between the fatigue lives of AL6061 and AL7075, even though AL7075 had a fatigue life 30 percent greater than AL6061.

  7. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis

    PubMed Central

    Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J.; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T.; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti

    2016-01-01

    Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153689

  8. Critical analysis of adsorption data statistically

    NASA Astrophysics Data System (ADS)

    Kaushal, Achla; Singh, S. K.

    2017-10-01

    Experimental data can be presented, computed, and critically analysed in different ways using statistics. A variety of statistical tests are used to make decisions about the significance and validity of experimental data. In the present study, adsorption was carried out to remove zinc ions from contaminated aqueous solution using mango leaf powder. The experimental data were analysed statistically by hypothesis testing, applying the t test, paired t test and Chi-square test to (a) test the optimum value of the process pH, (b) verify the success of the experiment and (c) study the effect of adsorbent dose on zinc ion removal from aqueous solutions. Comparison of calculated and tabulated values of t and χ2 showed the results in favour of the data collected from the experiment, and this has been shown on probability charts. The K value for the Langmuir isotherm was 0.8582 and the m value for the Freundlich adsorption isotherm was 0.725; both are <1, indicating favourable isotherms. Karl Pearson's correlation coefficient values for the Langmuir and Freundlich adsorption isotherms were 0.99 and 0.95, respectively, which show a high degree of correlation between the variables. This validates the data obtained for adsorption of zinc ions from the contaminated aqueous solution with the help of mango leaf powder.
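
    As a hedged illustration of the tests named above, the sketch below runs a paired t test across two adsorbent doses and a one-sample t test against an assumed removal target; the percent-removal figures are invented, not the study's measurements.

      # Paired and one-sample t tests on invented adsorption data.
      import numpy as np
      from scipy import stats

      removal_low = np.array([72.1, 74.8, 70.5, 73.9, 71.6, 75.2])   # low dose
      removal_high = np.array([84.3, 86.1, 82.7, 85.5, 83.0, 87.4])  # high dose

      # Paired t test: same batches measured at both doses.
      t_pair, p_pair = stats.ttest_rel(removal_low, removal_high)

      # One-sample t test: does the high dose meet an assumed 80% target?
      t_one, p_one = stats.ttest_1samp(removal_high, popmean=80.0)

      print(f"paired: t = {t_pair:.2f}, p = {p_pair:.4f}")
      print(f"one-sample vs 80%: t = {t_one:.2f}, p = {p_one:.4f}")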

  9. Descriptive Statistics for Modern Test Score Distributions: Skewness, Kurtosis, Discreteness, and Ceiling Effects.

    PubMed

    Ho, Andrew D; Yu, Carol C

    2015-06-01

    Many statistical analyses benefit from the assumption that unconditional or conditional distributions are continuous and normal. More than 50 years ago in this journal, Lord and Cook chronicled departures from normality in educational tests, and Micceri similarly showed that the normality assumption is met rarely in educational and psychological practice. In this article, the authors extend these previous analyses to state-level educational test score distributions that are an increasingly common target of high-stakes analysis and interpretation. Among 504 scale-score and raw-score distributions from state testing programs from recent years, nonnormal distributions are common and are often associated with particular state programs. The authors explain how scaling procedures from item response theory lead to nonnormal distributions as well as unusual patterns of discreteness. The authors recommend that distributional descriptive statistics be calculated routinely to inform model selection for large-scale test score data, and they illustrate consequences of nonnormality using sensitivity studies that compare baseline results to those from normalized score scales.
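
    The routine descriptive statistics the authors recommend take only a few lines of SciPy; the sketch below runs them on an invented score distribution with an artificial ceiling at 100, which reproduces the negative skew and discreteness they describe.

      # Descriptive statistics for a simulated, ceiling-limited score scale.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      scores = np.clip(rng.normal(78, 15, size=5000).round(), 0, 100)

      print(f"skewness: {stats.skew(scores):.2f}")           # negative at a ceiling
      print(f"excess kurtosis: {stats.kurtosis(scores):.2f}")
      print(f"share at ceiling: {np.mean(scores == 100):.1%}")
      print(f"distinct score points: {np.unique(scores).size}")  # discreteness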

  10. Towards a web-based decision support tool for selecting appropriate statistical test in medical and biological sciences.

    PubMed

    Suner, Aslı; Karakülah, Gökhan; Dicle, Oğuz

    2014-01-01

    Statistical hypothesis testing is an essential component of biological and medical studies for making inferences and estimations from the data collected in a study; however, misuse of statistical tests is widespread. To prevent errors in selecting a suitable statistical test, it is currently possible to consult available test selection algorithms developed for various purposes. However, the lack of an algorithm presenting the most common statistical tests used in biomedical research in a single flowchart causes several problems, such as shifting users among the algorithms, poor decision support in test selection and lack of satisfaction of potential users. Herein, we demonstrate a unified flowchart that covers the statistical tests most commonly used in the biomedical domain, to provide decision aid to non-statistician users in choosing the appropriate statistical test for their hypothesis. We also discuss some of our findings from integrating the flowcharts into each other to develop a single but more comprehensive decision algorithm.

  11. Idaho Percentile Results for the 2015 and 2016 ISAT (SBAC) English Language Arts and Mathematics Tests in Grades 3-8 and 10

    ERIC Educational Resources Information Center

    Stoneberg, Bert D.

    2016-01-01

    Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests (ISAT). ISAT results have been reported almost exclusively as "percent proficient" statistics (i.e., the percentage of Idaho students who performed at the "A" level…

  12. Idaho Percentile Results for the SBAC English Language Arts and Mathematics Tests, 2015-2017, Grades 3-8 and 10

    ERIC Educational Resources Information Center

    Stoneberg, Bert D.

    2018-01-01

    Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests. ISAT results have been reported almost exclusively as "percent proficient or above" statistics (i.e., the percentage of Idaho students who performed at the "A" level). This…

  13. Kepler Planet Detection Metrics: Statistical Bootstrap Test

    NASA Technical Reports Server (NTRS)

    Jenkins, Jon M.; Burke, Christopher J.

    2016-01-01

    This document describes the data produced by the Statistical Bootstrap Test over the final three Threshold Crossing Event (TCE) deliveries to NExScI: SOC 9.1 (Q1–Q16) (Tenenbaum et al. 2014), SOC 9.2 (Q1–Q17), aka DR24 (Seader et al. 2015), and SOC 9.3 (Q1–Q17), aka DR25 (Twicken et al. 2016). The last few years have seen significant improvements in the SOC science data processing pipeline, leading to higher quality light curves and more sensitive transit searches. The statistical bootstrap analysis results presented here and the numerical results archived at NASA's Exoplanet Science Institute (NExScI) bear witness to these software improvements. This document attempts to introduce and describe the main features and differences between these three data sets as a consequence of the software changes.

  14. Test of spectral/spatial classifier

    NASA Technical Reports Server (NTRS)

    Landgrebe, D. A. (Principal Investigator); Kast, J. L.; Davis, B. J.

    1977-01-01

    The author has identified the following significant results. The supervised ECHO processor (which utilizes class statistics for object identification) successfully exploits the redundancy of states characteristic of sampled imagery of ground scenes to achieve better classification accuracy, reduce the number of classifications required, and reduce the variability of classification results. The nonsupervised ECHO processor (which identifies objects without the benefit of class statistics) successfully reduces the number of classifications required and the variability of the classification results.

  15. Quantification and Statistical Analysis Methods for Vessel Wall Components from Stained Images with Masson's Trichrome

    PubMed Central

    Hernández-Morera, Pablo; Castaño-González, Irene; Travieso-González, Carlos M.; Mompeó-Corredera, Blanca; Ortega-Santana, Francisco

    2016-01-01

    Purpose: To develop a digital image processing method to quantify structural components (smooth muscle fibers and extracellular matrix) in the vessel wall stained with Masson’s trichrome, and a statistical method suitable for small sample sizes to analyze the results previously obtained. Methods: The quantification method comprises two stages. The pre-processing stage improves tissue image appearance and the vessel wall area is delimited. In the feature extraction stage, the vessel wall components are segmented by grouping pixels with a similar color. The area of each component is calculated by normalizing the number of pixels of each group by the vessel wall area. Statistical analyses are implemented by permutation tests, based on resampling without replacement from the set of the observed data to obtain a sampling distribution of an estimator. The implementation can be parallelized on a multicore machine to reduce execution time. Results: The methods have been tested on 48 vessel wall samples of the internal saphenous vein stained with Masson’s trichrome. The results show that the segmented areas are consistent with the perception of a team of doctors and demonstrate good correlation between the expert judgments and the measured parameters for evaluating vessel wall changes. Conclusion: The proposed methodology offers a powerful tool to quantify some components of the vessel wall. It is more objective, sensitive and accurate than the biochemical and qualitative methods traditionally used. The permutation tests are suitable statistical techniques to analyze the numerical measurements obtained when the underlying assumptions of the other statistical techniques are not met. PMID:26761643
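
    A bare-bones sketch of a permutation test of the kind described, comparing the mean area fraction of one wall component between two invented groups; resampling is without replacement, as in the paper, but the data and group labels are hypothetical.

      # Two-sample permutation test on invented area fractions.
      import numpy as np

      rng = np.random.default_rng(7)
      group_a = np.array([0.42, 0.48, 0.39, 0.51, 0.45])  # e.g., muscle fraction
      group_b = np.array([0.31, 0.36, 0.29, 0.40, 0.33])

      observed = group_a.mean() - group_b.mean()
      pooled = np.concatenate([group_a, group_b])
      n_a, n_perm, count = len(group_a), 10000, 0

      for _ in range(n_perm):
          rng.shuffle(pooled)  # reassign group labels without replacement
          diff = pooled[:n_a].mean() - pooled[n_a:].mean()
          if abs(diff) >= abs(observed):
              count += 1

      p_value = (count + 1) / (n_perm + 1)  # add-one guard against p = 0
      print(f"observed diff = {observed:.3f}, permutation p = {p_value:.4f}")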

  16. Improving validation methods for molecular diagnostics: application of Bland-Altman, Deming and simple linear regression analyses in assay comparison and evaluation for next-generation sequencing

    PubMed Central

    Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L

    2018-01-01

    Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), using R2 as the primary metric of assay agreement. However, the use of R2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. PMID:28747393
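
    The core Bland-Altman quantities reduce to a handful of lines; the sketch below computes bias (constant error) and 95% limits of agreement for invented paired measurements from a validated assay and a candidate NGS assay.

      # Bland-Altman bias and limits of agreement on invented paired data.
      import numpy as np

      reference = np.array([0.05, 0.12, 0.25, 0.33, 0.41, 0.50, 0.62])
      candidate = np.array([0.06, 0.14, 0.24, 0.36, 0.44, 0.52, 0.66])

      diffs = candidate - reference
      bias = diffs.mean()                  # constant error between assays
      sd = diffs.std(ddof=1)
      loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement
      print(f"bias = {bias:+.3f}, limits of agreement = "
            f"[{loa[0]:+.3f}, {loa[1]:+.3f}]")
      # A trend of diffs against the pairwise means would indicate
      # proportional error, which Deming regression can then quantify.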

  17. Constructing three emotion knowledge tests from the invariant measurement approach

    PubMed Central

    Prieto, Gerardo; Burin, Debora I.

    2017-01-01

    Background: Psychological constructionist models like the Conceptual Act Theory (CAT) postulate that complex states such as emotions are composed of basic psychological ingredients that are more clearly respected by the brain than basic emotions. The objective of this study was the construction and initial validation of Emotion Knowledge measures from the CAT frame by means of an invariant measurement approach, the Rasch Model (RM). Psychological distance theory was used to inform item generation. Methods: Three EK tests—emotion vocabulary (EV), close emotional situations (CES) and far emotional situations (FES)—were constructed and tested with the RM in a community sample of 100 females and 100 males (age range: 18–65), both separately and conjointly. Results: It was corroborated that data-RM fit was sufficient. Then, the effect of type of test and emotion on Rasch-modelled item difficulty was tested. Significant effects of emotion on EK item difficulty were found, but the only statistically significant difference was that between “happiness” and the remaining emotions; neither type of test nor interaction effects on EK item difficulty were statistically significant. The testing of gender differences was carried out after corroborating that differential item functioning (DIF) would not be a plausible alternative hypothesis for the results. No statistically significant sex-related differences were found in EV, CES, FES, or total EK. However, the sign of d indicates that female participants were consistently better than male ones, a result that will be of interest for future meta-analyses. Discussion: The three EK tests are ready to be used as components of a higher-level measurement process. PMID:28929013

  18. Pathway analysis with next-generation sequencing data.

    PubMed

    Zhao, Jinying; Zhu, Yun; Boerwinkle, Eric; Xiong, Momiao

    2015-04-01

    Although pathway analysis methods have been developed and successfully applied to association studies of common variants, the statistical methods for pathway-based association analysis of rare variants have not been well developed. Many investigators observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. The inflated false-positive rates and low true-positive rates of the current methods are mainly due to their lack of ability to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic that is based on the smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The developed statistic has the ability to capture position-level variant information and account for gametic phase disequilibrium. By intensive simulations, we demonstrate that the SFPCA-based statistic for testing pathway association with either rare or common or both rare and common variants has the correct type 1 error rates. Also the power of the SFPCA-based statistic and 22 additional existing statistics are evaluated. We found that the SFPCA-based statistic has a much higher power than other existing statistics in all the scenarios considered. To further evaluate its performance, the SFPCA-based statistic is applied to pathway analysis of exome sequencing data in the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after the Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic has much smaller P-values to identify pathway association than other existing methods.

  19. A statistical investigation of z test and ROC curve on seismo-ionospheric anomalies in TEC associated earthquakes in Taiwan during 1999-2014

    NASA Astrophysics Data System (ADS)

    Shih, A. L.; Liu, J. Y. G.

    2015-12-01

    A median-based method and a z test are employed to find characteristics of seismo-ionospheric precursors (SIPs) in the total electron content (TEC) of the global ionosphere map (GIM) associated with 129 M≥5.5 earthquakes in Taiwan during 1999-2014. Results show that both negative and positive anomalies in the GIM TEC with statistical significance under the z test appear a few days before the earthquakes. The receiver operating characteristic (ROC) curve is further applied to examine whether the SIPs exist in Taiwan.
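
    A loose sketch of the two tools named above, on invented data: a large-sample z test for whether pre-earthquake TEC deviations differ from the quiet-day background, and an ROC curve rating the deviation as a precursor detector. Sample sizes, distributions, and the anomaly score are all assumptions for illustration.

      # z test and ROC sketch on invented TEC anomaly scores.
      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(3)
      precursor_days = rng.normal(1.0, 1.0, size=60)  # days before quakes
      quiet_days = rng.normal(0.0, 1.0, size=400)     # background days

      # Large-sample z test on the mean difference.
      z = (precursor_days.mean() - quiet_days.mean()) / np.sqrt(
          precursor_days.var(ddof=1) / 60 + quiet_days.var(ddof=1) / 400)
      print(f"z = {z:.2f}, p = {2 * stats.norm.sf(abs(z)):.2e}")

      # ROC: sweep a threshold and trace hit rate vs false-alarm rate.
      scores = np.concatenate([precursor_days, quiet_days])
      labels = np.concatenate([np.ones(60), np.zeros(400)])
      order = np.argsort(-scores)
      tpr = np.cumsum(labels[order]) / 60
      fpr = np.cumsum(1 - labels[order]) / 400
      print(f"AUC = {np.trapz(tpr, fpr):.2f}")  # 0.5 means no detection skill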

  20. 40 CFR 80.47 - Performance-based Analytical Test Method Approach.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... chemistry and statistics, or at least a bachelor's degree in chemical engineering, from an accredited... be compensated for any known chemical interferences using good laboratory practices. (3) The test... section, individual test results shall be compensated for any known chemical interferences using good...

  1. The Use of Peer Tutoring to Improve the Passing Rates in Mathematics Placement Exams of Engineering Students: A Success Story

    ERIC Educational Resources Information Center

    García, Rolando; Morales, Juan C.; Rivera, Gloribel

    2014-01-01

    This paper describes a highly successful peer tutoring program that has resulted in an improvement in the passing rates of mathematics placement exams from 16% to 42%, on average. Statistical analyses were conducted using a Chi-Squared (χ²) test for independence and the results were statistically significant (p-value much less than…

  2. AGR-1 Thermocouple Data Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jeff Einerson

    2012-05-01

    This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods to further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, AGR-1 test configuration and test procedure, overview of AGR-1 measured data, and overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within the NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulting from the combined analysis are identified. This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of simulation results (Chapter 3). The statistics-based, simulation-aided experimental control procedure described for the future AGR tests is developed and demonstrated in Chapter 4. The procedure for controlling the target fuel temperature (capsule peak or average) is based on regression functions of thermocouple readings and other relevant parameters, accounting for possible changes in both physical and thermal conditions and in instrument performance.

  3. Different Tests for a Difference: How Do We Do Research?

    ERIC Educational Resources Information Center

    Drummond, Gordon B.; Vowler, Sarah L.

    2012-01-01

    Most biological scientists conduct experiments to look for effects, and test the results statistically. One of the most commonly used tests is Student's t test. However, this test concentrates on a very limited question. The authors assume that there is no effect in the experiment, and then estimate the probability that they could have obtained these…
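
    For readers who want to see the test in action, a minimal sketch of a two-sample Student's t test on simulated data; all numbers are illustrative.

    ```python
    # Two-sample Student's t test on simulated measurements (invented data).
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    control = rng.normal(loc=10.0, scale=2.0, size=30)
    treated = rng.normal(loc=11.5, scale=2.0, size=30)

    t, p = ttest_ind(control, treated)  # assumes equal variances by default
    print(f"t = {t:.2f}, p = {p:.3f}")
    ```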

  4. Statistics for Success: Statistical Analysis of Student Data Is a Lot Easier Than You Think and More Useful Than You Imagine.

    ERIC Educational Resources Information Center

    Kadel, Robert

    2004-01-01

    To her surprise, Ms. Logan had just conducted a statistical analysis of her 10th grade biology students' quiz scores. The results indicated that she needed to reinforce mitosis before the students took the high-school proficiency test in three weeks, as required by the state. "Oh! That's easy!" She exclaimed. Teachers like Ms. Logan are…

  5. The Web as an educational tool for/in learning/teaching bioinformatics statistics.

    PubMed

    Oliver, J; Pisano, M E; Alonso, T; Roca, P

    2005-12-01

    Statistics provides essential tools in bioinformatics for interpreting the results of a database search or for managing the enormous amounts of information provided by genomics, proteomics and metabolomics. The goal of this project was the development of a software tool that would be as simple as possible to demonstrate the use of statistics in bioinformatics. Computer Simulation Methods (CSMs) developed using Microsoft Excel were chosen for their broad range of applications, immediate and easy formula calculation, immediate testing, easy graphical representation, and general use and acceptance by the scientific community. The result of these endeavours is a set of utilities which can be accessed from the following URL: http://gmein.uib.es/bioinformatica/statistics. When tested on students with previous coursework under traditional statistical teaching methods, the overall consensus was that Web-based instruction had numerous advantages, but that traditional methods with manual calculations were still needed for theory and practice. Once the basic statistical formulas had been mastered, Excel spreadsheets and graphics proved very useful for trying many parameters rapidly without tedious calculations. CSMs will be of great importance for the training of students and professionals in the field of bioinformatics, and for upcoming applications in self-directed learning and continuing education.

  6. Power, effects, confidence, and significance: an investigation of statistical practices in nursing research.

    PubMed

    Gaskin, Cadeyrn J; Happell, Brenda

    2014-05-01

    To (a) assess the statistical power of nursing research to detect small, medium, and large effect sizes; (b) estimate the experiment-wise Type I error rate in these studies; and (c) assess the extent to which (i) a priori power analyses, (ii) effect sizes (and interpretations thereof), and (iii) confidence intervals were reported. Statistical review. Papers published in the 2011 volumes of the 10 highest ranked nursing journals, based on their 5-year impact factors. Papers were assessed for statistical power, control of experiment-wise Type I error, reporting of a priori power analyses, reporting and interpretation of effect sizes, and reporting of confidence intervals. The analyses were based on 333 papers, from which 10,337 inferential statistics were identified. The median power to detect small, medium, and large effect sizes was .40 (interquartile range [IQR]=.24-.71), .98 (IQR=.85-1.00), and 1.00 (IQR=1.00-1.00), respectively. The median experiment-wise Type I error rate was .54 (IQR=.26-.80). A priori power analyses were reported in 28% of papers. Effect sizes were routinely reported for Spearman's rank correlations (100% of papers in which this test was used), Poisson regressions (100%), odds ratios (100%), Kendall's tau correlations (100%), Pearson's correlations (99%), logistic regressions (98%), structural equation modelling/confirmatory factor analyses/path analyses (97%), and linear regressions (83%), but were reported less often for two-proportion z tests (50%), analyses of variance/analyses of covariance/multivariate analyses of variance (18%), t tests (8%), Wilcoxon's tests (8%), Chi-squared tests (8%), and Fisher's exact tests (7%), and not reported for sign tests, Friedman's tests, McNemar's tests, multi-level models, and Kruskal-Wallis tests. Effect sizes were infrequently interpreted. Confidence intervals were reported in 28% of papers. The use, reporting, and interpretation of inferential statistics in nursing research need substantial improvement. Most importantly, researchers should abandon the misleading practice of interpreting the results from inferential tests based solely on whether they are statistically significant (or not) and, instead, focus on reporting and interpreting effect sizes, confidence intervals, and significance levels. Nursing researchers also need to conduct and report a priori power analyses, and to address the issue of Type I experiment-wise error inflation in their studies. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
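
    The experiment-wise Type I error inflation reported above follows directly from the arithmetic of multiple testing; a small illustration, under the simplifying assumption of independent tests at alpha = .05.

    ```python
    # Probability of at least one false positive among k independent tests:
    # 1 - (1 - alpha)**k. Purely illustrative arithmetic.
    alpha = 0.05
    for k in (1, 5, 15, 31):
        print(k, round(1 - (1 - alpha) ** k, 2))
    # About 15 independent tests already give an experiment-wise rate near .54,
    # the median value reported above; 31 tests push it to roughly .80.
    ```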

  7. The choice of statistical methods for comparisons of dosimetric data in radiotherapy.

    PubMed

    Chaikh, Abdulhamid; Giraud, Jean-Yves; Perrin, Emmanuel; Bresciani, Jean-Pierre; Balosso, Jacques

    2014-09-18

    Novel irradiation techniques are continuously introduced in radiotherapy to optimize the accuracy, the security and the clinical outcome of treatments. These changes could raise the question of discontinuity in dosimetric presentation and the subsequent need for practice adjustments in case of significant modifications. This study proposes a comprehensive approach to compare different techniques and to test whether their respective dose calculation algorithms give rise to statistically significant differences in the treatment doses for the patient. Statistical investigation principles are presented in the framework of a clinical example based on 62 fields of radiotherapy for lung cancer. The delivered doses in monitor units were calculated using three different dose calculation methods: the reference method computes the dose without tissue density corrections using the Pencil Beam Convolution (PBC) algorithm, whereas the new methods calculate the dose with tissue density correction in 1D and 3D using the Modified Batho (MB) method and the Equivalent Tissue Air Ratio (ETAR) method, respectively. The normality of the data and the homogeneity of variance between groups were tested using the Shapiro-Wilk and Levene tests, respectively; non-parametric statistical tests were then performed. Specifically, the dose means estimated by the different calculation methods were compared using Friedman's test and the Wilcoxon signed-rank test. In addition, the correlation between the doses calculated by the three methods was assessed using Spearman's rank and Kendall's rank tests. Friedman's test showed a significant effect of the calculation method on the delivered dose for lung cancer patients (p < 0.001). The density correction methods yielded lower doses than PBC, by on average -5 ± 4.4 (SD) for MB and -4.7 ± 5 (SD) for ETAR. Post-hoc Wilcoxon signed-rank tests of paired comparisons indicated that the delivered dose was significantly reduced using density-corrected methods as compared to the reference method. Spearman's and Kendall's rank tests indicated a positive correlation between the doses calculated with the different methods. This paper illustrates and justifies the use of statistical tests and graphical representations for dosimetric comparisons in radiotherapy. The statistical analysis shows the significance of dose differences resulting from two or more techniques in radiotherapy.
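
    A hedged sketch of the test sequence described above (Shapiro-Wilk, Levene, Friedman, paired Wilcoxon, rank correlations) on simulated dose values; the data and effect sizes are invented to mimic the reported magnitudes, not actual treatment doses.

    ```python
    # Simulated doses for three calculation methods across 62 fields
    # (all values invented for illustration).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    pbc = rng.normal(200, 10, 62)         # reference method (hypothetical values)
    mb = pbc - rng.normal(5, 4.4, 62)     # density-corrected, about 5 units lower
    etar = pbc - rng.normal(4.7, 5, 62)

    print(stats.shapiro(pbc))                       # normality check per group
    print(stats.levene(pbc, mb, etar))              # homogeneity of variance
    print(stats.friedmanchisquare(pbc, mb, etar))   # repeated-measures omnibus test
    print(stats.wilcoxon(pbc, mb))                  # post-hoc paired comparison
    print(stats.spearmanr(pbc, mb))                 # correlation between methods
    ```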

  8. Multi-Reader ROC studies with Split-Plot Designs: A Comparison of Statistical Methods

    PubMed Central

    Obuchowski, Nancy A.; Gallas, Brandon D.; Hillis, Stephen L.

    2012-01-01

    Rationale and Objectives Multi-reader imaging trials often use a factorial design, in which study patients undergo testing with all imaging modalities and readers interpret the results of all tests for all patients. A drawback of the design is the large number of interpretations required of each reader. Split-plot designs have been proposed as an alternative, in which one or a subset of readers interprets all images of a sample of patients, while other readers interpret the images of other samples of patients. In this paper we compare three methods of analysis for the split-plot design. Materials and Methods Three statistical methods are presented: the Obuchowski-Rockette method modified for the split-plot design, a newly proposed marginal-mean ANOVA approach, and an extension of the three-sample U-statistic method. A simulation study using the Roe-Metz model was performed to compare the type I error rate, power and confidence interval coverage of the three test statistics. Results The type I error rates for all three methods are close to the nominal level but tend to be slightly conservative. The statistical power is nearly identical for the three methods. The coverage of 95% CIs falls close to the nominal coverage for small and large sample sizes. Conclusions The split-plot MRMC study design can be statistically efficient compared with the factorial design, reducing the number of interpretations required per reader. Three methods of analysis, shown to have nominal type I error rate, similar power, and nominal CI coverage, are available for this study design. PMID:23122570

  9. An accurate test for homogeneity of odds ratios based on Cochran's Q-statistic.

    PubMed

    Kulinskaya, Elena; Dollinger, Michael B

    2015-06-10

    A frequently used statistic for testing homogeneity in a meta-analysis of K independent studies is Cochran's Q. For a standard test of homogeneity, the Q statistic is referred to a chi-square distribution with K-1 degrees of freedom. For the situation in which the effects of the studies are logarithms of odds ratios, the chi-square distribution is much too conservative for moderate-size studies, although it may be asymptotically correct as the individual studies become large. Using a mixture of theoretical results and simulations, we provide formulas to estimate the shape and scale parameters of a gamma distribution to fit the distribution of Q. Simulation studies show that the gamma distribution is a good approximation to the distribution of Q. Use of the gamma distribution instead of the chi-square distribution for Q should eliminate inaccurate inferences in assessing homogeneity in a meta-analysis. (A computer program for implementing this test is provided.) This hypothesis test is competitive with the Breslow-Day test both in accuracy of level and in power.
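
    For orientation, a minimal computation of Cochran's Q with the standard chi-square reference criticized above; the gamma-distribution refinement proposed in the paper is not reproduced here, and the per-study effects and variances are invented.

    ```python
    # Cochran's Q for K per-study log odds ratios with inverse-variance weights,
    # referred to chi-square with K-1 degrees of freedom (the standard test).
    import numpy as np
    from scipy.stats import chi2

    log_or = np.array([0.42, 0.10, 0.65, 0.31, 0.55])  # invented study effects
    var = np.array([0.08, 0.12, 0.05, 0.09, 0.07])     # invented study variances

    w = 1.0 / var
    theta_bar = np.sum(w * log_or) / np.sum(w)          # pooled effect
    Q = np.sum(w * (log_or - theta_bar) ** 2)
    p = chi2.sf(Q, df=len(log_or) - 1)                  # the (conservative) reference
    print(f"Q = {Q:.2f}, p = {p:.3f}")
    ```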

  10. Approximate sample size formulas for the two-sample trimmed mean test with unequal variances.

    PubMed

    Luh, Wei-Ming; Guo, Jiin-Huarng

    2007-05-01

    Yuen's two-sample trimmed mean test statistic is one of the most robust methods to apply when variances are heterogeneous. The present study develops formulas for the sample size required by the test. The formulas are applicable in cases of unequal variances, non-normality and unequal sample sizes. Given a specified alpha and power (1-beta), the minimum sample size needed by the proposed formulas under various conditions is less than that given by the conventional formulas. Moreover, given a sample size calculated by the proposed formulas, simulation results show that Yuen's test achieves statistical power generally superior to that of the approximate t test. A numerical example is provided.
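
    Yuen's test itself can be run with SciPy's trimmed t-test (the `trim` argument, available in recent SciPy versions); a sketch on simulated heteroscedastic data, all numbers illustrative.

    ```python
    # Yuen's trimmed-mean test (20% trimming) for two groups with unequal
    # variances and unequal sample sizes; data are simulated.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(2)
    g1 = rng.normal(50, 5, 25)
    g2 = rng.normal(54, 12, 40)   # heavier spread, larger n

    t, p = ttest_ind(g1, g2, equal_var=False, trim=0.2)
    print(f"t = {t:.2f}, p = {p:.3f}")
    ```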

  11. Statistical analysis of time transfer data from Timation 2. [US Naval Observatory and Australia

    NASA Technical Reports Server (NTRS)

    Luck, J. M.; Morgan, P.

    1974-01-01

    Between July 1973 and January 1974, three time transfer experiments using the Timation 2 satellite were conducted to measure time differences between the U.S. Naval Observatory and Australia. Statistical tests showed that the results are unaffected by the satellite's position with respect to the sunrise/sunset line or by its closest approach azimuth at the Australian station. Further tests revealed that forward predictions of time scale differences, based on the measurements, can be made with high confidence.

  12. Probability of identification: a statistical model for the validation of qualitative botanical identification methods.

    PubMed

    LaBudde, Robert A; Harnly, James M

    2012-01-01

    A qualitative botanical identification method (BIM) is an analytical procedure that returns a binary result (1 = Identified, 0 = Not Identified). A BIM may be used by a buyer, manufacturer, or regulator to determine whether a botanical material being tested is the same as the target (desired) material, or whether it contains excessive nontarget (undesirable) material. This report describes the development and validation studies for a BIM based on the proportion of replicates identified, or probability of identification (POI), as the basic observed statistic. The statistical procedures proposed for data analysis follow closely those for the probability of detection, and harmonize the statistical concepts and parameters between quantitative and qualitative method validation. Use of POI statistics also harmonizes statistical concepts for botanical, microbiological, toxin, and other analyte identification methods that produce binary results. The POI statistical model provides a tool for graphical representation of response curves for qualitative methods, reporting of descriptive statistics, and application of performance requirements. Single-collaborator and multicollaborative study examples are given.
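
    A minimal sketch of the basic observed statistic: POI as a proportion of replicates, with a Wilson score interval for its uncertainty. The replicate counts are invented, and the paper's full validation models are not reproduced here.

    ```python
    # POI = proportion of replicate analyses returning "Identified",
    # with a Wilson 95% confidence interval (counts hypothetical).
    from statsmodels.stats.proportion import proportion_confint

    identified, replicates = 11, 12
    poi = identified / replicates
    lo, hi = proportion_confint(identified, replicates, alpha=0.05, method="wilson")
    print(f"POI = {poi:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
    ```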

  13. The epistemological status of general circulation models

    NASA Astrophysics Data System (ADS)

    Loehle, Craig

    2018-03-01

    Forecasts of both likely anthropogenic effects on climate and consequent effects on nature and society are based on large, complex software tools called general circulation models (GCMs). Forecasts generated by GCMs have been used extensively in policy decisions related to climate change. However, the relation between underlying physical theories and results produced by GCMs is unclear. In the case of GCMs, many discretizations and approximations are made, and simulating Earth system processes is far from simple and currently leads to some results with unknown energy balance implications. Statistical testing of GCM forecasts for degree of agreement with data would facilitate assessment of fitness for use. If model results need to be put on an anomaly basis due to model bias, then both visual and quantitative measures of model fit depend strongly on the reference period used for normalization, making testing problematic. Epistemology is here applied to problems of statistical inference during testing, the relationship between the underlying physics and the models, the epistemic meaning of ensemble statistics, problems of spatial and temporal scale, the existence or not of an unforced null for climate fluctuations, the meaning of existing uncertainty estimates, and other issues. Rigorous reasoning entails carefully quantifying levels of uncertainty.

  14. Accounting for Unexpected Test Responses through Examinees' and Their Teachers' Explanations

    ERIC Educational Resources Information Center

    Petridou, Alexandra; Williams, Julian

    2010-01-01

    Researchers have developed indices to identify persons whose test results "misfit" and are considered statistically "aberrant" or "unexpected" and whose measures are consequently potentially invalid, drawing the test's validity into question. This study draws on interviews of pupils and their teachers, using a sample…

  15. Performance statistics of the FORTRAN 4 /H/ library for the IBM system/360

    NASA Technical Reports Server (NTRS)

    Clark, N. A.; Cody, W. J., Jr.; Hillstrom, K. E.; Thieleker, E. A.

    1969-01-01

    Test procedures and results for accuracy and timing tests of the basic IBM 360/50 FORTRAN 4 /H/ subroutine library are reported. The testing was undertaken to verify performance capability and as a prelude to providing some replacement routines of improved performance.

  16. SOCR: Statistics Online Computational Resource

    PubMed Central

    Dinov, Ivo D.

    2011-01-01

    The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result, a number of attempts have been made to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning. PMID:21451741

  17. An evaluation of grease type ball bearing lubricants operating in various environments

    NASA Technical Reports Server (NTRS)

    Mcmurtrey, E. L.

    1981-01-01

    Because many future spacecraft or space stations will require mechanisms to operate for long periods of time in environments which are adverse to most bearing lubricants, a series of tests is continuing to evaluate 38 grease type lubricants in R-4 size bearings in five different environments for a 1 year period. Four repetitions of each test are made to provide statistical samples. These tests were used to select four lubricants for 5 year tests in selected environments with five repetitions of each test for statistical samples. At the present time, 100 test sets are completed and 22 test sets are underway. Three 5 year tests were started in (1) continuous operation and (2) start-stop operation, with both in vacuum at ambient temperatures, and (3) continuous operation at 93.3 C. In the 1 year tests the best results to date in all environments were obtained with a high viscosity index perfluoroalkylpolyether (PFPE) grease.

  18. Appraisal of within- and between-laboratory reproducibility of non-radioisotopic local lymph node assay using flow cytometry, LLNA:BrdU-FCM: comparison of OECD TG429 performance standard and statistical evaluation.

    PubMed

    Yang, Hyeri; Na, Jihye; Jang, Won-Hee; Jung, Mi-Sook; Jeon, Jun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Lim, Kyung-Min; Bae, SeungJin

    2015-05-05

    The mouse local lymph node assay (LLNA, OECD TG429) is an alternative test replacing conventional guinea pig tests (OECD TG406) for skin sensitization testing, but the use of a radioisotopic agent, (3)H-thymidine, deters its active dissemination. A new non-radioisotopic LLNA, LLNA:BrdU-FCM, employs the non-radioisotopic analog 5-bromo-2'-deoxyuridine (BrdU) and flow cytometry. For an analogous method, the OECD TG429 performance standard (PS) advises that two reference compounds be tested repeatedly and that the ECt (threshold) values obtained must fall within acceptable ranges to prove within- and between-laboratory reproducibility. However, these criteria are somewhat arbitrary, and the sample size for ECt is less than 5, raising concerns about insufficient reliability. Here, we explored various statistical methods to evaluate the reproducibility of LLNA:BrdU-FCM using the stimulation index (SI), the raw data for ECt calculation, produced by 3 laboratories. Descriptive statistics along with graphical representation of SI are presented. For inferential statistics, parametric and non-parametric methods were applied to test the reproducibility of the SI of a concurrent positive control, and the robustness of the results was investigated. Descriptive statistics and graphical representation of SI alone could illustrate the within- and between-laboratory reproducibility. Inferential statistics employing parametric and nonparametric methods drew similar conclusions. While all labs passed the within- and between-laboratory reproducibility criteria given by the OECD TG429 PS based on ECt values, statistical evaluation based on SI values showed that only two labs achieved within-laboratory reproducibility. For the two labs that satisfied within-laboratory reproducibility, between-laboratory reproducibility could also be attained based on inferential as well as descriptive statistics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  19. [Again review of research design and statistical methods of Chinese Journal of Cardiology].

    PubMed

    Kong, Qun-yu; Yu, Jin-ming; Jia, Gong-xian; Lin, Fan-li

    2012-11-01

    To re-evaluate the research designs and the use of statistical methods in the Chinese Journal of Cardiology. We summarized the research designs and statistical methods in all original papers published in the Chinese Journal of Cardiology in 2011 and compared the results with the 2008 evaluation. (1) There was no difference between the two volumes in the distribution of research designs. Compared with the earlier volume, the use of survival regression and non-parametric tests increased, while the proportion of articles with no statistical analysis decreased. (2) The proportions of flawed articles in the later volume were significantly lower than in the earlier one: 6 (4%) with flaws in design, 5 (3%) with flaws in expression, and 9 (5%) with incomplete analysis. (3) The rates of correct use of variance analysis, multi-group comparisons and tests of normality increased. The rate of statistical misuse changed (17% vs. 25%), but the difference was not statistically significant, largely owing to neglect of tests for homogeneity of variance. The Chinese Journal of Cardiology shows many improvements, such as better regulation of design and statistics; homogeneity of variance should receive more attention in future work.

  20. A Multiple Testing of the ABC Method and the Development of a Second-Generation Model. Part II, Test Results and an Analysis of Recall Ratio.

    ERIC Educational Resources Information Center

    Altmann, Berthold

    After a brief summary of the test program (described more fully in LI 000 318), the statistical results, tabulated as overall "ABC (Approach by Concept) relevance ratios" and "ABC recall figures," are presented and reviewed. An abstract model developed in accordance with Max Weber's "Idealtypus" ("Die Objektivitaet…

  1. Endurance and failure characteristics of modified Vasco X-2, CBS 600 and AISI 9310 spur gears. [aircraft construction materials

    NASA Technical Reports Server (NTRS)

    Townsend, D. P.; Zaretsky, E. V.

    1980-01-01

    Gear endurance tests and rolling-element fatigue tests were conducted to compare the performance of spur gears made from AISI 9310, CBS 600 and modified Vasco X-2 and to compare the pitting fatigue lives of these three materials. Gears manufactured from CBS 600 exhibited lives longer than those manufactured from AISI 9310. However, rolling-element fatigue tests resulted in statistically equivalent lives. Modified Vasco X-2 exhibited statistically equivalent lives to AISI 9310. CBS 600 and modified Vasco X-2 gears exhibited the potential of tooth fracture occurring at a tooth surface fatigue pit. Case carburization of all gear surfaces for the modified Vasco X-2 gears results in fracture at the tips of the gears.

  2. An Extension of the Chi-Square Procedure for Non-NORMAL Statistics, with Application to Solar Neutrino Data

    NASA Astrophysics Data System (ADS)

    Sturrock, P. A.

    2008-01-01

    Using the chi-square statistic, one may conveniently test whether a series of measurements of a variable are consistent with a constant value. However, that test is predicated on the assumption that the appropriate probability distribution function (pdf) is normal in form. This requirement is usually not satisfied by experimental measurements of the solar neutrino flux. This article presents an extension of the chi-square procedure that is valid for any form of the pdf. This procedure is applied to the GALLEX-GNO dataset, and it is shown that the results are in good agreement with the results of Monte Carlo simulations. Whereas application of the standard chi-square test to symmetrized data yields evidence significant at the 1% level for variability of the solar neutrino flux, application of the extended chi-square test to the unsymmetrized data yields only weak evidence (significant at the 4% level) of variability.

  3. Statistical Analysis of Zebrafish Locomotor Response.

    PubMed

    Liu, Yiwen; Carmer, Robert; Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai

    2015-01-01

    Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling's T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling's T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure.
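
    A from-scratch sketch of a two-sample Hotelling's T-squared test of the kind used above, applied to simulated activity matrices (rows are larvae, columns are time points); this is a generic implementation, not the authors' code.

    ```python
    # Two-sample Hotelling's T-squared test with the standard F transformation.
    import numpy as np
    from scipy.stats import f as f_dist

    def hotelling_t2(x, y):
        n1, p = x.shape
        n2, _ = y.shape
        d = x.mean(axis=0) - y.mean(axis=0)
        # pooled covariance of the two samples
        s = ((n1 - 1) * np.cov(x, rowvar=False) +
             (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
        t2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(s, d)
        # convert T^2 to an F statistic
        f_stat = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
        p_val = f_dist.sf(f_stat, p, n1 + n2 - p - 1)
        return t2, p_val

    rng = np.random.default_rng(3)
    strain_a = rng.normal(1.0, 0.3, size=(24, 5))   # simulated: 24 larvae, 5 time points
    strain_b = rng.normal(1.3, 0.3, size=(24, 5))
    print(hotelling_t2(strain_a, strain_b))
    ```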

  4. Statistical Analysis of Zebrafish Locomotor Response

    PubMed Central

    Zhang, Gaonan; Venkatraman, Prahatha; Brown, Skye Ashton; Pang, Chi-Pui; Zhang, Mingzhi; Ma, Ping; Leung, Yuk Fai

    2015-01-01

    Zebrafish larvae display rich locomotor behaviour upon external stimulation. The movement can be simultaneously tracked from many larvae arranged in multi-well plates. The resulting time-series locomotor data have been used to reveal new insights into neurobiology and pharmacology. However, the data are of large scale, and the corresponding locomotor behavior is affected by multiple factors. These issues pose a statistical challenge for comparing larval activities. To address this gap, this study has analyzed a visually-driven locomotor behaviour named the visual motor response (VMR) by the Hotelling’s T-squared test. This test is congruent with comparing locomotor profiles from a time period. Different wild-type (WT) strains were compared using the test, which shows that they responded differently to light change at different developmental stages. The performance of this test was evaluated by a power analysis, which shows that the test was sensitive for detecting differences between experimental groups with sample numbers that were commonly used in various studies. In addition, this study investigated the effects of various factors that might affect the VMR by multivariate analysis of variance (MANOVA). The results indicate that the larval activity was generally affected by stage, light stimulus, their interaction, and location in the plate. Nonetheless, different factors affected larval activity differently over time, as indicated by a dynamical analysis of the activity at each second. Intriguingly, this analysis also shows that biological and technical repeats had negligible effect on larval activity. This finding is consistent with that from the Hotelling’s T-squared test, and suggests that experimental repeats can be combined to enhance statistical power. Together, these investigations have established a statistical framework for analyzing VMR data, a framework that should be generally applicable to other locomotor data with similar structure. PMID:26437184

  5. False-positive results in pharmacoepidemiology and pharmacovigilance.

    PubMed

    Bezin, Julien; Bosco-Levy, Pauline; Pariente, Antoine

    2017-09-01

    False-positive results constitute an important issue in scientific research. In the domain of drug evaluation, they affect all phases of drug development and assessment, from the very early preclinical studies to the late post-marketing evaluations. The core concern associated with false-positives is the lack of replicability of results. Aside from fraud or misconduct, false-positives are often envisioned from the statistical angle, which considers them a price to pay for type I error in statistical testing, and for its inflation in the context of multiple testing. When considering this problem in pharmacoepidemiology and pharmacovigilance, however, which both evaluate drugs in observational settings, the information brought by statistical testing and its significance should only be considered supplementary to the estimates provided and their confidence intervals, in a context where differences must above all be clinically meaningful and the results robust to the biases likely to have affected the studies. In the following article, we consequently illustrate these biases and their consequences in generating false-positive results, through studies of associations between drug use and health outcomes that have been widely disputed. Copyright © 2017 Société française de pharmacologie et de thérapeutique. Published by Elsevier Masson SAS. All rights reserved.

  6. A Framework for Establishing Standard Reference Scale of Texture by Multivariate Statistical Analysis Based on Instrumental Measurement and Sensory Evaluation.

    PubMed

    Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye

    2016-01-13

    A framework for establishing a standard reference scale for texture is proposed, based on multivariate statistical analysis of instrumental measurements and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The soundness of the framework is verified by establishing a standard reference scale for the texture attribute of hardness using well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with cluster analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.

  7. The effects of compensatory workplace exercises to reduce work-related stress and musculoskeletal pain

    PubMed Central

    de Freitas-Swerts, Fabiana Cristina Taubert; Robazzi, Maria Lúcia do Carmo Cruz

    2014-01-01

    OBJECTIVES: to assess the effect of a compensatory workplace exercise program on workers, with the purpose of reducing work-related stress and musculoskeletal pain. METHOD: quasi-experimental research with quantitative analysis of the data, involving 30 administrative workers from a Higher Education Public Institution. For data collection, questionnaires were used to characterize the workers, as well as the Workplace Stress Scale and the Corlett Diagram. The research took place in three stages: first, a pre-test with the application of the questionnaires to the subjects; second, Workplace Exercise taking place twice a week, for 15 minutes, over a period of 10 weeks; third, a post-test in which the subjects answered the questionnaires again. For data analysis, descriptive statistics and non-parametric statistics (the Wilcoxon test) were used. RESULTS: work-related stress was present in the assessed workers, but there was no statistically significant reduction in the scores after undergoing Workplace Exercise. However, there was a statistically significant pain reduction in the neck, cervical, upper, middle and lower back, right thigh, left leg, right ankle and feet. CONCLUSION: the Workplace Exercise promoted a significant pain reduction in the spine, but did not result in a significant reduction in the levels of work-related stress. PMID:25296147

  8. Implementing statistical analysis in multi-channel acoustic impact-echo testing of concrete bridge decks: Determining thresholds for delamination detection

    NASA Astrophysics Data System (ADS)

    Hendricks, Lorin; Spencer Guthrie, W.; Mazzeo, Brian

    2018-04-01

    An automated acoustic impact-echo testing device with seven channels has been developed for faster surveying of bridge decks. Due to potential variations in bridge deck overlay thickness, varying conditions between testing passes, and occasional imprecise equipment calibrations, a method that can account for variations in deck properties and testing conditions was necessary to correctly interpret the acoustic data. A new methodology involving statistical analyses was therefore developed. After acoustic impact-echo data are collected and analyzed, the results are normalized by the median for each channel, a Gaussian distribution is fit to the histogram of the data, and the Kullback-Leibler divergence test or Otsu's method is then used to determine the optimum threshold for differentiating between intact and delaminated concrete. The new methodology was successfully applied to individual channels of previously unusable acoustic impact-echo data obtained from a three-lane interstate bridge deck surfaced with a polymer overlay, and the resulting delamination map compared very favorably with the results of a manual deck sounding survey.
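
    A hedged sketch of the thresholding step described above: median normalization per channel followed by Otsu's method (here via scikit-image) on simulated bimodal feature values; real impact-echo features would come from the acoustic analysis stage, not this toy generator.

    ```python
    # Median-normalize a channel's feature values, then pick an intact/delaminated
    # threshold with Otsu's method. Data are simulated, not bridge-deck recordings.
    import numpy as np
    from skimage.filters import threshold_otsu

    rng = np.random.default_rng(4)
    # bimodal features: most points "intact", a minority "delaminated"
    channel = np.concatenate([rng.normal(1.0, 0.10, 900),
                              rng.normal(1.8, 0.15, 100)])
    normalized = channel / np.median(channel)   # per-channel median normalization

    thresh = threshold_otsu(normalized)
    flagged = int(np.sum(normalized > thresh))
    print(f"Otsu threshold = {thresh:.2f}; {flagged} points flagged as delaminated")
    ```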

  9. The compartment bag test (CBT) for enumerating fecal indicator bacteria: Basis for design and interpretation of results.

    PubMed

    Gronewold, Andrew D; Sobsey, Mark D; McMahan, Lanakila

    2017-06-01

    For the past several years, the compartment bag test (CBT) has been employed in water quality monitoring and public health protection around the world. To date, however, the statistical basis for the design and recommended procedures for enumerating fecal indicator bacteria (FIB) concentrations from CBT results have not been formally documented. Here, we provide that documentation following protocols for communicating the evolution of similar water quality testing procedures. We begin with an overview of the statistical theory behind the CBT, followed by a description of how that theory was applied to determine an optimal CBT design. We then provide recommendations for interpreting CBT results, including procedures for estimating quantiles of the FIB concentration probability distribution, and the confidence of compliance with recognized water quality guidelines. We synthesize these values in custom user-oriented 'look-up' tables similar to those developed for other FIB water quality testing methods. Modified versions of our tables are currently distributed commercially as part of the CBT testing kit. Published by Elsevier B.V.
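
    The statistical basis of compartmentalized presence/absence tests of this kind is most-probable-number (MPN) estimation; the sketch below shows the underlying maximum-likelihood computation under a Poisson dilution model. The compartment volumes and positive/negative pattern are hypothetical and should not be taken as the actual CBT design.

    ```python
    # MPN estimation: maximize the Poisson-dilution likelihood of the observed
    # positive/negative compartment pattern over the FIB concentration.
    import numpy as np
    from scipy.optimize import minimize_scalar

    volumes = np.array([1.0, 3.0, 10.0, 30.0, 56.0])   # mL per compartment (hypothetical)
    positive = np.array([0, 0, 1, 1, 1])               # 1 = growth observed (hypothetical)

    def neg_log_lik(conc):  # conc in organisms per mL
        p_pos = 1.0 - np.exp(-conc * volumes)
        p = np.where(positive == 1, p_pos, 1.0 - p_pos)
        return -np.sum(np.log(np.clip(p, 1e-300, None)))

    res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
    print(f"MPN estimate = {res.x:.3f} organisms/mL")
    ```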

  10. Acceptability of HIV/AIDS testing among pre-marital couples in Iran (2012)

    PubMed Central

    Ayatollahi, Jamshid; Nasab Sarab, Mohammad Ali Bagheri; Sharifi, Mohammad Reza; Shahcheraghi, Seyed Hossein

    2014-01-01

    Background: Human immunodeficiency virus (HIV)/acquired immune deficiency syndrome (AIDS) is a lifestyle-related disease. The disease is transmitted through unprotected sex, contaminated needles, infected blood transfusion, and from mother to child during pregnancy, delivery and breastfeeding. Prevention of infection with HIV, mainly through safe sex and needle exchange programmes, is a solution to prevent the spread of the disease. Knowledge of one's HIV status helps to prevent transmission and subsequently reduce harm to the next generation. The purpose of this study was to assess the willingness of couples referred to the family planning premarital counselling centre in Yazd to undergo HIV testing before marriage. Patients and Methods: In this descriptive study, a simple random sample was drawn from people referred to the Akbari clinic. The couples were 1000 men and 1000 women referred to the premarital counselling centre for pre-marital HIV testing in Yazd in the year 2012. The data were analyzed using Statistical Package for the Social Sciences (SPSS) software and the chi-square test. Results: There was a statistically significant difference between age groups in willingness to undergo HIV testing before marriage (P < 0.001), and also in positive views on HIV testing of asymptomatic individuals (P < 0.001). The study also showed a statistically significant difference between the two gender groups in willingness to proceed with marriage if the partner tested HIV positive. Conclusion: The willingness of couples to undergo HIV testing before marriage was considerable; therefore, HIV testing before marriage as a routine test is suggested. PMID:25114363

  11. The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bihn T. Pham; Jeffrey J. Einerson

    2010-06-01

    This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.

  12. New robust statistical procedures for the polytomous logistic regression models.

    PubMed

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article are further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  13. Single-variant and multi-variant trend tests for genetic association with next-generation sequencing that are robust to sequencing error.

    PubMed

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Alejandro Q; Musolf, Anthony; Matise, Tara C; Finch, Stephen J; Gordon, Derek

    2012-01-01

    As with any new technology, next-generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to those data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single-variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p value, no matter how many loci. Copyright © 2013 S. Karger AG, Basel.

  14. Single variant and multi-variant trend tests for genetic association with next generation sequencing that are robust to sequencing error

    PubMed Central

    Kim, Wonkuk; Londono, Douglas; Zhou, Lisheng; Xing, Jinchuan; Nato, Andrew; Musolf, Anthony; Matise, Tara C.; Finch, Stephen J.; Gordon, Derek

    2013-01-01

    As with any new technology, next generation sequencing (NGS) has potential advantages and potential challenges. One advantage is the identification of multiple causal variants for disease that might otherwise be missed by SNP-chip technology. One potential challenge is misclassification error (as with any emerging technology) and the issue of power loss due to multiple testing. Here, we develop an extension of the linear trend test for association that incorporates differential misclassification error and may be applied to any number of SNPs. We call the statistic the linear trend test allowing for error, applied to NGS, or LTTae,NGS. This statistic allows for differential misclassification. The observed data are phenotypes for unrelated cases and controls, coverage, and the number of putative causal variants for every individual at all SNPs. We simulate data considering multiple factors (disease mode of inheritance, genotype relative risk, causal variant frequency, sequence error rate in cases, sequence error rate in controls, number of loci, and others) and evaluate type I error rate and power for each vector of factor settings. We compare our results with two recently published NGS statistics. Also, we create a fictitious disease model, based on downloaded 1000 Genomes data for 5 SNPs and 388 individuals, and apply our statistic to that data. We find that the LTTae,NGS maintains the correct type I error rate in all simulations (differential and non-differential error), while the other statistics show large inflation in type I error for lower coverage. Power for all three methods is approximately the same for all three statistics in the presence of non-differential error. Application of our statistic to the 1000 Genomes data suggests that, for the data downloaded, there is a 1.5% sequence misclassification rate over all SNPs. Finally, application of the multi-variant form of LTTae,NGS shows high power for a number of simulation settings, although it can have lower power than the corresponding single variant simulation results, most probably due to our specification of multi-variant SNP correlation values. In conclusion, our LTTae,NGS addresses two key challenges with NGS disease studies; first, it allows for differential misclassification when computing the statistic; and second, it addresses the multiple-testing issue in that there is a multi-variant form of the statistic that has only one degree of freedom, and provides a single p-value, no matter how many loci. PMID:23594495

  15. Replication Unreliability in Psychology: Elusive Phenomena or “Elusive” Statistical Power?

    PubMed Central

    Tressoldi, Patrizio E.

    2012-01-01

    The focus of this paper is to analyze whether the unreliability of results related to certain controversial psychological phenomena may be a consequence of their low statistical power. Under Null Hypothesis Statistical Testing (NHST), still the most widely used statistical approach, unreliability derives from the failure to reject the null hypothesis, in particular when exact or quasi-exact replications of experiments are carried out. Taking as examples the results of meta-analyses related to four different controversial phenomena (subliminal semantic priming, incubation effects in problem solving, unconscious thought theory, and non-local perception), it was found that, except for semantic priming on categorization, the statistical power to detect the expected effect size (ES) of the typical study is low or very low. The low power in most studies undermines the use of NHST to study phenomena with moderate or low ESs. We conclude by providing some suggestions on how to increase statistical power or use different statistical approaches to help discriminate whether the results obtained may or may not be used to support or refute the reality of a phenomenon with a small ES. PMID:22783215
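
    The power problem is easy to make concrete with a standard power calculation; a sketch using statsmodels, with illustrative sample sizes and a small effect (Cohen's d = 0.2).

    ```python
    # Power of a two-sample t test for a small effect at various per-group n.
    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for n in (20, 50, 200, 394):
        power = analysis.power(effect_size=0.2, nobs1=n, alpha=0.05)
        print(f"n per group = {n:4d}  power = {power:.2f}")
    # Roughly n = 394 per group is needed for .80 power at d = 0.2, which is
    # why exact replications of small-ES studies so often "fail".
    ```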

  16. Terrestrial photovoltaic cell process testing

    NASA Technical Reports Server (NTRS)

    Burger, D. R.

    1985-01-01

    The paper examines critical test parameters, criteria for selecting appropriate tests, and the use of statistical controls and test patterns to enhance PV-cell process test results. The coverage of critical test parameters is evaluated by examining available test methods and then screening these methods by considering the ability to measure those critical parameters which are most affected by the generic process, the cost of the test equipment and test performance, and the feasibility for process testing.

  17. Terrestrial photovoltaic cell process testing

    NASA Astrophysics Data System (ADS)

    Burger, D. R.

    The paper examines critical test parameters, criteria for selecting appropriate tests, and the use of statistical controls and test patterns to enhance PV-cell process test results. The coverage of critical test parameters is evaluated by examining available test methods and then screening these methods by considering the ability to measure those critical parameters which are most affected by the generic process, the cost of the test equipment and test performance, and the feasibility for process testing.

  18. Statistical equilibrium in cometary C2. II - Swan/Phillips band ratios

    NASA Technical Reports Server (NTRS)

    Swamy, K. S. K.; Odell, C. R.

    1979-01-01

    Statistical equilibrium calculations have been made for both the triplet states and the ground-state singlets of C2 in comets, using the exchange rate as a free parameter. The predictions are consistent with optical observations and may be tested definitively by accurate observations of the Phillips and Swan band ratios. Comparison with the one reported observation indicates compatibility with a low exchange rate and resonance fluorescence statistical equilibrium.

  19. Evaluation of Cepstrum Algorithm with Impact Seeded Fault Data of Helicopter Oil Cooler Fan Bearings and Machine Fault Simulator Data

    DTIC Science & Technology

    2013-02-01

    of a bearing must be put into practice. There are many potential methods, the most traditional being the use of statistical time-domain features...accelerate degradation to test multiple bearings to gain statistical relevance and extrapolate results to scale for field conditions. Temperature...as time statistics, frequency estimation to improve fault frequency detection. For future investigations, one can further explore the

  20. Comparative evaluation of the results of three techniques in the reconstruction of the anterior cruciate ligament, with a minimum follow-up of two years.

    PubMed

    Cury, Ricardo de Paula Leite; Sprey, Jan Willem Cerf; Bragatto, André Luiz Lima; Mansano, Marcelo Valentim; Moscovici, Herman Fabian; Guglielmetti, Luiz Gabriel Betoni

    2017-01-01

    To compare the clinical results of the reconstruction of the anterior cruciate ligament by transtibial, transportal, and outside-in techniques. This was a retrospective study on 90 patients (ACL reconstruction with autologous flexor tendons) operated between August 2009 and June 2012, by the medial transportal (30), transtibial (30), and "outside-in" (30) techniques. The following parameters were assessed: objective and subjective IKDC, Lysholm, KT1000, Lachman test, Pivot-Shift and anterior drawer test. On physical examination, the Lachman test and Pivot-Shift indicated a slight superiority of the outside-in technique, but without statistical significance (p = 0.132 and p = 0.186, respectively). The anterior drawer, KT1000, subjective IKDC, Lysholm, and objective IKDC tests showed similar results in the groups studied. A higher number of complications were observed in the medial transportal technique (p = 0.033). There were no statistically significant differences in the clinical results of patients undergoing reconstruction of the anterior cruciate ligament by transtibial, medial transportal, and outside-in techniques.

  1. Quality of reporting statistics in two Indian pharmacology journals.

    PubMed

    Jaykaran; Yadav, Preeti

    2011-04-01

    To evaluate the reporting of statistical methods in articles published in two Indian pharmacology journals. All original articles published since 2002 were downloaded from the journals' (Indian Journal of Pharmacology (IJP) and Indian Journal of Physiology and Pharmacology (IJPP)) websites. These articles were evaluated on the basis of the appropriateness of descriptive statistics and inferential statistics. Descriptive statistics were evaluated on the basis of reporting of the method of description and central tendencies. Inferential statistics were evaluated on the basis of fulfilment of the assumptions of statistical methods and appropriateness of statistical tests. Values are described as frequencies, percentages, and 95% confidence intervals (CI) around the percentages. Inappropriate descriptive statistics were observed in 150 (78.1%, 95% CI 71.7-83.3%) articles. The most common reason for inappropriate descriptive statistics was the use of mean ± SEM in place of "mean (SD)" or "mean ± SD." The most common statistical method used was one-way ANOVA (58.4%). Information regarding checking of the assumptions of statistical tests was mentioned in only two articles. An inappropriate statistical test was observed in 61 (31.7%, 95% CI 25.6-38.6%) articles. The most common reason for an inappropriate statistical test was the use of a two-group test for three or more groups. Articles published in the two Indian pharmacology journals are not devoid of statistical errors.
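
    The SD-versus-SEM confusion flagged above is worth making concrete; a short illustration on simulated data.

    ```python
    # SD describes the spread of the data; SEM describes the precision of the
    # mean and shrinks with sample size. Reporting SEM as if it were SD
    # understates variability. Simulated data for illustration.
    import numpy as np

    rng = np.random.default_rng(5)
    x = rng.normal(100, 15, size=60)
    sd = x.std(ddof=1)
    sem = sd / np.sqrt(len(x))
    print(f"mean = {x.mean():.1f}, SD = {sd:.1f}, SEM = {sem:.1f}")
    ```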

  2. LandScape: a simple method to aggregate p-values and other stochastic variables without a priori grouping.

    PubMed

    Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob

    2016-08-01

    In many areas of science it is customary to perform many, potentially millions of, tests simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria, and the results might depend on the chosen criteria. Methods that summarize, or aggregate, test statistics or p-values without relying on a priori criteria are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the FWER is controlled in the strong sense. Validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures relying on evaluation of test statistics individually. Moreover, by being agnostic and not relying on predefined regions, it might be a practical alternative to conventionally used methods of aggregating p-values over regions. The method is implemented in Python and freely available online (through GitHub; see the Supplementary information).
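
    For contrast, the conventional a priori grouping the authors aim to avoid can be sketched as fixed windows combined with Fisher's method (via SciPy); the LandScape aggregation itself is not reproduced here, and the p-values are simulated.

    ```python
    # Combine p-values in fixed sliding windows with Fisher's method and apply
    # a Bonferroni correction across windows; data are simulated.
    import numpy as np
    from scipy.stats import combine_pvalues

    rng = np.random.default_rng(6)
    pvals = rng.uniform(size=1000)
    pvals[480:500] = rng.uniform(0, 0.01, size=20)   # a buried signal region

    window = 50
    n_windows = len(pvals) // window
    for start in range(0, len(pvals), window):
        stat, p = combine_pvalues(pvals[start:start + window], method="fisher")
        if p < 0.05 / n_windows:                     # Bonferroni across windows
            print(f"window {start}-{start + window}: combined p = {p:.2e}")
    ```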

  3. Visual-Motor Test Performance: Race and Achievement Variables.

    ERIC Educational Resources Information Center

    Fuller, Gerald B.; Friedrich, Douglas

    1979-01-01

    Rural Black and White children of variant academic achievement were tested on the Minnesota Percepto-Diagnostic Test, which consists of six gestalt designs for the subject to copy. Analyses resulted only in a significant achievement effect; when intellectual level was statistically controlled, race was not a significant variable. (Editor/SJL)

  4. Plant selection for ethnobotanical uses on the Amalfi Coast (Southern Italy).

    PubMed

    Savo, V; Joy, R; Caneva, G; McClatchey, W C

    2015-07-15

    Many ethnobotanical studies have investigated selection criteria for medicinal and non-medicinal plants. In this paper we test several statistical methods on different ethnobotanical datasets in order to 1) define to what extent the nature of the datasets can affect the interpretation of results, and 2) determine whether the selection of plants for different uses is based on phylogeny or on other selection criteria. We considered three ethnobotanical datasets (two datasets of medicinal plants and one of non-medicinal plants, covering handicraft production and domestic and agro-pastoral practices) and two floras of the Amalfi Coast. We performed residual analysis from linear regression, the binomial test and a Bayesian approach to identify under-used and over-used plant families within the ethnobotanical datasets. Percentages of agreement were calculated to compare the results of the analyses. We also analyzed the relationship between plant selection and phylogeny, chorology, life form and habitat using the chi-square test. Pearson's residuals for each of the significant chi-square analyses were examined to investigate alternative hypotheses about plant selection criteria. The results of the three statistical methods differed within the same dataset and between datasets and floras, but with some similarities. In the two medicinal datasets, only Lamiaceae was identified in both floras as an over-used family by all three statistical methods. All statistical methods in one flora agreed that Malvaceae was over-used and Poaceae under-used, but this was not consistent with the results for the second flora, in which one statistical result was non-significant. All other families showed some discrepancy in significance across methods or floras. Significant over- or under-use was observed in only a minority of cases. The chi-square analyses were significant for phylogeny, life form and habitat. Pearson's residuals indicated a non-random selection of woody species for non-medicinal uses and an under-use of plants of temperate forests for medicinal uses. Our study showed that selection criteria for plant uses (including medicinal ones) are not always based on phylogeny. The comparison of the different statistical methods (regression, binomial and Bayesian) under different conditions led to the conclusion that the most conservative results are obtained using regression analysis.
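
    The binomial test for over- and under-used families can be sketched as follows; the counts are hypothetical, not the paper's data, and SciPy's binomtest (SciPy >= 1.7) is assumed. The null hypothesis is that a family contributes medicinal species in proportion to its share of the flora.

    ```python
    # Sketch: does a family contribute more medicinal species than its share
    # of the flora would predict? All counts below are hypothetical.
    from scipy.stats import binomtest

    n_used_in_family = 18     # medicinal species in the family (hypothetical)
    n_used_total = 300        # medicinal species in the whole dataset
    family_share = 45 / 1500  # family's fraction of species in the flora

    res = binomtest(n_used_in_family, n=n_used_total, p=family_share,
                    alternative="greater")  # one-sided test for over-use
    print(res.pvalue)  # small p-value suggests the family is over-used
    ```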

  5. Data processing of qualitative results from an interlaboratory comparison for the detection of “Flavescence dorée” phytoplasma: How the use of statistics can improve the reliability of the method validation process in plant pathology

    PubMed Central

    Renaudin, Isabelle; Poliakoff, Françoise

    2017-01-01

    A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of “Flavescence dorée” (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study in which each laboratory analyzed the same panel of samples, consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. First, the standard statistical approach consisted of analyzing samples known to be positive and samples known to be negative and reporting the proportions of false-positive and false-negative results to calculate diagnostic specificity and sensitivity, respectively. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods, based on the notions of accordance and concordance. Other new approaches were also implemented, based on the probability of detection model on the one hand and on Bayes’ theorem on the other. These various statistical approaches are complementary and give consistent results. Their combination, and in particular the introduction of the new statistical approaches, gives overall information on the performance and limitations of the different methods, and is particularly useful for selecting the most appropriate detection scheme with regard to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6, developed respectively by Hren (2007), by Pelletier (2009), and with patented oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. The statistical tools presented in this paper, and their combination, can be applied to many other studies concerning plant pathogens and to other disciplines that use qualitative detection methods. PMID:28384335

  6. Data processing of qualitative results from an interlaboratory comparison for the detection of "Flavescence dorée" phytoplasma: How the use of statistics can improve the reliability of the method validation process in plant pathology.

    PubMed

    Chabirand, Aude; Loiseau, Marianne; Renaudin, Isabelle; Poliakoff, Françoise

    2017-01-01

    A working group established in the framework of the EUPHRESCO European collaborative project aimed to compare and validate diagnostic protocols for the detection of "Flavescence dorée" (FD) phytoplasma in grapevines. Seven molecular protocols were compared in an interlaboratory test performance study in which each laboratory analyzed the same panel of samples, consisting of DNA extracts prepared by the organizing laboratory. The tested molecular methods consisted of universal and group-specific real-time and end-point nested PCR tests. Different statistical approaches were applied to this collaborative study. First, the standard statistical approach consisted of analyzing samples known to be positive and samples known to be negative and reporting the proportions of false-positive and false-negative results to calculate diagnostic specificity and sensitivity, respectively. This approach was supplemented by the calculation of repeatability and reproducibility for qualitative methods, based on the notions of accordance and concordance. Other new approaches were also implemented, based on the probability of detection model on the one hand and on Bayes' theorem on the other. These various statistical approaches are complementary and give consistent results. Their combination, and in particular the introduction of the new statistical approaches, gives overall information on the performance and limitations of the different methods, and is particularly useful for selecting the most appropriate detection scheme with regard to the prevalence of the pathogen. Three real-time PCR protocols (methods M4, M5 and M6, developed respectively by Hren (2007), by Pelletier (2009), and with patented oligonucleotides) achieved the highest levels of performance for FD phytoplasma detection. This paper also addresses the issue of indeterminate results and the identification of outlier results. The statistical tools presented in this paper, and their combination, can be applied to many other studies concerning plant pathogens and to other disciplines that use qualitative detection methods.

  7. Comparison of estrogen receptor results from pathology reports with results from central laboratory testing.

    PubMed

    Collins, Laura C; Marotti, Jonathan D; Baer, Heather J; Tamimi, Rulla M

    2008-02-06

    We compared estrogen receptor (ER) assay results abstracted from pathology reports with ER results determined on the same specimens by a central laboratory with an immunohistochemical assay. Paraffin sections were cut from tissue microarrays containing 3093 breast cancer specimens from women enrolled in the Nurses' Health Study, 1851 of which had both pathology reports and tissue available for central laboratory testing. All sections were immunostained for ER at the same time. The original assays were biochemical for 1512 (81.7%) of the 1851 specimens, immunohistochemical for 336 (18.2%), and immunofluorescent for three (0.2%). ER results from pathology reports and repeat central laboratory testing were in agreement for 87.3% of specimens (1615 of the 1851 specimens; kappa statistic = 0.64, P < .001). When the comparison was restricted to the specimens for which the ER assays were originally performed by immunohistochemistry, the agreement rate increased to 92.3% of specimens (310 of the 336 specimens; kappa statistic = 0.78, P < .001). Thus, ER assay results from pathology reports appear to be a reasonable alternative to central laboratory ER testing for large, population-based studies of patients with breast cancer.
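
    A minimal sketch of the agreement measure reported above, Cohen's kappa for paired categorical ER calls; the labels are hypothetical and scikit-learn's implementation is assumed.

    ```python
    # Sketch: chance-corrected agreement between two raters of the same
    # specimens (pathology report vs. central laboratory). Labels hypothetical.
    from sklearn.metrics import cohen_kappa_score

    report_er  = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
    central_er = ["pos", "neg", "neg", "neg", "pos", "neg", "pos", "pos"]

    # kappa = 1 is perfect agreement; 0 is agreement no better than chance
    print(cohen_kappa_score(report_er, central_er))
    ```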

  8. A Method for Assessing Change in Attitude: The McNemar Test.

    ERIC Educational Resources Information Center

    Ciechalski, Joseph C.; Pinkney, James W.; Weaver, Florence S.

    This paper illustrates the use of the McNemar Test, using a hypothetical problem. The McNemar Test is a nonparametric statistical test that is a type of chi square test using dependent, rather than independent, samples to assess before-after designs in which each subject is used as his or her own control. Results of the McNemar test make it…
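
    A minimal sketch of the McNemar test for such a before-after design, assuming a hypothetical 2x2 table of paired yes/no responses and the statsmodels implementation:

    ```python
    # Sketch: each subject serves as their own control; the test looks only
    # at the discordant pairs (off-diagonal cells). Counts are hypothetical.
    from statsmodels.stats.contingency_tables import mcnemar

    # rows: before (yes/no); columns: after (yes/no)
    table = [[30, 12],
             [5, 23]]

    result = mcnemar(table, exact=True)  # exact binomial test on discordant pairs
    print(result.statistic, result.pvalue)
    ```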

  9. C3H7NO2S effect on concrete steel-rebar corrosion in 0.5 M H2SO4 simulating industrial/microbial environment

    NASA Astrophysics Data System (ADS)

    Okeniyi, Joshua Olusegun; Nwadialo, Christopher Chukwuweike; Olu-Steven, Folusho Emmanuel; Ebinne, Samaru Smart; Coker, Taiwo Ebenezer; Okeniyi, Elizabeth Toyin; Ogbiye, Adebanji Samuel; Durotoye, Taiwo Omowunmi; Badmus, Emmanuel Omotunde Oluwasogo

    2017-02-01

    This paper investigates the effect of C3H7NO2S (cysteine) on the inhibition of reinforcing steel corrosion in concrete immersed in 0.5 M H2SO4, simulating an industrial/microbial environment. Different C3H7NO2S concentrations were admixed, in duplicate, in steel-reinforced concrete samples that were partially immersed in the acidic sulphate environment. Electrochemical monitoring techniques of open circuit potential, as per ASTM C876-91 R99, and corrosion rate, by linear polarization resistance, were then employed to study the anticorrosion effect of the organic admixture on the steel-reinforced concrete samples. Analyses of electrochemical test data followed ASTM G16-95 R04 prescriptions, including probability distribution modeling with significance testing by the Kolmogorov-Smirnov and Student's t-test statistics. Results established that all datasets of corrosion potential followed the Normal, Gumbel and Weibull distributions, but that only the Weibull model described all the corrosion rate datasets in the study, as per the Kolmogorov-Smirnov test statistics. Results of the Student's t-test showed that differences in corrosion test data between duplicate samples with the same C3H7NO2S concentration were not statistically significant. These results indicated that 0.06878 M C3H7NO2S exhibited an optimal inhibition efficiency of η = 90.52±1.29% on reinforcing steel corrosion in the concrete samples immersed in 0.5 M H2SO4, simulating an industrial/microbial service environment.
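
    The Kolmogorov-Smirnov goodness-of-fit step described above can be sketched as follows; the corrosion-rate values are hypothetical. Note that estimating the Weibull parameters from the same data makes the standard K-S p-value only approximate (and somewhat anti-conservative).

    ```python
    # Sketch: fit a Weibull distribution to the data, then test the fit with
    # the one-sample Kolmogorov-Smirnov statistic. Data are hypothetical.
    import numpy as np
    from scipy import stats

    rates = np.array([0.12, 0.18, 0.15, 0.22, 0.10, 0.17, 0.20, 0.14])

    shape, loc, scale = stats.weibull_min.fit(rates, floc=0)  # location fixed at 0
    stat, p = stats.kstest(rates, "weibull_min", args=(shape, loc, scale))
    print(stat, p)  # a large p-value gives no evidence against the Weibull model
    ```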

  10. Quantifying the evolution of flow boiling bubbles by statistical testing and image analysis: toward a general model.

    PubMed

    Xiao, Qingtai; Xu, Jianxin; Wang, Hua

    2016-08-16

    A new index, the estimate of the error variance, which can be used to quantify the evolution of flow patterns when multiphase components or tracers are difficult to distinguish, was proposed. The homogeneity degree of the luminance space distribution behind the viewing windows in the direct contact boiling heat transfer process was explored. Using image analysis and a linear statistical model, the F-test was used to test whether the light was uniform, and a non-linear method was used to determine the direction and position of a fixed source light. The experimental results showed that the inflection point of the new index was approximately equal to the mixing time. The new index has been extended and applied to a multiphase macro-mixing process by top blowing in a stirred tank. Moreover, a general quantifying model was introduced to demonstrate the relationship between the flow patterns of the bubble swarms and heat transfer. The results can be applied to investigate other mixing processes in which the target is very difficult to recognize.

  11. Quantifying the evolution of flow boiling bubbles by statistical testing and image analysis: toward a general model

    PubMed Central

    Xiao, Qingtai; Xu, Jianxin; Wang, Hua

    2016-01-01

    A new index, the estimate of the error variance, which can be used to quantify the evolution of flow patterns when multiphase components or tracers are difficult to distinguish, was proposed. The homogeneity degree of the luminance space distribution behind the viewing windows in the direct contact boiling heat transfer process was explored. Using image analysis and a linear statistical model, the F-test was used to test whether the light was uniform, and a non-linear method was used to determine the direction and position of a fixed source light. The experimental results showed that the inflection point of the new index was approximately equal to the mixing time. The new index has been extended and applied to a multiphase macro-mixing process by top blowing in a stirred tank. Moreover, a general quantifying model was introduced to demonstrate the relationship between the flow patterns of the bubble swarms and heat transfer. The results can be applied to investigate other mixing processes in which the target is very difficult to recognize. PMID:27527065

  12. Evaluation of Three Different Processing Techniques in the Fabrication of Complete Dentures

    PubMed Central

    Chintalacheruvu, Vamsi Krishna; Balraj, Rajasekaran Uttukuli; Putchala, Lavanya Sireesha; Pachalla, Sreelekha

    2017-01-01

    Aims and Objectives: The objective of the present study is to compare the effectiveness of three different processing techniques and to determine their accuracy through the number of occlusal interferences and the increase in vertical dimension after denture processing. Materials and Methods: A cross-sectional study was conducted on a sample of 18 patients indicated for complete denture fabrication, divided into three subgroups. Three processing techniques (compression molding, and injection molding using prepolymerized and unpolymerized resin) were used to fabricate dentures for each of the groups. After processing, laboratory-remounted dentures were evaluated for the number of occlusal interferences in centric and eccentric relations and for the change in vertical dimension through vertical pin rise in the articulator. Data were analyzed using ANOVA in SPSS software version 19.0 (IBM). Results: Data obtained from the three groups were subjected to a one-way ANOVA test; results with significant variations were then subjected to a post hoc test. The number of occlusal interferences with the compression molding technique was higher in both centric and eccentric positions than with the two injection molding techniques, with statistical significance in the centric, protrusive, right lateral nonworking, and left lateral working positions (P < 0.05). Mean vertical pin rise (0.52 mm) was greater with the compression molding technique than with the injection molding techniques, a statistically significant difference (P < 0.001). Conclusions: Within the limitations of this study, the injection molding techniques exhibited significantly fewer processing errors than the compression molding technique. There was no statistically significant difference in processing errors between the two injection molding systems. PMID:28713763
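
    A minimal sketch of the analysis pipeline described above (one-way ANOVA followed by pairwise post hoc comparisons), with hypothetical occlusal-interference counts; Bonferroni-adjusted pairwise t-tests stand in for whatever post hoc test the authors used.

    ```python
    # Sketch: omnibus one-way ANOVA across three techniques, then crude
    # pairwise post hoc comparisons if the omnibus test is significant.
    from itertools import combinations
    from scipy import stats

    groups = {  # hypothetical interference counts per technique
        "compression":        [6, 7, 5, 8, 6, 7],
        "injection_prepoly":  [3, 4, 2, 3, 4, 3],
        "injection_unpoly":   [3, 2, 4, 3, 2, 3],
    }

    f, p = stats.f_oneway(*groups.values())
    print(f"ANOVA: F={f:.2f}, p={p:.4f}")

    if p < 0.05:  # pairwise t-tests with Bonferroni correction
        pairs = list(combinations(groups, 2))
        for a, b in pairs:
            t, pp = stats.ttest_ind(groups[a], groups[b])
            print(a, "vs", b, "adjusted p =", min(pp * len(pairs), 1.0))
    ```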

  13. Rodent Biocompatibility Test Using the NASA Foodbar and Epoxy EP21LV

    NASA Technical Reports Server (NTRS)

    Tillman, J.; Steele, M.; Dumars, P.; Vasques, M.; Girten, B.; Sun, S. (Technical Monitor)

    2002-01-01

    Epoxy has been used successfully to affix NASA foodbars to the inner walls of the Animal Enclosure Module for past space flight experiments utilizing rodents. The epoxy used on past missions was discontinued, making it necessary to identify a new epoxy for use on the STS-108 and STS-107 missions. This experiment was designed to test the basic biocompatibility of epoxy EP21LV with male rats (Sprague Dawley) and mice (Swiss Webster) when applied to NASA foodbars. For each species, the test was conducted with a control group fed untreated foodbars and an experimental group fed foodbars applied with EP21LV. For each species, there were no group differences in animal health and no statistically significant differences in body weights (at P < 0.05) throughout the study. In mice, there was a 16% increase in heart weight in the epoxy group; this result was not found in rats. For both species, there were no statistical differences found in other organ weights measured. In rats, blood glucose levels were 15% higher and both total protein and globulin were 10% lower in the epoxy group. Statistical differences in these parameters were not found in mice. For both species, no statistical differences were found in other blood parameters tested. Food consumption was not different in rats, but water consumption was significantly decreased by 10 to 15% in the epoxy group. The difference in water consumption is likely due to an increased water content of the epoxy-treated foodbars. Finally, both species avoided consumption of the epoxy material. Based on the global analysis of the results, the few parameters found to be statistically different do not appear to be a physiologically relevant effect of the epoxy material. We conclude that the EP21LV epoxy is biocompatible with rodents.

  14. Statistical Inference for Quality-Adjusted Survival Time

    DTIC Science & Technology

    2003-08-01

    survival functions of QAL. If an influence function for a test statistic exists for the complete-data case, denoted as φi, then a test statistic for...the survival function for the censoring variable. Zhao and Tsiatis (2001) proposed a test statistic where φ is the influence function of the general...to 1 everywhere until a subject's death. We have considered other forms of test statistics. One option is to use an influence function φi that is

  15. The t-test: An Influential Inferential Tool in Chaplaincy and Other Healthcare Research.

    PubMed

    Jankowski, Katherine R B; Flannelly, Kevin J; Flannelly, Laura T

    2018-01-01

    The t-test developed by William S. Gosset (also known as Student's t-test and the two-sample t-test) is commonly used to compare one sample mean on a measure with another sample mean on the same measure. The outcome of the t-test is used to draw inferences about how different the samples are from each other. It is probably one of the most frequently relied upon statistics in inferential research. It is easy to use: a researcher can calculate the statistic with three simple tools: paper, pen, and a calculator. A computer program can quickly calculate the t-test for large samples. The ease of use can result in the misuse of the t-test. This article discusses the development of the original t-test, basic principles of the t-test, two additional types of t-tests (the one-sample t-test and the paired t-test), and recommendations about what to consider when using the t-test to draw inferences in research.
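
    A minimal sketch of the two-sample t-test discussed above, computed both with scipy and by hand to show how little the calculation requires; the measurements are hypothetical.

    ```python
    # Sketch: classic pooled-variance (Student's) two-sample t-test.
    import numpy as np
    from scipy import stats

    a = np.array([5.1, 4.9, 5.6, 5.2, 4.8])  # hypothetical sample 1
    b = np.array([4.4, 4.7, 4.1, 4.5, 4.6])  # hypothetical sample 2
    print(stats.ttest_ind(a, b, equal_var=True))

    # By hand: pooled variance, then t = mean difference / SE of the difference
    n1, n2 = len(a), len(b)
    sp2 = ((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1)) / (n1 + n2 - 2)
    t = (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / n1 + 1 / n2))
    print(t)  # matches the scipy statistic
    ```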

  16. Mapping cell populations in flow cytometry data for cross‐sample comparison using the Friedman–Rafsky test statistic as a distance measure

    PubMed Central

    Hsiao, Chiaowen; Liu, Mengya; Stanton, Rick; McGee, Monnie; Qian, Yu

    2015-01-01

    Flow cytometry (FCM) is a fluorescence-based single-cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap-FR, a novel method for cell population mapping across FCM samples. FlowMap-FR is based on the Friedman-Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap-FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap-FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap-FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap-FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap-FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. We compared the FR statistic with the symmetric version of the Kullback-Leibler divergence measure used in a previous population matching method with both simulated and real data. The FR statistic outperforms the symmetric version of KL-distance in distinguishing equivalent from nonequivalent cell populations. FlowMap-FR was also employed as a distance metric to match cell populations delineated by manual gating across 30 FCM samples from a benchmark FlowCAP data set. An F-measure of 0.88 was obtained, indicating high precision and recall of the FR-based population matching results. FlowMap-FR has been implemented as a standalone R/Bioconductor package so that it can be easily incorporated into current FCM data analytical workflows. © 2015 International Society for Advancement of Cytometry PMID:26274018

  17. Mapping cell populations in flow cytometry data for cross-sample comparison using the Friedman-Rafsky test statistic as a distance measure.

    PubMed

    Hsiao, Chiaowen; Liu, Mengya; Stanton, Rick; McGee, Monnie; Qian, Yu; Scheuermann, Richard H

    2016-01-01

    Flow cytometry (FCM) is a fluorescence-based single-cell experimental technology that is routinely applied in biomedical research for identifying cellular biomarkers of normal physiological responses and abnormal disease states. While many computational methods have been developed that focus on identifying cell populations in individual FCM samples, very few have addressed how the identified cell populations can be matched across samples for comparative analysis. This article presents FlowMap-FR, a novel method for cell population mapping across FCM samples. FlowMap-FR is based on the Friedman-Rafsky nonparametric test statistic (FR statistic), which quantifies the equivalence of multivariate distributions. As applied to FCM data by FlowMap-FR, the FR statistic objectively quantifies the similarity between cell populations based on the shapes, sizes, and positions of fluorescence data distributions in the multidimensional feature space. To test and evaluate the performance of FlowMap-FR, we simulated the kinds of biological and technical sample variations that are commonly observed in FCM data. The results show that FlowMap-FR is able to effectively identify equivalent cell populations between samples under scenarios of proportion differences and modest position shifts. As a statistical test, FlowMap-FR can be used to determine whether the expression of a cellular marker is statistically different between two cell populations, suggesting candidates for new cellular phenotypes by providing an objective statistical measure. In addition, FlowMap-FR can indicate situations in which inappropriate splitting or merging of cell populations has occurred during gating procedures. We compared the FR statistic with the symmetric version of Kullback-Leibler divergence measure used in a previous population matching method with both simulated and real data. The FR statistic outperforms the symmetric version of KL-distance in distinguishing equivalent from nonequivalent cell populations. FlowMap-FR was also employed as a distance metric to match cell populations delineated by manual gating across 30 FCM samples from a benchmark FlowCAP data set. An F-measure of 0.88 was obtained, indicating high precision and recall of the FR-based population matching results. FlowMap-FR has been implemented as a standalone R/Bioconductor package so that it can be easily incorporated into current FCM data analytical workflows. © The Authors. Published by Wiley Periodicals, Inc. on behalf of ISAC.

  18. Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sandstrom, Mary M.; Brown, Geoffrey W.; Preston, Daniel N.

    2015-10-30

    The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.

  19. Biomechanical In Vitro - Stability Testing on Human Specimens of a Locking Plate System Against Conventional Screw Fixation of a Proximal First Metatarsal Lateral Displacement Osteotomy

    PubMed Central

    Arnold, Heino; Stukenborg-Colsman, Christina; Hurschler, Christof; Seehaus, Frank; Bobrowitsch, Evgenij; Waizy, Hazibullah

    2012-01-01

    Introduction: The aim of this study was to examine resistance to angulation and displacement of the internal fixation of a proximal first metatarsal lateral displacement osteotomy, using a locking plate system compared with a conventional crossed screw fixation. Materials and Methodology: Seven anatomical human specimens were tested. Each specimen was tested with a locking screw plate as well as with a crossed cancellous screw fixation. The statistical analysis was performed with the Friedman test at a significance level of p = 0.05. Results: We found greater stability about all three axes of movement analyzed for the locking plate (PLATE) than for the crossed screw osteosynthesis (CSO). The Friedman test showed statistical significance at the p = 0.05 level for all groups and for both translational and rotational movements. Conclusion: The results of our study confirm that fixation of the lateral proximal first metatarsal displacement osteotomy with a locking plate is a technically simple procedure of superior stability. PMID:22675409

  20. Robust inference from multiple test statistics via permutations: a better alternative to the single test statistic approach for randomized trials.

    PubMed

    Ganju, Jitendra; Yu, Xinxin; Ma, Guoguang Julie

    2013-01-01

    Formal inference in randomized clinical trials is based on controlling the type I error rate associated with a single pre-specified statistic. The deficiency of using just one method of analysis is that it depends on assumptions that may not be met. For robust inference, we propose pre-specifying multiple test statistics and relying on the minimum p-value for testing the null hypothesis of no treatment effect. The null hypothesis associated with the various test statistics is that the treatment groups are indistinguishable. The critical value for hypothesis testing comes from permutation distributions. Rejecting the null hypothesis when the smallest p-value is less than the critical value controls the type I error rate at its designated value. Even if one of the candidate test statistics has low power, the adverse effect on the power of the minimum p-value statistic is modest. Its use is illustrated with examples. We conclude that it is better to rely on the minimum p-value than on a single statistic, particularly when that single statistic is the logrank test, because of the cost and complexity of many survival trials. Copyright © 2013 John Wiley & Sons, Ltd.
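
    A hedged sketch of the minimum-p-value permutation idea described above: compute several candidate statistics, take the smallest p-value as the test statistic, and calibrate it against its permutation distribution. The set of candidate tests here is illustrative, not the authors' set.

    ```python
    # Sketch: min-p over several two-sample tests, calibrated by permutation.
    import numpy as np
    from scipy import stats

    def min_p(x, y):
        """Smallest p-value across several candidate two-sample tests."""
        return min(stats.ttest_ind(x, y).pvalue,
                   stats.mannwhitneyu(x, y).pvalue,
                   stats.ks_2samp(x, y).pvalue)

    rng = np.random.default_rng(1)
    x = rng.normal(0.5, 1, 40)   # hypothetical treatment arm
    y = rng.normal(0.0, 1, 40)   # hypothetical control arm

    observed = min_p(x, y)
    pooled = np.concatenate([x, y])
    perm = []
    for _ in range(2000):        # permute group labels, recompute min-p
        rng.shuffle(pooled)
        perm.append(min_p(pooled[:40], pooled[40:]))

    p_overall = np.mean([m <= observed for m in perm])  # permutation p-value
    print(observed, p_overall)
    ```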

  1. Assembling a Computerized Adaptive Testing Item Pool as a Set of Linear Tests

    ERIC Educational Resources Information Center

    van der Linden, Wim J.; Ariel, Adelaide; Veldkamp, Bernard P.

    2006-01-01

    Test-item writing efforts typically result in item pools with an undesirable correlational structure between the content attributes of the items and their statistical information. If such pools are used in computerized adaptive testing (CAT), the algorithm may be forced to select items with less than optimal information that violate the content…

  2. An Examination of the Effects of Parental Involvement/Intervention on Student Development at the College/University Level

    ERIC Educational Resources Information Center

    Touchette, Timothy M.

    2013-01-01

    This doctoral thesis contributes to the literature on helicopter parents and their relation to student development theory. A secondary examination of approximately 1800 randomized results from the 2007 National Survey of Student Engagement (NSSE) was conducted using the following statistical tests: Mann-Whitney Test, Wilcoxon Signed Rank Test,…

  3. Improved Statistics for Genome-Wide Interaction Analysis

    PubMed Central

    Ueki, Masao; Cordell, Heather J.

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new “joint effects” statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al.'s originally-proposed statistics, on account of the inflated error rate that can result. PMID:22496670

  4. Wilcoxon's signed-rank statistic: what null hypothesis and why it matters.

    PubMed

    Li, Heng; Johnson, Terri

    2014-01-01

    In statistical literature, the term 'signed-rank test' (or 'Wilcoxon signed-rank test') has been used to refer to two distinct tests: a test for symmetry of distribution and a test for the median of a symmetric distribution, sharing a common test statistic. To avoid potential ambiguity, we propose to refer to those two tests by different names, as 'test for symmetry based on signed-rank statistic' and 'test for median based on signed-rank statistic', respectively. The utility of such terminological differentiation should become evident through our discussion of how those tests connect and contrast with the sign test and the one-sample t-test. Published 2014. This article is a U.S. Government work and is in the public domain in the USA.
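
    A minimal sketch of a signed-rank test on paired data, using scipy's implementation; the paired measurements are hypothetical.

    ```python
    # Sketch: the signed-rank statistic applied to paired differences.
    # Under the common usage, this tests symmetry of the differences about 0.
    from scipy.stats import wilcoxon

    before = [12.1, 11.4, 13.0, 12.8, 11.9, 12.5]  # hypothetical pre-scores
    after  = [11.2, 11.5, 12.1, 12.0, 11.1, 11.8]  # hypothetical post-scores

    stat, p = wilcoxon(before, after)
    print(stat, p)
    ```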

  5. Effectiveness of groundwater governance structures and institutions in Tanzania

    NASA Astrophysics Data System (ADS)

    Gudaga, J. L.; Kabote, S. J.; Tarimo, A. K. P. R.; Mosha, D. B.; Kashaigili, J. J.

    2018-05-01

    This paper examines the effectiveness of groundwater governance structures and institutions in Mbarali District, Mbeya Region. The paper adopts an exploratory sequential research design to collect quantitative and qualitative data. A random sample of 90 groundwater users, 50% of them women, was involved in the survey. Descriptive statistics, the Kruskal-Wallis H test and the Mann-Whitney U test were used to compare the differences in responses between groups, while qualitative data were subjected to content analysis. The results show that the Village Councils and Community Water Supply Organizations (COWSOs) were effective in governing groundwater. The results also show a statistically significant difference in the overall extent of effectiveness of the Village Councils in governing groundwater between villages (P = 0.0001), yet there was no significant difference (P > 0.05) between male and female responses on the effectiveness of Village Councils, village water committees and COWSOs. The Mann-Whitney U test showed a statistically significant difference between male and female responses on the effectiveness of formal and informal institutions (P = 0.0001), such that informal institutions were effective relative to formal institutions. The Kruskal-Wallis H test also showed a statistically significant difference (P ≤ 0.05) in the extent of effectiveness of formal institutions, norms and values between the low, medium and high categories. The paper concludes that COWSOs were more effective in governing groundwater than other governance structures, and that norms and values were more effective than formal institutions. The paper recommends sensitization and awareness creation on formal institutions so that they can influence water users' behaviour to govern groundwater.
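
    A minimal sketch of the nonparametric comparisons described above, with hypothetical Likert-style effectiveness ratings standing in for the survey responses:

    ```python
    # Sketch: Kruskal-Wallis H across three response categories and
    # Mann-Whitney U between two groups. All ratings are hypothetical.
    from scipy.stats import kruskal, mannwhitneyu

    low    = [2, 3, 2, 1, 3, 2]
    medium = [3, 3, 4, 2, 3, 4]
    high   = [4, 5, 4, 4, 5, 3]
    print(kruskal(low, medium, high))   # H test across the three categories

    male   = [3, 4, 2, 3, 3, 4]
    female = [3, 3, 3, 4, 2, 3]
    print(mannwhitneyu(male, female))   # U test between two groups
    ```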

  6. Testing the Predictive Power of Coulomb Stress on Aftershock Sequences

    NASA Astrophysics Data System (ADS)

    Woessner, J.; Lombardi, A.; Werner, M. J.; Marzocchi, W.

    2009-12-01

    Empirical and statistical models of clustered seismicity are usually strongly stochastic and perceived to be uninformative in their forecasts, since only marginal distributions are used, such as the Omori-Utsu and Gutenberg-Richter laws. In contrast, so-called physics-based aftershock models, based on seismic rate changes calculated from Coulomb stress changes and rate-and-state friction, make more specific predictions: anisotropic stress shadows and multiplicative rate changes. We test the predictive power of models based on Coulomb stress changes against statistical models, including the popular Short Term Earthquake Probabilities and Epidemic-Type Aftershock Sequences models: We score and compare retrospective forecasts on the aftershock sequences of the 1992 Landers, USA, the 1997 Colfiorito, Italy, and the 2008 Selfoss, Iceland, earthquakes. To quantify predictability, we use likelihood-based metrics that test the consistency of the forecasts with the data, including modified and existing tests used in prospective forecast experiments within the Collaboratory for the Study of Earthquake Predictability (CSEP). Our results indicate that a statistical model performs best. Moreover, two Coulomb model classes seem unable to compete: Models based on deterministic Coulomb stress changes calculated from a given fault-slip model, and those based on fixed receiver faults. One model of Coulomb stress changes does perform well and sometimes outperforms the statistical models, but its predictive information is diluted, because of uncertainties included in the fault-slip model. Our results suggest that models based on Coulomb stress changes need to incorporate stochastic features that represent model and data uncertainty.

  7. Preliminary criteria for the definition of allergic rhinitis: a systematic evaluation of clinical parameters in a disease cohort (I).

    PubMed

    Ng, M L; Warlow, R S; Chrishanthan, N; Ellis, C; Walls, R

    2000-09-01

    The aim of this study is to formulate criteria for the definition of allergic rhinitis. Other studies have sought to develop scoring systems to categorize the severity of allergic rhinitis symptoms, but these were never used to formulate diagnostic criteria; such scoring systems were chosen arbitrarily and were not derived by any statistical analysis. To date, a study of this kind has not been performed. The hypothesis of this study is that it is possible to formulate criteria for the definition of allergic rhinitis. This is the first study to systematically examine and evaluate the relative importance of symptoms, signs and investigative tests in allergic rhinitis. We sought to statistically rank, from the most to the least important, the multiplicity of symptoms, signs and test results. Forty-seven allergic rhinitis and 23 normal subjects were evaluated with a detailed questionnaire and history, physical examination, serum total immunoglobulin E, skin prick tests and serum enzyme allergosorbent tests (EAST). Statistical ranking of the variables indicated that rhinitis symptoms (nasal, ocular and oronasal) were the most commonly occurring, followed by a history of allergen provocation, then serum total IgE, positive skin prick tests and positive EASTs to house dust mite, perennial rye and bermuda/couch grass. Throat symptoms ranked even lower, while EASTs to cat epithelia, plantain and cockroach were the least important. Not all symptoms, signs and tests evaluated proved to be statistically significant when compared with the control group; this included symptoms and signs historically considered to be traditionally associated with allergic rhinitis, e.g., sore throat and bleeding nose. By performing statistical analyses, we were able to rank, from most to least important, the multiplicity of symptoms, signs and test results. The most important symptoms and signs were identified for the first time, even though some of these were not included in our original selection criteria for defining the disease cohort, i.e., sniffing, postnasal drip, oedematous nasal mucosa, impaired sense of smell, mouth breathing, itchy nose and many of the specific provocation factors.

  8. Learning a Living: First Results of the Adult Literacy and Life Skills Survey

    ERIC Educational Resources Information Center

    OECD Publishing (NJ1), 2005

    2005-01-01

    The Adult Literacy and Life Skills Survey (ALL) is a large-scale co-operative effort undertaken by governments, national statistics agencies, research institutions and multi-lateral agencies. The development and management of the study were co-ordinated by Statistics Canada and the Educational Testing Service (ETS) in collaboration with the…

  9. Comparing the Lifetimes of Two Brands of Batteries

    ERIC Educational Resources Information Center

    Dunn, Peter K.

    2013-01-01

    In this paper, we report a case study that illustrates the importance in interpreting the results from statistical tests, and shows the difference between practical importance and statistical significance. This case study presents three sets of data concerning the performance of two brands of batteries. The data are easy to describe and…

  10. The Development and Demonstration of Multiple Regression Models for Operant Conditioning Questions.

    ERIC Educational Resources Information Center

    Fanning, Fred; Newman, Isadore

    Based on the assumption that inferential statistics can make the operant conditioner more sensitive to possible significant relationships, regression models were developed to test the statistical significance of differences between the slopes and Y-intercepts of the experimental and control group subjects. These results were then compared to the traditional operant…

  11. Knowledge-Sharing Intention among Information Professionals in Nigeria: A Statistical Analysis

    ERIC Educational Resources Information Center

    Tella, Adeyinka

    2016-01-01

    In this study, the researcher administered a survey and developed and tested a statistical model to examine the factors that determine the intention of information professionals in Nigeria to share knowledge with their colleagues. The result revealed correlations between the overall score for intending to share knowledge and other…

  12. Multiple choice answers: what to do when you have too many questions.

    PubMed

    Jupiter, Daniel C

    2015-01-01

    Carrying out too many statistical tests in a single study throws results into doubt, for reasons statistical and ethical. I discuss why this is the case and briefly mention ways to handle the problem. Copyright © 2015 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
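
    One standard remedy for the problem described above is to adjust the whole family of p-values rather than interpret each at the nominal level. A minimal sketch with hypothetical p-values, assuming statsmodels is available (Holm for familywise error control, Benjamini-Hochberg for false discovery rate control):

    ```python
    # Sketch: adjust a family of p-values for multiple testing.
    from statsmodels.stats.multitest import multipletests

    pvals = [0.001, 0.008, 0.020, 0.041, 0.052, 0.130]  # hypothetical raw p-values

    for method in ("holm", "fdr_bh"):
        reject, adjusted, _, _ = multipletests(pvals, alpha=0.05, method=method)
        print(method, [f"{q:.3f}" for q in adjusted], reject)
    ```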

  13. An intelligent system based on fuzzy probabilities for medical diagnosis– a study in aphasia diagnosis*

    PubMed Central

    Moshtagh-Khorasani, Majid; Akbarzadeh-T, Mohammad-R; Jahangiri, Nader; Khoobdel, Mehdi

    2009-01-01

    BACKGROUND: Aphasia diagnosis is particularly challenging due to the linguistic uncertainty and vagueness, inconsistencies in the definition of aphasic syndromes, large number of measurements with imprecision, natural diversity and subjectivity in test objects as well as in opinions of experts who diagnose the disease. METHODS: Fuzzy probability is proposed here as the basic framework for handling the uncertainties in medical diagnosis and particularly aphasia diagnosis. To efficiently construct this fuzzy probabilistic mapping, statistical analysis is performed that constructs input membership functions as well as determines an effective set of input features. RESULTS: Considering the high sensitivity of performance measures to different distribution of testing/training sets, a statistical t-test of significance is applied to compare fuzzy approach results with NN results as well as author's earlier work using fuzzy logic. The proposed fuzzy probability estimator approach clearly provides better diagnosis for both classes of data sets. Specifically, for the first and second type of fuzzy probability classifiers, i.e. spontaneous speech and comprehensive model, P-values are 2.24E-08 and 0.0059, respectively, strongly rejecting the null hypothesis. CONCLUSIONS: The technique is applied and compared on both comprehensive and spontaneous speech test data for diagnosis of four Aphasia types: Anomic, Broca, Global and Wernicke. Statistical analysis confirms that the proposed approach can significantly improve accuracy using fewer Aphasia features. PMID:21772867

  14. Application of the Bootstrap Statistical Method in Deriving Vibroacoustic Specifications

    NASA Technical Reports Server (NTRS)

    Hughes, William O.; Paez, Thomas L.

    2006-01-01

    This paper discusses the Bootstrap Method for deriving vibroacoustic test specifications. Vibroacoustic test specifications are necessary to properly accept or qualify a spacecraft and its components for the acoustic, random vibration and shock environments expected on an expendable launch vehicle. Traditionally, NASA and the U.S. Air Force have employed methods of Normal Tolerance Limits to derive these test levels, based upon the amount of data available and the probability and confidence levels desired. The Normal Tolerance Limit method contains inherent assumptions about the distribution of the data. The Bootstrap is a distribution-free statistical subsampling method which uses the measured data themselves to establish estimates of statistical measures of random sources. This is achieved through the computation of large numbers of Bootstrap replicates of a data measure of interest and the use of these replicates to derive test levels consistent with the probability and confidence desired. The comparison of the results of these two methods is illustrated via an example utilizing actual spacecraft vibroacoustic data.
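
    A hedged sketch of the Bootstrap idea described above: resample the measured levels with replacement, recompute the statistic of interest for each replicate, and read a level off the replicate distribution. The data and the 95th-percentile choice are illustrative, not NASA's actual criterion.

    ```python
    # Sketch: percentile bootstrap for a data-derived test level.
    import numpy as np

    rng = np.random.default_rng(42)
    levels = np.array([128.5, 130.2, 127.9, 131.4, 129.8, 130.9, 128.1])  # hypothetical dB

    # Many bootstrap replicates of the mean level
    replicates = [rng.choice(levels, size=len(levels), replace=True).mean()
                  for _ in range(10_000)]

    test_level = np.percentile(replicates, 95)  # illustrative upper percentile
    print(f"{test_level:.1f} dB")
    ```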

  15. A benchmark for statistical microarray data analysis that preserves actual biological and technical variance.

    PubMed

    De Hertogh, Benoît; De Meulder, Bertrand; Berger, Fabrice; Pierre, Michael; Bareke, Eric; Gaigneaux, Anthoula; Depiereux, Eric

    2010-01-11

    Recent reanalysis of spike-in datasets underscored the need for new and more accurate benchmark datasets for statistical microarray analysis. We present here a fresh method using biologically relevant data to evaluate the performance of statistical methods. Our novel method ranks the probesets from a dataset composed of publicly available biological microarray data and extracts subset matrices with precise information/noise ratios. Our method can be used to determine the capability of different methods to better estimate variance for a given number of replicates. The mean-variance and mean-fold change relationships of the matrices revealed a closer approximation of biological reality. Performance analysis refined the results from benchmarks published previously. We show that the Shrinkage t test (close to Limma) was the best of the methods tested, except when two replicates were examined, where the Regularized t test and the Window t test performed slightly better. The R scripts used for the analysis are available at http://urbm-cluster.urbm.fundp.ac.be/~bdemeulder/.

  16. On use of the multistage dose-response model for assessing laboratory animal carcinogenicity

    PubMed Central

    Nitcheva, Daniella; Piegorsch, Walter W.; West, R. Webster

    2007-01-01

    We explore how well a statistical multistage model describes dose-response patterns in laboratory animal carcinogenicity experiments from a large database of quantal response data. The data are collected from the U.S. EPA’s publicly available IRIS data warehouse and examined statistically to determine how often higher-order values in the multistage predictor yield significant improvements in explanatory power over lower-order values. Our results suggest that the addition of a second-order parameter to the model only improves the fit about 20% of the time, while adding even higher-order terms apparently does not contribute to the fit at all, at least with the study designs we captured in the IRIS database. Also included is an examination of statistical tests for assessing significance of higher-order terms in a multistage dose-response model. It is noted that bootstrap testing methodology appears to offer greater stability for performing the hypothesis tests than a more-common, but possibly unstable, “Wald” test. PMID:17490794

  17. Effectiveness of Quantitative Real Time PCR in Long-Term Follow-up of Chronic Myeloid Leukemia Patients.

    PubMed

    Savasoglu, Kaan; Payzin, Kadriye Bahriye; Ozdemirkiran, Fusun; Berber, Belgin

    2015-08-01

    To determine the utility of the Quantitative Real Time PCR (RQ-PCR) assay in the follow-up of Chronic Myeloid Leukemia (CML) patients. Cross-sectional observational study. Izmir Ataturk Education and Research Hospital, Izmir, Turkey, from 2009 to 2013. Cytogenetic, FISH and RQ-PCR test results from the materials of 177 CML patients, collected between 2009 and 2013, were set up for comparison analysis. Statistical analysis was performed to compare the FISH, karyotype and RQ-PCR results of the patients. Karyotyping and FISH specificity and sensitivity rates were determined by ROC analysis against the RQ-PCR results. The chi-square test was used to compare test failure rates. Sensitivity and specificity values were 17.6% and 98% for karyotyping (p = 0.118, p > 0.05) and 22.5% and 96% for FISH (p = 0.064, p > 0.05), respectively. FISH sensitivity was slightly higher than that of karyotyping, and there was a strong correlation between the two methods (p < 0.001). The RQ-PCR test failure rate did not correlate with those of the other two tests (p > 0.05); however, the difference between the karyotyping and FISH test failure rates was statistically significant (p < 0.001). Apart from situations in which karyotype analysis is needed, the RQ-PCR assay can be used alone in the follow-up of CML disease.

  18. Vibration and stretching effects on flexibility and explosive strength in young gymnasts.

    PubMed

    Kinser, Ann M; Ramsey, Michael W; O'Bryant, Harold S; Ayres, Christopher A; Sands, William A; Stone, Michael H

    2008-01-01

    Effects of simultaneous vibration-stretching on flexibility and explosive strength in competitive female gymnasts were examined. Twenty-two female athletes (age = 11.3 +/- 2.6 yr; body mass = 35.3 +/- 11.6 kg; competitive levels = 3-9) composed the simultaneous vibration-stretching (VS) group, which performed both tests. Flexibility-testing control groups were stretching-only (SF) (N = 7) and vibration-only (VF) (N = 8). Explosive-strength control groups were stretching-only (SES) (N = 8) and vibration-only (VES) (N = 7). Vibration (30 Hz, 2-mm displacement) was applied to four sites, four times for 10 s, with 5 s of rest in between. Right and left forward-split (RFS and LFS) flexibility was measured by the distance between the ground and the anterior suprailiac spine. A force plate (sampling rate, 1000 Hz) recorded countermovement and static jump characteristics. Explosive strength variables included flight time, jump height, peak force, instantaneous forces, and rates of force development. Data were analyzed using Bonferroni-adjusted paired t-tests. VS produced statistically significant increases in flexibility, with large effect sizes (d), in both the RFS (P = 1.28 x 10(-7), d = 0.67) and LFS (P = 2.35 x 10(-7), d = 0.72), and statistically significant results for both the favored (FL) (P = 4.67 x 10(-8), d = 0.78) and nonfavored (NFL) (P = 7.97 x 10(-10), d = 0.65) legs. VF resulted in a statistically significant increase in flexibility with a medium d for the RFS (P = 6.98 x 10(-3), d = 0.25) and a statistically significant increase in NFL flexibility (P = 0.002, d = 0.31). SF showed no statistical difference between measures and small d. For explosive strength, there were no statistical differences in any variable in the VS, SES, and VES groups for the pre- versus posttreatment tests. Simultaneous vibration and stretching may greatly increase flexibility while not altering explosive strength.

  19. The (mis)reporting of statistical results in psychology journals.

    PubMed

    Bakker, Marjan; Wicherts, Jelte M

    2011-09-01

    In order to study the prevalence, nature (direction), and causes of reporting errors in psychology, we checked the consistency of reported test statistics, degrees of freedom, and p values in a random sample of high- and low-impact psychology journals. In a second study, we established the generality of reporting errors in a random sample of recent psychological articles. Our results, on the basis of 281 articles, indicate that around 18% of statistical results in the psychological literature are incorrectly reported. Inconsistencies were more common in low-impact journals than in high-impact journals. Moreover, around 15% of the articles contained at least one statistical conclusion that proved, upon recalculation, to be incorrect; that is, recalculation rendered the previously significant result insignificant, or vice versa. These errors were often in line with researchers' expectations. We classified the most common errors and contacted authors to shed light on the origins of the errors.
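
    The kind of consistency check behind such error surveys can be sketched as follows: recompute the p-value from the reported statistic and degrees of freedom, and compare it with the reported value. The reported triplet here is hypothetical.

    ```python
    # Sketch: recompute a two-tailed p-value from a reported t and df.
    from scipy import stats

    reported = {"t": 2.31, "df": 28, "p": 0.03}  # as it might appear in an article

    recomputed = 2 * stats.t.sf(abs(reported["t"]), reported["df"])  # two-tailed
    print(f"recomputed p = {recomputed:.4f}")  # ~0.028, consistent with p = .03
    ```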

  20. Integrated Analysis of Pharmacologic, Clinical, and SNP Microarray Data using Projection onto the Most Interesting Statistical Evidence with Adaptive Permutation Testing

    PubMed Central

    Pounds, Stan; Cao, Xueyuan; Cheng, Cheng; Yang, Jun; Campana, Dario; Evans, William E.; Pui, Ching-Hon; Relling, Mary V.

    2010-01-01

    Powerful methods for integrated analysis of multiple biological data sets are needed to maximize interpretation capacity and acquire meaningful knowledge. We recently developed Projection Onto the Most Interesting Statistical Evidence (PROMISE). PROMISE is a statistical procedure that incorporates prior knowledge about the biological relationships among endpoint variables into an integrated analysis of microarray gene expression data with multiple biological and clinical endpoints. Here, PROMISE is adapted to the integrated analysis of pharmacologic, clinical, and genome-wide genotype data, incorporating knowledge about the biological relationships among pharmacologic and clinical response data. An efficient permutation-testing algorithm is introduced so that statistical calculations are computationally feasible in this higher-dimension setting. The new method is applied to a pediatric leukemia data set. The results clearly indicate that PROMISE is a powerful statistical tool for identifying genomic features that exhibit a biologically meaningful pattern of association with multiple endpoint variables. PMID:21516175

  1. Fully Bayesian tests of neutrality using genealogical summary statistics.

    PubMed

    Drummond, Alexei J; Suchard, Marc A

    2008-10-31

    Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies, and we demonstrate its utility on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously ignored for the limited availability of theory and methods.

  2. Does Assessing Eye Alignment along with Refractive Error or Visual Acuity Increase Sensitivity for Detection of Strabismus in Preschool Vision Screening?

    PubMed Central

    2007-01-01

    Purpose Preschool vision screenings often include refractive error or visual acuity (VA) testing to detect amblyopia, as well as alignment testing to detect strabismus. The purpose of this study was to determine the effect of combining screening for eye alignment with screening for refractive error or reduced VA on sensitivity for detection of strabismus, with specificity set at 90% and 94%. Methods Over 3 years, 4040 preschool children were screened in the Vision in Preschoolers (VIP) Study, with different screening tests administered each year. Examinations were performed to identify children with strabismus. The best screening tests for detecting children with any targeted condition were noncycloplegic retinoscopy (NCR), Retinomax autorefractor (Right Manufacturing, Virginia Beach, VA), SureSight Vision Screener (Welch-Allyn, Inc., Skaneateles, NY), and Lea Symbols (Precision Vision, LaSalle, IL and Good-Lite Co., Elgin, IL) and HOTV optotypes VA tests. Analyses were conducted with these tests of refractive error or VA paired with the best tests for detecting strabismus (unilateral cover testing, Random Dot “E” [RDE] and Stereo Smile Test II [Stereo Optical, Inc., Chicago, IL]; and MTI PhotoScreener [PhotoScreener, Inc., Palm Beach, FL]). The change in sensitivity that resulted from combining a test of eye alignment with a test of refractive error or VA was determined with specificity set at 90% and 94%. Results Among the 4040 children, 157 were identified as having strabismus. For screening tests conducted by eye care professionals, the addition of a unilateral cover test to a test of refraction generally resulted in a statistically significant increase (range, 15%–25%) in detection of strabismus. For screening tests administered by trained lay screeners, the addition of Stereo Smile II to SureSight resulted in a statistically significant increase (21%) in sensitivity for detection of strabismus. Conclusions The most efficient and low-cost ways to achieve a statistically significant increase in sensitivity for detection of strabismus were by combining the unilateral cover test with the autorefractor (Retinomax) administered by eye care professionals and by combining Stereo Smile II with SureSight administered by trained lay screeners. The decision of whether to include a test of alignment should be based on the screening program’s goals (e.g., targeted visual conditions) and resources. PMID:17591881

  3. Comment on Hall et al. (2017), "How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial".

    PubMed

    Sabour, Siamak

    2018-03-08

    The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodology and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding scientific assessment of the validity and reliability of a clinical test to discuss the statistical test and the methodological approach in assessing validity and reliability in clinical research. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. For qualitative variables, sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as the odds ratio (i.e., the ratio of true to false results), are the most appropriate estimates for evaluating the validity of a test against a gold standard. In the case of quantitative variables, depending on the distribution of the variable, Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues.

  4. Statistical Tests of System Linearity Based on the Method of Surrogate Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hunter, N.; Paez, T.; Red-Horse, J.

    When dealing with measured data from dynamic systems we often make the tacit assumption that the data are generated by linear dynamics. While some systematic tests for linearity and determinism are available - for example the coherence function, the probability density function, and the bispectrum - further tests that quantify the existence and the degree of nonlinearity are clearly needed. In this paper we demonstrate a statistical test for the nonlinearity exhibited by a dynamic system excited by Gaussian random noise. We perform the usual division of the input and response time series data into blocks as required by the Welch method of spectrum estimation and search for significant relationships between a given input frequency and response at harmonics of the selected input frequency. We argue that systematic tests based on the recently developed statistical method of surrogate data readily detect significant nonlinear relationships. The paper elucidates the method of surrogate data. Typical results are illustrated for a linear single degree-of-freedom system and for a system with polynomial stiffness nonlinearity.
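
    A minimal sketch of the surrogate-data method in Python (the general technique, not the authors' code): phase-randomized surrogates preserve the amplitude spectrum, hence all linear correlation structure, so ranking a nonlinear discriminating statistic of the original series against its surrogate distribution yields a significance test. The series and the choice of statistic below are toy assumptions.

      import numpy as np

      def phase_randomized_surrogate(x, rng):
          # Keep the amplitude spectrum, randomize the phases: linear
          # (spectral) properties survive, nonlinear structure is destroyed.
          spec = np.fft.rfft(x)
          phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
          phases[0] = 0.0  # keep the mean unchanged
          return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=len(x))

      def discriminating_stat(x):
          # Skewness of first differences: zero in expectation for
          # time-reversible linear Gaussian processes.
          d = np.diff(x)
          return np.mean((d - d.mean()) ** 3) / d.std() ** 3

      rng = np.random.default_rng(1)
      x = np.cumsum(rng.normal(size=2048))  # toy linear series; should not reject
      obs = discriminating_stat(x)
      null = [discriminating_stat(phase_randomized_surrogate(x, rng))
              for _ in range(999)]
      p = (1 + sum(abs(s) >= abs(obs) for s in null)) / (1 + len(null))
      print(obs, p)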

  5. Selecting the most appropriate inferential statistical test for your quantitative research study.

    PubMed

    Bettany-Saltikov, Josette; Whittaker, Victoria Jane

    2014-06-01

    To discuss the issues and processes relating to the selection of the most appropriate statistical test. A review of the basic research concepts together with a number of clinical scenarios is used to illustrate this. Quantitative nursing research generally features the use of empirical data, which necessitates the selection of both descriptive statistics and inferential statistical tests. Different types of research questions can be answered by different types of research designs, which in turn need to be matched to a specific statistical test(s). Discursive paper. This paper discusses the issues relating to the selection of the most appropriate statistical test and makes some recommendations as to how these might be dealt with. When conducting empirical quantitative studies, a number of key issues need to be considered. Considerations for selecting the most appropriate statistical tests are discussed and flow charts provided to facilitate this process. When nursing clinicians and researchers conduct quantitative research studies, it is crucial that the most appropriate statistical test is selected to enable valid conclusions to be made. © 2013 John Wiley & Sons Ltd.

  6. The influence of autocorrelation in signature extraction: An example from a geobotanical investigation of Cotter Basin, Montana

    NASA Technical Reports Server (NTRS)

    Labovitz, M. L.; Masuoka, E. J. (Principal Investigator)

    1981-01-01

    The presence of positive serial correlation (autocorrelation) in remotely sensed data results in an underestimate of the variance-covariance matrix when calculated using contiguous pixels. This underestimate produces an inflation in F statistics. For a set of Thematic Mapper Simulator data (TMS), used to test the ability to discriminate a known geobotanical anomaly from its background, the inflation in F statistics related to serial correlation is between 7 and 70 times. This means that significance tests of means of the spectral bands initially appear to suggest that the anomalous site is very different in spectral reflectance and emittance from its background sites. However, this difference often disappears and is always dramatically reduced when compared to frequency distributions of test statistics produced by the comparison of simulated training sets possessing equal means, but which are composed of autocorrelated observations.
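
    The inflation mechanism is easy to reproduce by simulation. The sketch below (a toy illustration assuming AR(1) serial correlation along a scan line, not the study's TMS analysis) compares the 95th percentile of F statistics from independent versus autocorrelated samples with equal true means against the nominal critical value.

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(2)

      def ar1_series(n, rho):
          # Positive serial correlation between contiguous "pixels".
          x = np.empty(n)
          x[0] = rng.normal()
          for i in range(1, n):
              x[i] = rho * x[i - 1] + np.sqrt(1.0 - rho**2) * rng.normal()
          return x

      def f_stats(rho, reps=2000, n=50):
          # F statistics for two groups whose true means are equal.
          return np.array([stats.f_oneway(ar1_series(n, rho),
                                          ar1_series(n, rho))[0]
                           for _ in range(reps)])

      # With rho = 0 the simulated 95th percentile sits near the nominal
      # F(1, 98) critical value; with rho = 0.8 it is inflated far beyond it.
      for rho in (0.0, 0.8):
          print(rho, np.percentile(f_stats(rho), 95), stats.f.ppf(0.95, 1, 98))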

  7. Global Active Stretching (SGA®) Practice for Judo Practitioners’ Physical Performance Enhancement

    PubMed Central

    ALMEIDA, HELENO; DE SOUZA, RAPHAEL F.; AIDAR, FELIPE J.; DA SILVA, ALISSON G.; REGI, RICARDO P.; BASTOS, AFRÂNIO A.

    2018-01-01

    In order to analyze the effect of Global Active Stretching (SGA®) practice on physical performance enhancement in judo-practitioner competitors, 12 male athletes from the Judo Federation of Sergipe (Federação Sergipana de Judô) were divided into two groups: Experimental Group (EG) and Control Group (CG). For 10 weeks, the EG practiced SGA® self-postures and the CG practiced assorted calisthenic exercises. All of them were submitted to a variety of tests (before and after): handgrip strength, flexibility, upper limbs’ muscle power, isometric pull-up force, lower limbs’ muscle power (squat jump – SJ and countermovement jump – CMJ) and the Tokui Waza test. Due to the small sample size, the data were treated as non-parametric and the Wilcoxon test was applied using the software R version 3.3.2 (R Development Core Team, Austria). Effect sizes were calculated, and values of p ≤ 0.05 were considered statistically significant. The main results pointed out statistical differences before and after in the EG in flexibility, upper limbs’ muscle power and lower limbs’ muscle power (CMJ), with gains of 3.00 ± 1.09 cm, 0.42 ± 0.51 m and 2.49 ± 0.63 cm, respectively. On the other hand, the CG presented a statistical difference only in the lower limbs’ CMJ test, with a gain of 0.55 ± 2.28 cm. The regular 10-week practice of SGA® self-postures increased judo practitioners’ posterior chain flexibility and vertical jumping (CMJ) performance. PMID:29795746
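
    The paired, non-parametric comparison reported here corresponds to SciPy's signed-rank test; a sketch with invented pre/post flexibility scores (the study's raw data are not reproduced) follows.

      import numpy as np
      from scipy.stats import wilcoxon

      # hypothetical pre/post flexibility scores (cm) for six athletes
      pre = np.array([24.0, 27.5, 22.0, 30.0, 26.5, 25.0])
      post = np.array([27.0, 30.0, 25.5, 32.5, 29.0, 28.5])

      stat, p = wilcoxon(pre, post)  # paired Wilcoxon signed-rank test
      print(stat, p)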

  8. Assessing Discriminative Performance at External Validation of Clinical Prediction Models

    PubMed Central

    Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

    2016-01-01

    Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore, we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients. PMID:26881753

  9. Statistics on continuous IBD data: Exact distribution evaluation for a pair of full(half)-sibs and a pair of a (great-) grandchild with a (great-) grandparent

    PubMed Central

    Stefanov, Valeri T

    2002-01-01

    Background Pairs of related individuals are widely used in linkage analysis. Most of the tests for linkage analysis are based on statistics associated with identity by descent (IBD) data. Current biotechnology provides data on very densely packed loci, and therefore, it may provide almost continuous IBD data for pairs of closely related individuals. Therefore, the distribution theory for statistics on continuous IBD data is of interest. In particular, distributional results which allow the evaluation of p-values for relevant tests are of importance. Results A technology is provided for numerical evaluation, with any given accuracy, of the cumulative probabilities of some statistics on continuous genome data for pairs of closely related individuals. In the case of a pair of full-sibs, the following statistics are considered: (i) the proportion of genome with 2 (at least 1) haplotypes shared identical-by-descent (IBD) on a chromosomal segment, (ii) the number of distinct pieces (subsegments) of a chromosomal segment, on each of which exactly 2 (at least 1) haplotypes are shared IBD. The natural counterparts of these statistics for the other relationships are also considered. Relevant Maple codes are provided for a rapid evaluation of the cumulative probabilities of such statistics. The genomic continuum model, with Haldane's model for the crossover process, is assumed. Conclusions A technology, together with relevant software codes for its automated implementation, is provided for exact evaluation of the distributions of relevant statistics associated with continuous genome data on closely related individuals. PMID:11996673

  10. Validation of a novel saliva-based ELISA test for diagnosing tapeworm burden in horses.

    PubMed

    Lightbody, Kirsty L; Davis, Paul J; Austin, Corrine J

    2016-06-01

    Tapeworm infections pose a significant threat to equine health as they are associated with clinical cases of colic. Diagnosis of tapeworm burden using fecal egg counts (FECs) is unreliable, and, although a commercial serologic ELISA for anti-tapeworm antibodies is available, it requires a veterinarian to collect the blood sample. A reliable diagnostic test using an owner-accessible sample such as saliva could provide a cost-effective alternative for tapeworm testing in horses, and allow targeted deworming strategies. The purpose of the study was to statistically validate a saliva tapeworm ELISA test and compare it to a tapeworm-specific IgG(T) serologic ELISA. Serum samples (139) and matched saliva samples (104) were collected from horses at a UK abattoir. The ileocecal junction and cecum were visually examined for tapeworms and any present were counted. Samples were analyzed using a serologic ELISA and the saliva tapeworm test. The test results were compared to tapeworm numbers and the various data sets were statistically analyzed. Saliva scores had strong positive correlations with both infection intensity and serologic results (Spearman's rank coefficients 0.74 and 0.86, respectively). The saliva tapeworm test was capable of identifying the presence of one or more tapeworms with 83% sensitivity and 85% specificity. Importantly, no high-burden (more than 20 tapeworms) horses were misdiagnosed. The saliva tapeworm test has statistical accuracy for detecting tapeworm burdens in horses with 83% sensitivity and 85% specificity, similar to those of the serologic ELISA (85% and 78%, respectively). © 2016 American Society for Veterinary Clinical Pathology.
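
    The reported accuracy figures are simple functions of the 2x2 classification counts, and the correlations are rank-based; a sketch with invented counts (chosen only to land near the reported 83%/85%) and toy score data follows.

      import numpy as np
      from scipy.stats import spearmanr

      # hypothetical 2x2 counts: test result vs. tapeworms found at necropsy
      tp, fp, fn, tn = 50, 7, 10, 37
      sensitivity = tp / (tp + fn)  # fraction of infected horses detected
      specificity = tn / (tn + fp)  # fraction of clean horses cleared
      print(round(sensitivity, 2), round(specificity, 2))

      # rank correlation between a graded saliva score and worm counts
      saliva = np.array([0.10, 0.40, 0.20, 1.30, 2.00, 0.05, 0.90])
      worms = np.array([0, 3, 1, 12, 25, 0, 6])
      rho, p = spearmanr(saliva, worms)
      print(round(rho, 2), p)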

  11. gsSKAT: Rapid gene set analysis and multiple testing correction for rare-variant association studies using weighted linear kernels.

    PubMed

    Larson, Nicholas B; McDonnell, Shannon; Cannon Albright, Lisa; Teerlink, Craig; Stanford, Janet; Ostrander, Elaine A; Isaacs, William B; Xu, Jianfeng; Cooney, Kathleen A; Lange, Ethan; Schleutker, Johanna; Carpten, John D; Powell, Isaac; Bailey-Wilson, Joan E; Cussenot, Olivier; Cancel-Tassin, Geraldine; Giles, Graham G; MacInnis, Robert J; Maier, Christiane; Whittemore, Alice S; Hsieh, Chih-Lin; Wiklund, Fredrik; Catalona, William J; Foulkes, William; Mandal, Diptasri; Eeles, Rosalind; Kote-Jarai, Zsofia; Ackerman, Michael J; Olson, Timothy M; Klein, Christopher J; Thibodeau, Stephen N; Schaid, Daniel J

    2017-05-01

    Next-generation sequencing technologies have afforded unprecedented characterization of low-frequency and rare genetic variation. Due to low power for single-variant testing, aggregative methods are commonly used to combine observed rare variation within a single gene. Causal variation may also aggregate across multiple genes within relevant biomolecular pathways. Kernel-machine regression and adaptive testing methods for aggregative rare-variant association testing have been demonstrated to be powerful approaches for pathway-level analysis, although these methods tend to be computationally intensive at high-variant dimensionality and require access to complete data. An additional analytical issue in scans of large pathway definition sets is multiple testing correction. Gene set definitions may exhibit substantial genic overlap, and the impact of the resultant correlation in test statistics on Type I error rate control for large agnostic gene set scans has not been fully explored. Herein, we first outline a statistical strategy for aggregative rare-variant analysis using component gene-level linear kernel score test summary statistics as well as derive simple estimators of the effective number of tests for family-wise error rate control. We then conduct extensive simulation studies to characterize the behavior of our approach relative to direct application of kernel and adaptive methods under a variety of conditions. We also apply our method to two case-control studies, respectively, evaluating rare variation in hereditary prostate cancer and schizophrenia. Finally, we provide open-source R code for public use to facilitate easy application of our methods to existing rare-variant analysis results. © 2017 WILEY PERIODICALS, INC.

  12. Multiple comparison analysis testing in ANOVA.

    PubMed

    McHugh, Mary L

    2011-01-01

    The Analysis of Variance (ANOVA) test has long been an important tool for researchers conducting studies on multiple experimental groups and one or more control groups. However, ANOVA cannot provide detailed information on differences among the various study groups, or on complex combinations of study groups. To fully understand group differences in an ANOVA, researchers must conduct tests of the differences between particular pairs of experimental and control groups. Tests conducted on subsets of data tested previously in another analysis are called post hoc tests. The class of post hoc tests that provides this type of detailed information for ANOVA results is called "multiple comparison analysis" tests. The most commonly used multiple comparison analysis statistics include the following tests: Tukey, Newman-Keuls, Scheffé, Bonferroni and Dunnett. These statistical tools each have specific uses, advantages and disadvantages. Some are best used for testing theory while others are useful in generating new theory. Selection of the appropriate post hoc test will provide researchers with the most detailed information while limiting Type 1 errors due to alpha inflation.
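
    In code, the omnibus ANOVA and a Tukey HSD post hoc test take only a few lines; the sketch below uses simulated groups (SciPy for the omnibus test and statsmodels for Tukey; the group names and effect sizes are invented).

      import numpy as np
      from scipy.stats import f_oneway
      from statsmodels.stats.multicomp import pairwise_tukeyhsd

      rng = np.random.default_rng(3)
      ctrl = rng.normal(10.0, 2.0, 30)   # control group
      trt_a = rng.normal(11.5, 2.0, 30)  # treatment A
      trt_b = rng.normal(13.0, 2.0, 30)  # treatment B

      # omnibus ANOVA: says whether any means differ, not which pairs
      print(f_oneway(ctrl, trt_a, trt_b))

      # Tukey HSD: pairwise differences with familywise error control
      scores = np.concatenate([ctrl, trt_a, trt_b])
      groups = np.repeat(["ctrl", "A", "B"], 30)
      print(pairwise_tukeyhsd(scores, groups, alpha=0.05))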

  13. A survey of statistics in three UK general practice journals

    PubMed Central

    Rigby, Alan S; Armstrong, Gillian K; Campbell, Michael J; Summerton, Nick

    2004-01-01

    Background Many medical specialities have reviewed the statistical content of their journals. To our knowledge this has not been done in general practice. Given the main role of a general practitioner as a diagnostician we thought it would be of interest to see whether the statistical methods reported reflect the diagnostic process. Methods Hand search of three UK journals of general practice namely the British Medical Journal (general practice section), British Journal of General Practice and Family Practice over a one-year period (1 January to 31 December 2000). Results A wide variety of statistical techniques were used. The most common methods included t-tests and Chi-squared tests. There were few articles reporting likelihood ratios and other useful diagnostic methods. There was evidence that the journals with the more thorough statistical review process reported a more complex and wider variety of statistical techniques. Conclusions The BMJ had a wider range and greater diversity of statistical methods than the other two journals. However, in all three journals there was a dearth of papers reflecting the diagnostic process. Across all three journals there were relatively few papers describing randomised controlled trials thus recognising the difficulty of implementing this design in general practice. PMID:15596014

  14. Development of polytoxicomania in function of defence from psychoticism.

    PubMed

    Nenadović, Milutin M; Sapić, Rosa

    2011-01-01

    The proportion of polytoxicomania in youth subpopulations has been growing steadily in recent decades, and this trend is pan-continental. Psychoticism is a psychological construct that assumes special basic dimensions of personality disintegration and cognitive functions. Psychoticism may, in general, be the basis of pathological functioning of youth and influence the patterns of thought, feelings and actions that cause dysfunction. The aim of this study was to determine the distribution of basic dimensions of psychoticism behind the commitment of youth to abuse of psychoactive substances (PAS) in order to reduce disturbing intrapsychic experiences or manifestation of psychotic symptoms. For the purpose of this study, two groups of respondents were formed, balanced by age, gender and family structure of origin (at least one parent alive). The study applied the DELTA-9 instrument for assessment of cognitive disintegration in order to establish and operationalize psychoticism. The obtained results were statistically analyzed. From the parameters of descriptive statistics, the arithmetic mean was calculated with measures of dispersion. A cross-tabular analysis of the tested variables was performed, along with tests of statistical significance using Pearson's χ²-test and analysis of variance. Age structure and gender were approximately equally represented in the polytoxicomaniac group and the control group; testing confirmed no statistically significant difference (p > 0.5). Statistical analysis established that polytoxicomaniacs differed significantly from the control group on most variables of psychoticism. Testing confirmed a high statistical significance of differences in the variables of psychoticism in the group of respondents, with p < 0.001 to p < 0.01. A statistically significant representation of the dimension of psychoticism in the polytoxicomaniac group was established. The presence of factors concerning common executive dysfunction was emphasized.

  15. The effects of academic grouping on student performance in science

    NASA Astrophysics Data System (ADS)

    Scoggins, Sally Smykla

    The current action research study explored how student placement in heterogeneous or homogeneous classes in seventh-grade science affected students' eighth-grade Science State of Texas Assessment of Academic Readiness (STAAR) scores, and how ability grouping affected students' scores based on race and socioeconomic status. The population included all eighth-grade students in the target district who took the regular eighth-grade science STAAR over four academic school years. The researcher ran three statistical tests: a t-test for independent samples, a one-way between-subjects analysis of variance (ANOVA) and a two-way between-subjects ANOVA. The results showed no statistically significant difference between eighth-grade Pre-AP students from seventh-grade Pre-AP classes and eighth-grade Pre-AP students from heterogeneous seventh-grade classes and no statistically significant difference between Pre-AP students' scores based on socioeconomic status. There was no statistically significant interaction between socioeconomic status and the seventh-grade science classes. The scores of regular eighth-grade students who were in heterogeneous seventh-grade classes were statistically significantly higher than the scores of regular eighth-grade students who were in regular seventh-grade classes. The results also revealed that the scores of students who were White were statistically significantly higher than the scores of students who were Black and Hispanic. Black and Hispanic scores did not differ significantly. Further results indicated that the STAAR Level II and Level III scores were statistically significantly higher for the Pre-AP eighth-grade students who were in heterogeneous seventh-grade classes than the STAAR Level II and Level III scores of Pre-AP eighth-grade students who were in Pre-AP seventh-grade classes.

  16. Development of computer-assisted instruction application for statistical data analysis android platform as learning resource

    NASA Astrophysics Data System (ADS)

    Hendikawati, P.; Arifudin, R.; Zahid, M. Z.

    2018-03-01

    This study aims to design an Android statistics data analysis application that can be accessed through mobile devices, making it easier for users to access. The application covers various topics in basic statistics along with parametric statistical data analysis. The output of this application system is parametric statistical data analysis that can be used by students, lecturers, and users who need the results of statistical calculations quickly and in an easily understood form. The Android application is developed using the Java programming language. The server programming language is PHP with the CodeIgniter framework, and the database used is MySQL. The system development methodology used is the Waterfall methodology with the stages of analysis, design, coding, testing, and implementation and system maintenance. This statistical data analysis application is expected to support statistics lecturing activities and make it easier for students to understand statistical analysis on mobile devices.

  17. The MAX Statistic is Less Powerful for Genome Wide Association Studies Under Most Alternative Hypotheses.

    PubMed

    Shifflett, Benjamin; Huang, Rong; Edland, Steven D

    2017-01-01

    Genotypic association studies are prone to inflated type I error rates if multiple hypothesis testing is performed, e.g., sequentially testing for recessive, multiplicative, and dominant risk. Alternatives to multiple hypothesis testing include the model-independent genotypic χ² test, the efficiency-robust MAX statistic, which corrects for multiple comparisons but with some loss of power, or a single Armitage test for multiplicative trend, which has optimal power when the multiplicative model holds but with some loss of power when dominant or recessive models underlie the genetic association. We used Monte Carlo simulations to describe the relative performance of these three approaches under a range of scenarios. All three approaches maintained their nominal type I error rates. The genotypic χ² and MAX statistics were more powerful when testing a strictly recessive genetic effect or when testing a dominant effect when the allele frequency was high. The Armitage test for multiplicative trend was most powerful for the broad range of scenarios where heterozygote risk is intermediate between recessive and dominant risk. Moreover, all tests had limited power to detect recessive genetic risk unless the sample size was large, and conversely all tests were relatively well powered to detect dominant risk. Taken together, these results suggest the general utility of the multiplicative trend test when the underlying genetic model is unknown.
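
    Each of the model-specific alternatives is a Cochran-Armitage trend test with a different genotype scoring, and the MAX statistic is simply their maximum; the sketch below (toy simulated genotypes, and using the identity that the trend chi-square equals N times the squared Pearson correlation between phenotype and score) illustrates the construction. The per-model chi-square(1) p-values are naive; the MAX statistic itself needs a permutation or multivariate-normal reference.

      import numpy as np
      from scipy.stats import chi2

      def trend_chi2(genotype, case, scores):
          # Cochran-Armitage trend statistic via X^2 = N * r^2, with r the
          # Pearson correlation between phenotype and per-subject score.
          s = np.asarray(scores, dtype=float)[genotype]
          r = np.corrcoef(s, case)[0, 1]
          return len(case) * r**2

      rng = np.random.default_rng(4)
      genotype = rng.integers(0, 3, size=1000)  # copies of the risk allele
      case = (rng.random(1000) < 0.40 + 0.05 * genotype).astype(float)

      models = {"dominant": [0, 1, 1], "additive": [0, 1, 2],
                "recessive": [0, 0, 1]}
      x2 = {m: trend_chi2(genotype, case, sc) for m, sc in models.items()}
      print(x2, max(x2.values()))  # the MAX statistic is the maximum
      print({m: chi2.sf(v, df=1) for m, v in x2.items()})  # naive p-values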

  18. Interpreting “statistical hypothesis testing” results in clinical research

    PubMed Central

    Sarmukaddam, Sanjeev B.

    2012-01-01

    The difference between “Clinical Significance and Statistical Significance” should be kept in mind while interpreting “statistical hypothesis testing” results in clinical research. This fact is already known to many but is pointed out here again because the philosophy of “statistical hypothesis testing” is sometimes unnecessarily criticized, mainly due to failure to consider this distinction. Randomized controlled trials are similarly wrongly criticized. That some scientific method may not be applicable in some peculiar/particular situation does not mean that the method is useless. Also remember that “statistical hypothesis testing” is not for decision making, and the field of “decision analysis” is very much an integral part of the science of statistics. It is not correct to say that “confidence intervals have nothing to do with confidence” unless one understands the meaning of the word “confidence” as used in the context of a confidence interval. Interpretation of the results of every study should always consider all possible alternative explanations like chance, bias, and confounding. Statistical tests in inferential statistics are, in general, designed to answer the question “How likely is it that the difference found in the random sample(s) is due to chance?” and therefore the limitation of relying only on statistical significance in making clinical decisions should be avoided. PMID:22707861

  19. [The growth behavior of mouse fibroblasts on intraocular lens surface of various silicone and PMMA materials].

    PubMed

    Kammann, J; Kreiner, C F; Kaden, P

    1994-08-01

    Experience with intraocular lenses (IOL) made of PMMA dates back ca. 40 years, while silicone IOLs have been in use for only about 10 years. The biocompatibility of PMMA and silicone caoutchouc was tested in a comparative study investigating the growth of mouse fibroblasts on different IOL materials. Spectrophotometric determination of protein synthesis and liquid scintillation counting of DNA synthesis were carried out. The spreading of cells was planimetrically determined, and the DNA synthesis of individual cells in direct contact with the test sample was tested. The results showed that the biocompatibility of silicone lenses made of purified caoutchouc is comparable with that of PMMA lenses; there is no statistically significant difference. However, impurities arising during material synthesis result in a statistically significant inhibition of cell growth on the IOL surfaces.

  20. Testing Transitivity of Preferences on Two-Alternative Forced Choice Data

    PubMed Central

    Regenwetter, Michel; Dana, Jason; Davis-Stober, Clintin P.

    2010-01-01

    As Duncan Luce and other prominent scholars have pointed out on several occasions, testing algebraic models against empirical data raises difficult conceptual, mathematical, and statistical challenges. Empirical data often result from statistical sampling processes, whereas algebraic theories are nonprobabilistic. Many probabilistic specifications lead to statistical boundary problems and are subject to nontrivial order-constrained statistical inference. The present paper discusses Luce's challenge for a particularly prominent axiom: Transitivity. The axiom of transitivity is a central component in many algebraic theories of preference and choice. We offer the currently most complete solution to the challenge in the case of transitivity of binary preference on the theory side and two-alternative forced choice on the empirical side, explicitly for up to five, and implicitly for up to seven, choice alternatives. We also discuss the relationship between our proposed solution and weak stochastic transitivity. We recommend abandoning the latter as a model of transitive individual preferences. PMID:21833217

  1. Melody and pitch processing in five musical savants with congenital blindness.

    PubMed

    Pring, Linda; Woolf, Katherine; Tadic, Valerie

    2008-01-01

    We examined absolute-pitch (AP) and short-term musical memory abilities of five musical savants with congenital blindness, seven musicians, and seven non-musicians with good vision and normal intelligence in two experiments. In the first, short-term memory for musical phrases was tested and the savants and musicians performed statistically indistinguishably, both significantly outperforming the non-musicians and remembering more material from the C major scale sequences than random trials. In the second experiment, participants learnt associations between four pitches and four objects using a non-verbal paradigm. This experiment approximates to testing AP ability. Low statistical power meant the savants were not statistically better than the musicians, although only the savants scored statistically higher than the non-musicians. The results are evidence for a musical module, separate from general intelligence; they also support the anecdotal reporting of AP in musical savants, which is thought to be necessary for the development of musical-savant skill.

  2. Decadal power in land air temperatures: Is it statistically significant?

    NASA Astrophysics Data System (ADS)

    Thejll, Peter A.

    2001-12-01

    The geographical distribution and properties of the well-known 10-11 year signal in terrestrial temperature records are investigated. By analyzing the Global Historical Climate Network data for surface air temperatures we verify that the signal is strongest in North America and is similar in nature to that reported earlier by R. G. Currie. The decadal signal is statistically significant for individual stations, but it is not possible to show that the signal is statistically significant globally, using strict tests. In North America, during the twentieth century, the decadal variability in the solar activity cycle is associated with the decadal part of the North Atlantic Oscillation index series in such a way that both of these signals correspond to the same spatial pattern of cooling and warming. A method for testing statistical results with Monte Carlo trials on data fields with specified temporal structure and specific spatial correlation retained is presented.

  3. Précis of statistical significance: rationale, validity, and utility.

    PubMed

    Chow, S L

    1998-04-01

    The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H0, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics.

  4. Learning during a Collaborative Final Exam

    ERIC Educational Resources Information Center

    Dahlstrom, Orjan

    2012-01-01

    Collaborative testing has been suggested to serve as a good learning activity, for example, compared to individual testing. The aim of the present study was to measure learning at different levels of knowledge during a collaborative final exam in a course in basic methods and statistical procedures. Results on pre- and post-tests taken…

  5. Outlier Detection in High-Stakes Certification Testing.

    ERIC Educational Resources Information Center

    Meijer, Rob R.

    2002-01-01

    Used empirical data from a certification test to study methods from statistical process control that have been proposed to classify an item score pattern as fitting or misfitting the underlying item response theory model in computerized adaptive testing. Results for 1,392 examinees show that different types of misfit can be distinguished. (SLD)

  6. Effects of Irrigating Tree Seedlings with a Nutrient Solution

    Treesearch

    R. P. Belanger; C. B. Briscoe

    1963-01-01

    Subsurface irrigation with nutrient solution was found to be biologically feasible under the conditions tested. Growth of seedlings was satisfactory, but not unusually good. On the bases of total height growth, and growth in fresh weight, the various fertilizers tested produced statistically different results. The species tested, members of three different families and...

  7. Spatial Thinking Ability Assessment in Rwandan Secondary Schools: Baseline Results

    ERIC Educational Resources Information Center

    Tomaszewski, Brian; Vodacek, Anthony; Parody, Robert; Holt, Nicholas

    2015-01-01

    This article discusses use and modification of Lee and Bednarz's (2012) Spatial Thinking Ability Test (STAT) as a spatial thinking assessment device in Rwandan secondary schools. After piloting and modifying the STAT, 222 students total from our rural and urban test schools and one control school were tested. Statistical analysis revealed that…

  8. Statistical tests to compare motif count exceptionalities

    PubMed Central

    Robin, Stéphane; Schbath, Sophie; Vandewalle, Vincent

    2007-01-01

    Background Finding over- or under-represented motifs in biological sequences is now a common task in genomics. Thanks to p-value calculation for motif counts, exceptional motifs are identified and represent candidate functional motifs. The present work addresses the related question of comparing the exceptionality of one motif in two different sequences. Just comparing the motif count p-values in each sequence is indeed not sufficient to decide if this motif is significantly more exceptional in one sequence compared to the other one. A statistical test is required. Results We develop and analyze two statistical tests, an exact binomial one and an asymptotic likelihood ratio test, to decide whether the exceptionality of a given motif is equivalent or significantly different in two sequences of interest. For that purpose, motif occurrences are modeled by Poisson processes, with special care for overlapping motifs. Both tests can take the sequence compositions into account. As an illustration, we compare the octamer exceptionalities in the Escherichia coli K-12 backbone versus variable strain-specific loops. Conclusion The exact binomial test is particularly adapted for small counts. For large counts, we advise using the likelihood ratio test, which is asymptotic but strongly correlated with the exact binomial test and very simple to use. PMID:17346349
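
    The exact binomial test has the following general structure (a sketch with invented counts and expected values, not the authors' implementation): conditional on the total count over the two sequences, the count in the first sequence is binomial with a success probability fixed by the two expected counts under the null of equal exceptionality.

      from scipy.stats import binomtest

      # observed counts of the motif in the two sequences (hypothetical)
      n1, n2 = 42, 17
      # expected counts under each sequence's own background/composition
      # model (hypothetical values; in practice fitted per sequence)
      e1, e2 = 20.0, 15.0

      # conditional on n1 + n2, n1 is Binomial(n1 + n2, p0) under the null
      p0 = e1 / (e1 + e2)
      result = binomtest(n1, n1 + n2, p0, alternative="two-sided")
      print(result.pvalue)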

  9. Analysis of Statistical Methods and Errors in the Articles Published in the Korean Journal of Pain

    PubMed Central

    Yim, Kyoung Hoon; Han, Kyoung Ah; Park, Soo Young

    2010-01-01

    Background Statistical analysis is essential in regard to obtaining objective reliability for medical research. However, medical researchers do not have enough statistical knowledge to properly analyze their study data. To help understand and potentially alleviate this problem, we have analyzed the statistical methods and errors of articles published in the Korean Journal of Pain (KJP), with the intention to improve the statistical quality of the journal. Methods All the articles, except case reports and editorials, published from 2004 to 2008 in the KJP were reviewed. The types of applied statistical methods and errors in the articles were evaluated. Results One hundred and thirty-nine original articles were reviewed. Inferential statistics and descriptive statistics were used in 119 papers and 20 papers, respectively. Only 20.9% of the papers were free from statistical errors. The most commonly adopted statistical method was the t-test (21.0%) followed by the chi-square test (15.9%). Errors of omission were encountered 101 times in 70 papers. Among the errors of omission, "no statistics used even though statistical methods were required" was the most common (40.6%). The errors of commission were encountered 165 times in 86 papers, among which "parametric inference for nonparametric data" was the most common (33.9%). Conclusions We found various types of statistical errors in the articles published in the KJP. This suggests that meticulous attention should be given not only in applying statistical procedures but also in the reviewing process to improve the value of the article. PMID:20552071

  10. Statistical auditing and randomness test of lotto k/N-type games

    NASA Astrophysics Data System (ADS)

    Coronel-Brizio, H. F.; Hernández-Montoya, A. R.; Rapallo, F.; Scalas, E.

    2008-11-01

    One of the most popular lottery games worldwide is the so-called “lotto k/N”. It considers N numbers 1,2,…,N from which k are drawn randomly, without replacement. A player selects k or more numbers and the first prize is shared amongst those players whose selected numbers match all of the k randomly drawn. Exact rules may vary in different countries. In this paper, mean values and covariances for the random variables representing the numbers drawn from this kind of game are presented, with the aim of using them to audit statistically the consistency of a given sample of historical results with theoretical values coming from a hypergeometric statistical model. The method can be adapted to test pseudorandom number generators.
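
    A minimal version of such an audit in Python (a sketch assuming a 6/49 game and a simulated draw history, not the paper's data): under the hypergeometric model every number has expected frequency n_draws * k / N, which a chi-square goodness-of-fit statistic can check. The within-draw dependence from sampling without replacement makes the chi-square reference only approximate, which is one reason the paper derives the exact means and covariances.

      import numpy as np
      from scipy.stats import chisquare

      rng = np.random.default_rng(5)
      n_draws, k, N = 1000, 6, 49

      # simulated draw history: k numbers per draw, without replacement
      draws = np.array([rng.choice(N, size=k, replace=False) + 1
                        for _ in range(n_draws)])

      counts = np.bincount(draws.ravel(), minlength=N + 1)[1:]
      expected = np.full(N, n_draws * k / N)  # same for every number
      print(chisquare(counts, expected))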

  11. Evaluation of gloss changes of two denture acrylic resin materials in four different beverages.

    PubMed

    Keyf, Filiz; Etikan, Ilker

    2004-03-01

    The primary disadvantage of the materials used in the construction of complete and removable partial dentures is that their esthetic, physical and mechanical properties change rapidly with time in the oral environment. For esthetics, color stability is one of the criteria that needs careful attention. Color may provide important information on the serviceability of these materials. Color change affects the gloss of these materials. The objective of the present study was to determine the gloss changes resulting from the testing process in four different beverages in one heat-polymerized denture base resin and one cold-polymerized denture base repair resin. Thirty-six samples were fabricated for each material. Each sample had a smooth polished and a rough unpolished surface. The gloss measurements were made with a glossmeter before testing. Four different beverages (tea, coffee, cola and cherry juice) were used for testing. Two angles of illumination (20 and 60 degrees) were used for the gloss measurements. The samples were immersed in water, tea, coffee, cola and cherry juice solutions. The gloss of the samples was measured again with the glossmeter at the end of the 45th day and 135th day of testing. The arithmetic mean and standard deviation of each of the samples were calculated and compared statistically using the Wilcoxon test (within times) (p ≤ 0.05 significant), the Kruskal-Wallis analysis of variance (p ≤ 0.05 significant) and the Mann-Whitney U-test with Bonferroni correction (when the difference between the samples was significant) (p ≤ 0.05 significant). The results of this study revealed that gloss changes occurred after testing in the heat-polymerized denture base resin and the cold-polymerized denture base repair resin. The significance of the gloss changes exhibited by each sample, kept for different lengths of time in the same solution, was compared using the Wilcoxon test. The results were statistically significant (p ≤ 0.05). According to the Kruskal-Wallis analysis of variance, the difference between measurements at the two angles of illumination was statistically significant (p ≤ 0.05). Also, according to the Mann-Whitney U-test, the difference between two polished surfaces or two unpolished surfaces was statistically insignificant (p > 0.05), but the difference between smooth polished and rough unpolished surfaces was statistically significant (p ≤ 0.05). It was found that the gloss of both the heat-polymerized denture base resin and the cold-polymerized denture base repair resin was affected by the tested agents, and all four beverages produced noticeable gloss changes. Cherry juice demonstrated the least change, while tea exhibited the greatest change.

  12. Influence of manufacturing parameters on the strength of PLA parts using Layered Manufacturing technique: A statistical approach

    NASA Astrophysics Data System (ADS)

    Jaya Christiyan, K. G.; Chandrasekhar, U.; Mathivanan, N. Rajesh; Venkateswarlu, K.

    2018-02-01

    3D printing was successfully used to fabricate samples of Polylactic Acid (PLA). Processing parameters such as lay-up speed, lay-up thickness, and printing nozzle diameter were varied. All samples were tested for flexural strength using a three-point load test. A statistical mathematical model was developed to correlate the processing parameters with flexural strength. The results clearly demonstrated that lay-up thickness and nozzle diameter influenced flexural strength significantly, whereas lay-up speed hardly influenced the flexural strength.

  13. Differences between Lab Completion and Non-Completion on Student Performance in an Online Undergraduate Environmental Science Program

    NASA Astrophysics Data System (ADS)

    Corsi, Gianluca

    2011-12-01

    Web-based technology has revolutionized the way education is delivered. Although the advantages of online learning appeal to large numbers of students, some concerns arise. One major concern in online science education is the value that participation in labs has on student performance. The purpose of this study was to assess the relationships between lab completion and student academic success as measured by test grades, scientific self-confidence, scientific skills, and concept mastery. A random sample of 114 volunteer undergraduate students, from an online Environmental Science program at the American Public University System, was tested. The study followed a quantitative, non-experimental research design. Paired sample t-tests were used for statistical comparison between pre-lab and post-lab test grades, two scientific skills quizzes, and two scientific self-confidence surveys administered at the beginning and at the end of the course. The results of the paired sample t-tests revealed statistically significant improvements on all post-lab test scores: Air Pollution lab, t(112) = 6.759, p < .001; Home Chemicals lab t(114) = 8.585, p < .001; Water Use lab, t(116) = 6.657, p < .001; Trees and Carbon lab, t(113) = 9.921, p < .001; Stratospheric Ozone lab, t(112) =12.974, p < .001; Renewable Energy lab, t(115) = 7.369, p < .001. The end of the course Scientific Skills quiz revealed statistically significant improvements, t(112) = 8.221, p < .001. The results of the two surveys showed a statistically significant improvement on student Scientific Self-Confidence because of lab completion, t(114) = 3.015, p < .05. Because age and gender were available, regression models were developed. The results indicated weak multiple correlation coefficients and were not statistically significant at alpha = .05. Evidence suggests that labs play a positive role in a student's academic success. It is recommended that lab experiences be included in all online Environmental Science programs, with emphasis on open-ended inquiries, and adoption of online tools to enhance hands-on experiences, such as virtual reality platforms and digital animations. Future research is encouraged to investigate possible correlations between socio-demographic attributes and academic success of students enrolled in online science programs in reference to lab completion.

  14. Low power and type II errors in recent ophthalmology research.

    PubMed

    Khan, Zainab; Milko, Jordan; Iqbal, Munir; Masri, Moness; Almeida, David R P

    2016-10-01

    To investigate the power of unpaired t tests in prospective, randomized controlled trials when these tests failed to detect a statistically significant difference and to determine the frequency of type II errors. Systematic review and meta-analysis. We examined all prospective, randomized controlled trials published between 2010 and 2012 in 4 major ophthalmology journals (Archives of Ophthalmology, British Journal of Ophthalmology, Ophthalmology, and American Journal of Ophthalmology). Studies that used unpaired t tests were included. Power was calculated using the number of subjects in each group, standard deviations, and α = 0.05. The difference between control and experimental means was set to be (1) 20% and (2) 50% of the absolute value of the control's initial conditions. Power and Precision version 4.0 software was used to carry out calculations. Finally, the proportion of articles with type II errors was calculated. β = 0.3 was set as the largest acceptable value for the probability of type II errors. In total, 280 articles were screened. Final analysis included 50 prospective, randomized controlled trials using unpaired t tests. The median power of tests to detect a 50% difference between means was 0.9 and was the same for all 4 journals regardless of the statistical significance of the test. The median power of tests to detect a 20% difference between means ranged from 0.26 to 0.9 for the 4 journals. The median power of these tests to detect a 50% and 20% difference between means was 0.9 and 0.5 for tests that did not achieve statistical significance. A total of 14% and 57% of articles with negative unpaired t tests contained results with β > 0.3 when power was calculated for differences between means of 50% and 20%, respectively. A large portion of studies demonstrate high probabilities of type II errors when detecting small differences between means. The power to detect small differences between means varies across journals. It is, therefore, worthwhile for authors to mention the minimum clinically important difference for individual studies. Journals can consider publishing statistical guidelines for authors to use. Day-to-day clinical decisions rely heavily on the evidence base formed by the plethora of studies available to clinicians. Prospective, randomized controlled clinical trials are highly regarded as robust studies and are used to make important clinical decisions that directly affect patient care. The quality of study designs and statistical methods in major clinical journals is improving over time, and researchers and journals are becoming more attentive to the statistical methodologies incorporated in studies. The results of well-designed ophthalmic studies with robust methodologies, therefore, have the ability to modify the ways in which diseases are managed. Copyright © 2016 Canadian Ophthalmological Society. Published by Elsevier Inc. All rights reserved.
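
    Power calculations of this kind are straightforward to reproduce; the sketch below (with an invented per-arm sample size, standard deviation, and control mean) converts 20% and 50% differences between means into Cohen's d and evaluates two-sample t-test power with statsmodels.

      from statsmodels.stats.power import TTestIndPower

      n, sd, control_mean = 25, 10.0, 40.0  # hypothetical trial parameters
      analysis = TTestIndPower()
      for frac in (0.20, 0.50):  # difference as a fraction of the control mean
          d = frac * control_mean / sd  # Cohen's d for that difference
          power = analysis.power(effect_size=d, nobs1=n, alpha=0.05)
          print(frac, round(d, 2), round(power, 2))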

  15. Extending local canonical correlation analysis to handle general linear contrasts for FMRI data.

    PubMed

    Jin, Mingwu; Nandy, Rajesh; Curran, Tim; Cordes, Dietmar

    2012-01-01

    Local canonical correlation analysis (CCA) is a multivariate method that has been proposed to more accurately determine activation patterns in fMRI data. In its conventional formulation, CCA has several drawbacks that limit its usefulness in fMRI. A major drawback is that, unlike the general linear model (GLM), a test of general linear contrasts of the temporal regressors has not been incorporated into the CCA formalism. To overcome this drawback, a novel directional test statistic was derived using the equivalence of multivariate multiple regression (MVMR) and CCA. This extension will allow CCA to be used for inference of general linear contrasts in more complicated fMRI designs without reparameterization of the design matrix and without reestimating the CCA solutions for each particular contrast of interest. With the proper constraints on the spatial coefficients of CCA, this test statistic can yield a more powerful test on the inference of evoked brain regional activations from noisy fMRI data than the conventional t-test in the GLM. The quantitative results from simulated and pseudoreal data and activation maps from fMRI data were used to demonstrate the advantage of this novel test statistic.

  17. Assessment of statistical significance and clinical relevance.

    PubMed

    Kieser, Meinhard; Friede, Tim; Gondan, Matthias

    2013-05-10

    In drug development, it is well accepted that a successful study will demonstrate not only a statistically significant result but also a clinically relevant effect size. Whereas standard hypothesis tests are used to demonstrate the former, it is less clear how the latter should be established. In the first part of this paper, we consider the responder analysis approach and study the performance of locally optimal rank tests when the outcome distribution is a mixture of responder and non-responder distributions. We find that these tests are quite sensitive to their planning assumptions and therefore have no real advantage over standard tests such as the t-test and the Wilcoxon-Mann-Whitney test, which perform well overall and can be recommended for applications. In the second part, we present a new approach to the assessment of clinical relevance based on the so-called relative effect (or probabilistic index) and derive appropriate sample size formulae for the design of studies aiming at demonstrating both a statistically significant and clinically relevant effect. Referring to recent studies in multiple sclerosis, we discuss potential issues in the application of this approach. Copyright © 2012 John Wiley & Sons, Ltd.

  18. Staging Liver Fibrosis with Statistical Observers

    NASA Astrophysics Data System (ADS)

    Brand, Jonathan Frieman

    Chronic liver disease is a worldwide health problem, and hepatic fibrosis (HF) is one of the hallmarks of the disease. Pathology diagnosis of HF is based on textural change in the liver as a lobular collagen network develops within portal triads. The scale of the collagen lobules is characteristically on the order of 1 mm, which is close to the resolution limit of in vivo Gd-enhanced MRI. In this work, methods to collect training and testing images for a Hotelling observer are covered. An observer based on local texture analysis is trained and tested using wet-tissue phantoms. The technique is used to optimize the MRI sequence based on task performance. The final method developed is a two-stage model observer to classify fibrotic and healthy tissue in both phantoms and in vivo MRI images. The first-stage observer tests for the presence of local texture. Test statistics from the first observer are used to train the second-stage observer to globally sample the local observer results. A decision on the disease class is made for an entire MRI image slice using test statistics collected from the second observer. The techniques are tested on wet-tissue phantoms and in vivo clinical patient data.
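
    A sketch of a linear Hotelling observer test statistic on two-class feature vectors, assuming Gaussian toy "texture features"; the two-stage global sampling described in the abstract is not reproduced, and all data here are synthetic.

        import numpy as np

        def hotelling_template(class0, class1):
            """Hotelling template w = S^-1 (mean1 - mean0) from training feature vectors."""
            m0, m1 = class0.mean(axis=0), class1.mean(axis=0)
            # Average intra-class scatter; a small ridge keeps the inverse stable.
            S = 0.5 * (np.cov(class0.T) + np.cov(class1.T)) + 1e-6 * np.eye(class0.shape[1])
            return np.linalg.solve(S, m1 - m0)

        rng = np.random.default_rng(1)
        healthy  = rng.normal(0.0, 1.0, size=(200, 8))   # hypothetical texture features
        fibrotic = rng.normal(0.4, 1.0, size=(200, 8))
        w = hotelling_template(healthy, fibrotic)
        t = rng.normal(0.4, 1.0, size=8) @ w             # test statistic for one new sample
        print(t)                                          # large t suggests the fibrotic class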

  19. Quality Assurance for Rapid Airfield Construction

    DTIC Science & Technology

    2008-05-01

    necessary to conduct a volume-replacement density test for in-place soil. This density test, which was developed during this investigation, involves...the test both simpler and quicker. The Clegg hammer results are the primary means of judging compaction; thus, the requirements for density tests are...minimized through a stepwise acceptance procedure. Statistical criteria for evaluating Clegg hammer and density measurements are also included

  20. Estimating the proportion of true null hypotheses when the statistics are discrete.

    PubMed

    Dialsingh, Isaac; Austin, Stefanie R; Altman, Naomi S

    2015-07-15

    In high-dimensional testing problems, π0, the proportion of null hypotheses that are true, is an important parameter. For discrete test statistics, the P values come from a discrete distribution with finite support, and the null distribution may depend on an ancillary statistic, such as a table margin, that varies among the test statistics. Methods for estimating π0 developed for continuous test statistics, which depend on a uniform or identical null distribution of P values, may not perform well when applied to discrete testing problems. This article introduces a number of π0 estimators, the regression and 'T' methods, that perform well with discrete test statistics, and also assesses how well methods developed for or adapted from continuous tests perform with discrete tests. We demonstrate the usefulness of these estimators in the analysis of high-throughput biological RNA-seq and single-nucleotide polymorphism data. The methods are implemented in R. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
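
    For contrast with the discrete-statistic estimators introduced above, a minimal sketch of a Storey-type λ estimator of π0 developed for continuous P values, which the article notes may perform poorly with discrete tests; the λ value and mixture below are illustrative assumptions.

        import numpy as np

        def pi0_storey(pvals, lam=0.5):
            """Storey-type estimate of the proportion of true nulls from P values."""
            pvals = np.asarray(pvals)
            return min(1.0, (pvals > lam).mean() / (1.0 - lam))

        rng = np.random.default_rng(2)
        # 80% true nulls (uniform P values) mixed with 20% alternatives near zero.
        p = np.concatenate([rng.uniform(size=800), rng.beta(0.5, 8.0, size=200)])
        print(pi0_storey(p))   # should land near 0.8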

  1. Optimizing the design of a reproduction toxicity test with the pond snail Lymnaea stagnalis.

    PubMed

    Charles, Sandrine; Ducrot, Virginie; Azam, Didier; Benstead, Rachel; Brettschneider, Denise; De Schamphelaere, Karel; Filipe Goncalves, Sandra; Green, John W; Holbech, Henrik; Hutchinson, Thomas H; Faber, Daniel; Laranjeiro, Filipe; Matthiessen, Peter; Norrgren, Leif; Oehlmann, Jörg; Reategui-Zirena, Evelyn; Seeland-Fremer, Anne; Teigeler, Matthias; Thome, Jean-Pierre; Tobor Kaplon, Marysia; Weltje, Lennart; Lagadic, Laurent

    2016-11-01

    This paper presents the results from two ring-tests addressing the feasibility, robustness and reproducibility of a reproduction toxicity test with the freshwater gastropod Lymnaea stagnalis (RENILYS strain). Sixteen laboratories (ranging from inexperienced to expert in mollusc testing) from nine countries participated in these ring-tests. Survival and reproduction were evaluated in L. stagnalis exposed to cadmium, tributyltin, prochloraz and trenbolone according to an OECD draft Test Guideline. In total, 49 datasets were analysed to assess the practicability of the proposed experimental protocol and to estimate the between-laboratory reproducibility of toxicity endpoint values. The statistical analysis of count data (number of clutches or eggs per individual-day) leading to ECx estimation was specifically developed and automated through a free web interface. Based on a complementary statistical analysis, the optimal test duration was established, and the most sensitive and cost-effective reproduction toxicity endpoint was identified for use as the core endpoint. This validation process and the resulting optimized protocol were used to consolidate the OECD Test Guideline for the evaluation of reproductive effects of chemicals in L. stagnalis. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Statistical analysis and digital processing of the Mössbauer spectra

    NASA Astrophysics Data System (ADS)

    Prochazka, Roman; Tucek, Pavel; Tucek, Jiri; Marek, Jaroslav; Mashlan, Miroslav; Pechousek, Jiri

    2010-02-01

    This work is focused on the use of statistical methods and the development of filtration procedures for signal processing in Mössbauer spectroscopy. Statistical tools for noise filtering in measured spectra are used in many scientific areas. The use of a purely statistical approach to the filtration of accumulated Mössbauer spectra is described. In Mössbauer spectroscopy, the noise can be considered a Poisson statistical process with a Gaussian distribution for high numbers of observations. This noise is a superposition of non-resonant photon counting, electronic noise (from the γ-ray detection and discrimination units), and the quality of the velocity system, which can be characterized by velocity nonlinearities. The possibility of noise reduction using a newly designed statistical filter procedure is described. This mathematical procedure improves the signal-to-noise ratio and thus makes it easier to determine the hyperfine parameters of the given Mössbauer spectra. The filter procedure is based on a periodogram method that makes it possible to identify the statistically important components in the spectral domain. The significance level for these components is then feedback-controlled using the results of a correlation coefficient test. The theoretical correlation coefficient level corresponding to the spectrum resolution is estimated. The correlation coefficient test is based on a comparison of the theoretical and experimental correlation coefficients given by the Spearman method. The correctness of this solution was analyzed by a series of statistical tests and confirmed by many spectra measured with increasing statistical quality for a given sample (absorber). The effect of the filter procedure depends on the signal-to-noise ratio, and the applicability of the method is subject to binding conditions.
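
    A simplified sketch of periodogram-based filtering in the spirit described: keep only spectral components whose power exceeds a threshold and invert the transform. The fixed median-based threshold here is a crude placeholder for the article's Spearman-based feedback control, and the signal is synthetic.

        import numpy as np

        def periodogram_filter(signal, keep_factor=3.0):
            """Zero out spectral components below keep_factor times the median power."""
            spec = np.fft.rfft(signal)
            power = np.abs(spec) ** 2
            mask = power > keep_factor * np.median(power)   # crude significance screen
            return np.fft.irfft(spec * mask, n=len(signal))

        rng = np.random.default_rng(3)
        t = np.linspace(0, 1, 512, endpoint=False)
        noisy = np.sin(2 * np.pi * 8 * t) + 0.8 * rng.standard_normal(512)
        clean = periodogram_filter(noisy)
        print(np.corrcoef(clean, np.sin(2 * np.pi * 8 * t))[0, 1])  # improved agreement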

  3. RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

    PubMed Central

    Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo

    2007-01-01

    Background The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Results Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request. PMID:17961253

  4. Statistical Stationarity of Sediment Interbed Thicknesses in a Basalt Aquifer, Idaho National Laboratory, Eastern Snake River Plain, Idaho

    USGS Publications Warehouse

    Stroup, Caleb N.; Welhan, John A.; Davis, Linda C.

    2008-01-01

    The statistical stationarity of distributions of sedimentary interbed thicknesses within the southwestern part of the Idaho National Laboratory (INL) was evaluated within the stratigraphic framework of Quaternary sediments and basalts at the INL site, eastern Snake River Plain, Idaho. The thicknesses of 122 sedimentary interbeds observed in 11 coreholes were documented from lithologic logs and independently inferred from natural-gamma logs. Lithologic information was grouped into composite time-stratigraphic units based on correlations with existing composite-unit stratigraphy near these holes. The assignment of lithologic units to an existing chronostratigraphy on the basis of nearby composite stratigraphic units may introduce error where correlations with nearby holes are ambiguous or the distance between holes is great, but we consider this the best technique for grouping stratigraphic information in this geologic environment at this time. Nonparametric tests of similarity were used to evaluate temporal and spatial stationarity in the distributions of sediment thickness. The following statistical tests were applied to the data: (1) the Kolmogorov-Smirnov (K-S) two-sample test to compare distribution shape, (2) the Mann-Whitney (M-W) test for similarity of two medians, (3) the Kruskal-Wallis (K-W) test for similarity of multiple medians, and (4) Levene's (L) test for the similarity of two variances. Results of these analyses corroborate previous work that concluded the thickness distributions of Quaternary sedimentary interbeds are locally stationary in space and time. The data set used in this study was relatively small, so the results presented should be considered preliminary, pending incorporation of data from more coreholes. Statistical tests also demonstrated that natural-gamma logs consistently fail to detect interbeds less than about 2-3 ft thick, although these interbeds are observable in lithologic logs. This should be taken into consideration when modeling aquifer lithology or hydraulic properties based on lithology.
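
    The four nonparametric tests listed are all available in scipy.stats; a minimal sketch applying them to hypothetical interbed-thickness samples (Kruskal-Wallis is shown with three groups, since it targets multiple medians). The lognormal samples are illustrative, not the USGS data.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(4)
        a = rng.lognormal(mean=1.0, sigma=0.5, size=40)   # hypothetical thicknesses (ft)
        b = rng.lognormal(mean=1.1, sigma=0.5, size=35)
        c = rng.lognormal(mean=0.9, sigma=0.6, size=30)

        print(stats.ks_2samp(a, b))        # (1) K-S: distribution shape
        print(stats.mannwhitneyu(a, b))    # (2) M-W: two medians
        print(stats.kruskal(a, b, c))      # (3) K-W: multiple medians
        print(stats.levene(a, b))          # (4) Levene: two variances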

  5. Clinical relevance vs. statistical significance: Using neck outcomes in patients with temporomandibular disorders as an example.

    PubMed

    Armijo-Olivo, Susan; Warren, Sharon; Fuentes, Jorge; Magee, David J

    2011-12-01

    Statistical significance has been used extensively to evaluate the results of research studies. Nevertheless, it offers only limited information to clinicians. The assessment of clinical relevance can facilitate the interpretation of research results into clinical practice. The objective of this study was to explore different methods of evaluating the clinical relevance of results, using as an example a cross-sectional study comparing different neck outcomes between subjects with temporomandibular disorders and healthy controls. Subjects were compared for head and cervical posture, maximal cervical muscle strength, endurance of the cervical flexor and extensor muscles, and electromyographic activity of the cervical flexor muscles during the CranioCervical Flexion Test (CCFT). The evaluation of the clinical relevance of the results was performed based on the effect size (ES), minimal important difference (MID), and clinical judgement. The results of this study show that it is possible to have statistical significance without clinical relevance, to have both statistical significance and clinical relevance, to have clinical relevance without statistical significance, or to have neither. The evaluation of clinical relevance in clinical research is crucial to simplify the transfer of knowledge from research into practice. Clinical researchers should present the clinical relevance of their results. Copyright © 2011 Elsevier Ltd. All rights reserved.
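
    A sketch of the effect-size half of such an assessment: Cohen's d with a comparison against a minimal important difference. The outcome scores and the MID value below are placeholders, not the study's neck outcome data.

        import numpy as np

        def cohens_d(x, y):
            """Standardized mean difference using the pooled standard deviation."""
            nx, ny = len(x), len(y)
            pooled_var = ((nx - 1) * np.var(x, ddof=1) +
                          (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
            return (np.mean(x) - np.mean(y)) / np.sqrt(pooled_var)

        tmd      = [42.0, 39.5, 44.1, 40.2]   # placeholder outcome scores
        controls = [47.3, 46.8, 49.0, 45.5]
        MID = 5.0                              # assumed minimal important difference
        print(cohens_d(controls, tmd), np.mean(controls) - np.mean(tmd) >= MID)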

  6. Ensuring Positiveness of the Scaled Difference Chi-Square Test Statistic

    ERIC Educational Resources Information Center

    Satorra, Albert; Bentler, Peter M.

    2010-01-01

    A scaled difference test statistic $\tilde{T}_d$ that can be computed by hand calculations from the standard software output of structural equation models (SEM) was proposed in Satorra and Bentler (Psychometrika 66:507-514, 2001). The statistic $\tilde{T}_d$ is asymptotically equivalent to the scaled difference test statistic $\bar{T}_{…}$

  7. Nursing students' attitudes toward statistics: Effect of a biostatistics course and association with examination performance.

    PubMed

    Kiekkas, Panagiotis; Panagiotarou, Aliki; Malja, Alvaro; Tahirai, Daniela; Zykai, Rountina; Bakalis, Nick; Stefanopoulos, Nikolaos

    2015-12-01

    Although statistical knowledge and skills are necessary for promoting evidence-based practice, health sciences students have expressed anxiety about statistics courses, which may hinder their learning of statistical concepts. To evaluate the effects of a biostatistics course on nursing students' attitudes toward statistics and to explore the association between these attitudes and their performance in the course examination. One-group quasi-experimental pre-test/post-test design. Undergraduate nursing students in the fifth or higher semester of studies who attended a biostatistics course. Participants were asked to complete the pre-test and post-test forms of the Survey of Attitudes Toward Statistics (SATS-36) scale at the beginning and end of the course, respectively. Pre-test and post-test scale scores were compared, while correlations between post-test scores and participants' examination performance were estimated. Among 156 participants, post-test scores on the overall SATS-36 scale and on the Affect, Cognitive Competence, Interest and Effort components were significantly higher than pre-test ones, indicating that the course was followed by more positive attitudes toward statistics. Among 104 students who participated in the examination, higher post-test scores on the overall SATS-36 scale and on the Affect, Difficulty, Interest and Effort components were significantly but weakly correlated with higher examination performance. Students' attitudes toward statistics can be improved through appropriate biostatistics courses, while positive attitudes contribute to higher course achievement and possibly to improved statistical skills in later professional life. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Design of a factorial experiment with randomization restrictions to assess medical device performance on vascular tissue.

    PubMed

    Diestelkamp, Wiebke S; Krane, Carissa M; Pinnell, Margaret F

    2011-05-20

    Energy-based surgical scalpels are designed to efficiently transect and seal blood vessels using thermal energy to promote protein denaturation and coagulation. Assessment and design improvement of ultrasonic scalpel performance relies on both in vivo and ex vivo testing. The objective of this work was to design and implement a robust, experimental test matrix with randomization restrictions and predictive statistical power, which allowed for identification of those experimental variables that may affect the quality of the seal obtained ex vivo. The design of the experiment included three factors: temperature (two levels); the type of solution used to perfuse the artery during transection (three types); and artery type (two types) resulting in a total of twelve possible treatment combinations. Burst pressures of porcine carotid and renal arteries sealed ex vivo were assigned as the response variable. The experimental test matrix was designed and carried out as a split-plot experiment in order to assess the contributions of several variables and their interactions while accounting for randomization restrictions present in the experimental setup. The statistical software package SAS was utilized and PROC MIXED was used to account for the randomization restrictions in the split-plot design. The combination of temperature, solution, and vessel type had a statistically significant impact on seal quality. The design and implementation of a split-plot experimental test-matrix provided a mechanism for addressing the existing technical randomization restrictions of ex vivo ultrasonic scalpel performance testing, while preserving the ability to examine the potential effects of independent factors or variables. This method for generating the experimental design and the statistical analyses of the resulting data are adaptable to a wide variety of experimental problems involving large-scale tissue-based studies of medical or experimental device efficacy and performance.
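
    statsmodels' mixedlm offers a rough Python analogue of the SAS PROC MIXED call described; the data frame, column names and sizes below are hypothetical, and a full split-plot error structure may need more than the single random intercept shown.

        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(5)
        df = pd.DataFrame({
            "burst":    rng.normal(400, 60, size=48),          # hypothetical burst pressures
            "temp":     np.repeat(["low", "high"], 24),        # whole-plot factor
            "solution": np.tile(np.repeat(["A", "B", "C"], 8), 2),
            "vessel":   np.tile(["carotid", "renal"], 24),
            "plot":     np.repeat(np.arange(8), 6),            # whole-plot (randomization) unit
        })

        # A random intercept per whole plot accounts for the randomization restriction.
        model = smf.mixedlm("burst ~ temp * solution * vessel", df, groups=df["plot"])
        print(model.fit().summary())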

  9. How conservative is Fisher's exact test? A quantitative evaluation of the two-sample comparative binomial trial.

    PubMed

    Crans, Gerald G; Shuster, Jonathan J

    2008-08-15

    The debate as to which statistical methodology is most appropriate for the analysis of the two-sample comparative binomial trial has persisted for decades. Practitioners who favor the conditional methods of Fisher, i.e., Fisher's exact test (FET), claim that only experimental outcomes containing the same amount of information should be considered when performing analyses. Hence, the total number of successes should be fixed at its observed level in hypothetical repetitions of the experiment. Using conditional methods in clinical settings can pose interpretation difficulties, since results are derived using conditional sample spaces rather than the set of all possible outcomes. Perhaps more importantly from a clinical trial design perspective, this test can be too conservative, resulting in greater resource requirements and more subjects exposed to an experimental treatment. The actual significance level attained by FET (the size of the test) has not been reported in the statistical literature. Berger (J. R. Statist. Soc. D (The Statistician) 2001; 50:79-85) proposed assessing the conservativeness of conditional methods using p-value confidence intervals. In this paper we develop a numerical algorithm that calculates the size of FET for sample sizes, n, up to 125 per group at the two-sided significance level α = 0.05. Additionally, this numerical method is used to define new significance levels α* = α + ε, where ε is a small positive number, for each n, such that the size of the test is as close as possible to the pre-specified α (0.05 for the current work) without exceeding it. Lastly, a sample size and power calculation example is presented, which demonstrates the statistical advantages of implementing the adjustment to FET (using α* instead of α) in the two-sample comparative binomial trial. Copyright © 2008 John Wiley & Sons, Ltd.
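
    A sketch of the size computation under stated assumptions: enumerate the unconditional sample space for equal group sizes, weight each table by its binomial probability under p1 = p2 = p, and take the maximum rejection probability over a grid of p. The n here is kept small so the double loop stays fast, unlike the paper's n up to 125.

        import numpy as np
        from scipy import stats

        def fet_size(n, alpha=0.05, grid=np.linspace(0.01, 0.99, 99)):
            """Size (max type I error) of two-sided Fisher's exact test, two groups of n."""
            reject = np.array([[stats.fisher_exact([[x1, n - x1], [x2, n - x2]])[1] <= alpha
                                for x2 in range(n + 1)] for x1 in range(n + 1)])
            sizes = []
            for p in grid:
                pmf = stats.binom.pmf(np.arange(n + 1), n, p)
                sizes.append(pmf @ reject @ pmf)   # P(reject) when both arms have rate p
            return max(sizes)

        print(fet_size(15))   # well below the nominal 0.05, illustrating conservativeness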

  10. Design of a factorial experiment with randomization restrictions to assess medical device performance on vascular tissue

    PubMed Central

    2011-01-01

    Background Energy-based surgical scalpels are designed to efficiently transect and seal blood vessels using thermal energy to promote protein denaturation and coagulation. Assessment and design improvement of ultrasonic scalpel performance relies on both in vivo and ex vivo testing. The objective of this work was to design and implement a robust, experimental test matrix with randomization restrictions and predictive statistical power, which allowed for identification of those experimental variables that may affect the quality of the seal obtained ex vivo. Methods The design of the experiment included three factors: temperature (two levels); the type of solution used to perfuse the artery during transection (three types); and artery type (two types) resulting in a total of twelve possible treatment combinations. Burst pressures of porcine carotid and renal arteries sealed ex vivo were assigned as the response variable. Results The experimental test matrix was designed and carried out as a split-plot experiment in order to assess the contributions of several variables and their interactions while accounting for randomization restrictions present in the experimental setup. The statistical software package SAS was utilized and PROC MIXED was used to account for the randomization restrictions in the split-plot design. The combination of temperature, solution, and vessel type had a statistically significant impact on seal quality. Conclusions The design and implementation of a split-plot experimental test-matrix provided a mechanism for addressing the existing technical randomization restrictions of ex vivo ultrasonic scalpel performance testing, while preserving the ability to examine the potential effects of independent factors or variables. This method for generating the experimental design and the statistical analyses of the resulting data are adaptable to a wide variety of experimental problems involving large-scale tissue-based studies of medical or experimental device efficacy and performance. PMID:21599963

  11. Statistical analysis of solid waste composition data: Arithmetic mean, standard deviation and correlation coefficients.

    PubMed

    Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard

    2017-11-01

    Data for fractional solid waste composition provide the relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods other than classical statistics, which are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing the mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
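
    A minimal sketch of the centered log-ratio (clr) transformation commonly used to open compositional data before computing means or correlations; zero fractions must be handled (here by a naive small replacement), and the composition below is illustrative rather than the study's waste data.

        import numpy as np

        def clr(composition):
            """Centered log-ratio transform: log(x_i / geometric_mean(x))."""
            x = np.asarray(composition, dtype=float)
            x = np.where(x == 0, 1e-6, x)          # naive zero replacement
            logx = np.log(x)
            return logx - logx.mean(axis=-1, keepdims=True)

        waste = np.array([0.55, 0.22, 0.15, 0.08])  # fractions of one sample, summing to 1
        print(clr(waste))                           # unconstrained coordinates, summing to 0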

  12. Ambiguities and conflicting results: the limitations of the kappa statistic in establishing the interrater reliability of the Irish nursing minimum data set for mental health: a discussion paper.

    PubMed

    Morris, Roisin; MacNeela, Padraig; Scott, Anne; Treacy, Pearl; Hyde, Abbey; O'Brien, Julian; Lehwaldt, Daniella; Byrne, Anne; Drennan, Jonathan

    2008-04-01

    In a study to establish the interrater reliability of the Irish Nursing Minimum Data Set (I-NMDS) for mental health, difficulties relating to the choice of reliability test statistic were encountered. The objective of this paper is to highlight the difficulties associated with testing interrater reliability for an ordinal scale using a relatively homogeneous sample and the recommended weighted kappa (κw) statistic. One pair of mental health nurses completed the I-NMDS for mental health for a total of 30 clients attending a mental health day centre over a two-week period. Data were analysed using the κw and percentage agreement statistics. Of the 38 I-NMDS for mental health variables, 34 with lower than acceptable κw reliability scores nevertheless achieved acceptable levels of reliability according to their percentage agreement scores. The study findings implied that, due to the homogeneity of the sample, low variability within the data resulted in the 'base rate problem' associated with the use of the κw statistic. Conclusions point to the interpretation of κw in tandem with percentage agreement scores. Suggestions that κw scores were low due to chance agreement and that one should strive to use a study sample with known variability are queried.
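
    A sketch contrasting weighted kappa with percentage agreement on a homogeneous sample, where near-constant ratings depress kappa (the base rate problem noted above); the two raters' data are fabricated for illustration.

        import numpy as np
        from sklearn.metrics import cohen_kappa_score

        # Two raters agree on 28 of 30 clients, but almost all ratings are category 0.
        rater1 = np.array([0] * 27 + [1, 1, 0])
        rater2 = np.array([0] * 27 + [1, 0, 1])

        print("% agreement:", (rater1 == rater2).mean())                         # high
        print("weighted kappa:", cohen_kappa_score(rater1, rater2,
                                                   weights="linear"))            # low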

  13. Does sensitivity measured from screening test-sets predict clinical performance?

    NASA Astrophysics Data System (ADS)

    Soh, BaoLin P.; Lee, Warwick B.; Mello-Thoms, Claudia R.; Tapia, Kriscia A.; Ryan, John; Hung, Wai Tak; Thompson, Graham J.; Heard, Rob; Brennan, Patrick C.

    2014-03-01

    Aim: To examine the relationship between sensitivity measured from the BREAST test-set and clinical performance. Background: Although the UK and Australian national breast screening programs have regarded the PERFORMS and BREAST test-set strategies as possible methods of estimating readers' clinical efficacy, the relationship between test-set and real-life performance results has never been satisfactorily understood. Methods: Forty-one radiologists from BreastScreen New South Wales participated in this study. Each reader interpreted a BREAST test-set comprising sixty de-identified mammographic examinations sourced from the BreastScreen Digital Imaging Library. Spearman's rank correlation coefficient was used to compare the sensitivity measured from the BREAST test-set with screen readers' clinical audit data. Results: The results showed statistically significant positive moderate correlations between test-set sensitivity and each of the following metrics: rate of invasive cancer per 10 000 reads (r=0.495; p < 0.01); rate of small invasive cancer per 10 000 reads (r=0.546; p < 0.001); detection rate of all invasive cancers and DCIS per 10 000 reads (r=0.444; p < 0.01). Conclusion: Comparison between sensitivity measured from the BREAST test-set and real-life detection rates demonstrated statistically significant positive moderate correlations, which validates that such test-set strategies can reflect readers' clinical performance and be used as a quality assurance tool. The strength of correlation demonstrated in this study was higher than previously found by others.

  14. Software testing

    NASA Astrophysics Data System (ADS)

    Price-Whelan, Adrian M.

    2016-01-01

    Now more than ever, scientific results are dependent on sophisticated software and analysis. Why should we trust code written by others? How do you ensure your own code produces sensible results? How do you make sure it continues to do so as you update, modify, and add functionality? Software testing is an integral part of code validation and writing tests should be a requirement for any software project. I will talk about Python-based tools that make managing and running tests much easier and explore some statistics for projects hosted on GitHub that contain tests.
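
    In the spirit of the talk, a minimal pytest example; pytest is one of the Python-based tools alluded to, and the function under test is of course hypothetical.

        # test_stats.py -- run with `pytest test_stats.py`
        import math

        def sample_mean(values):
            """Function under test: arithmetic mean of a non-empty sequence."""
            return sum(values) / len(values)

        def test_mean_of_constants():
            assert sample_mean([2.0, 2.0, 2.0]) == 2.0

        def test_mean_matches_known_value():
            assert math.isclose(sample_mean([1, 2, 3, 4]), 2.5)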

  15. Performance of Bootstrapping Approaches To Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling.

    ERIC Educational Resources Information Center

    Nevitt, Jonathan; Hancock, Gregory R.

    2001-01-01

    Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
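
    A minimal sketch of the naive (nonparametric) bootstrap for a standard error, the core resampling idea evaluated in the study; the sample data and number of resamples are arbitrary, and the SEM-specific test statistics are not reproduced.

        import numpy as np

        def bootstrap_se(data, stat=np.mean, n_boot=2000, seed=0):
            """Bootstrap standard error of `stat` by resampling with replacement."""
            rng = np.random.default_rng(seed)
            data = np.asarray(data)
            reps = [stat(rng.choice(data, size=len(data), replace=True))
                    for _ in range(n_boot)]
            return np.std(reps, ddof=1)

        x = np.random.default_rng(6).standard_normal(50)
        print(bootstrap_se(x))     # compare with the analytic SE, std(x)/sqrt(50)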

  16. Residuals and the Residual-Based Statistic for Testing Goodness of Fit of Structural Equation Models

    ERIC Educational Resources Information Center

    Foldnes, Njal; Foss, Tron; Olsson, Ulf Henning

    2012-01-01

    The residuals obtained from fitting a structural equation model are crucial ingredients in obtaining chi-square goodness-of-fit statistics for the model. The authors present a didactic discussion of the residuals, obtaining a geometrical interpretation by recognizing the residuals as the result of oblique projections. This sheds light on the…

  17. From Research to Practice: Basic Mathematics Skills and Success in Introductory Statistics

    ERIC Educational Resources Information Center

    Lunsford, M. Leigh; Poplin, Phillip

    2011-01-01

    Based on previous research of Johnson and Kuennen (2006), we conducted a study to determine factors that would possibly predict student success in an introductory statistics course. Our results were similar to Johnson and Kuennen in that we found students' basic mathematical skills, as measured on a test created by Johnson and Kuennen, were a…

  18. Validation of a modification to Performance-Tested Method 010403: microwell DNA hybridization assay for detection of Listeria spp. in selected foods and selected environmental surfaces.

    PubMed

    Alles, Susan; Peng, Linda X; Mozola, Mark A

    2009-01-01

    A modification to Performance-Tested Method 010403, GeneQuence Listeria Test (DNAH method), is described. The modified method uses a new media formulation, LESS enrichment broth, in single-step enrichment protocols for both foods and environmental sponge and swab samples. Food samples are enriched for 27-30 h at 30 degrees C, and environmental samples for 24-48 h at 30 degrees C. Implementation of these abbreviated enrichment procedures allows test results to be obtained on a next-day basis. In testing of 14 food types in internal comparative studies with inoculated samples, there were statistically significant differences in method performance between the DNAH method and reference culture procedures for only 2 foods (pasteurized crab meat and lettuce) at the 27 h enrichment time point and for only a single food (pasteurized crab meat) in one trial at the 30 h enrichment time point. Independent laboratory testing with 3 foods showed statistical equivalence between the methods for all foods, and results support the findings of the internal trials. Overall, considering both internal and independent laboratory trials, sensitivity of the DNAH method relative to the reference culture procedures was 90.5%. Results of testing 5 environmental surfaces inoculated with various strains of Listeria spp. showed that the DNAH method was more productive than the reference U.S. Department of Agriculture-Food Safety and Inspection Service (USDA-FSIS) culture procedure for 3 surfaces (stainless steel, plastic, and cast iron), whereas results were statistically equivalent to the reference method for the other 2 surfaces (ceramic tile and sealed concrete). An independent laboratory trial with ceramic tile inoculated with L. monocytogenes confirmed the effectiveness of the DNAH method at the 24 h time point. Overall, sensitivity of the DNAH method at 24 h relative to that of the USDA-FSIS method was 152%. The DNAH method exhibited extremely high specificity, with only 1% false-positive reactions overall.

  19. Testing the Isotropic Universe Using the Gamma-Ray Burst Data of Fermi/GBM

    NASA Astrophysics Data System (ADS)

    Řípa, Jakub; Shafieloo, Arman

    2017-12-01

    The sky distribution of gamma-ray bursts (GRBs) has been intensively studied by various groups for more than two decades. Most of these studies test the isotropy of GRBs based on their sky number density distribution. In this work, we propose an approach to test the isotropy of the universe by inspecting the isotropy of the properties of GRBs, such as their durations, fluences, and peak fluxes at various energy bands and different timescales. We apply this method to the Fermi/Gamma-ray Burst Monitor (GBM) data sample containing 1591 GRBs. The most noticeable feature we found is near the Galactic coordinates l ≈ 30°, b ≈ 15°, with radius r ≈ 20°–40°. The inferred probability for the occurrence of such an anisotropic signal (in a random isotropic sample) is derived to be less than a percent in some of the tests, while the other tests give results consistent with isotropy. These are based on the comparison of the results from the real data with randomly shuffled data samples. Considering the large number of statistics used in this work (some of which are correlated with each other), we anticipate that the detected feature could be a result of statistical fluctuations. Moreover, we noticed a considerably low number of GRBs in this particular patch, which might be due to some instrumentation or observational effects that could consequently affect our statistics through systematics. Further investigation is highly desirable in order to clarify this result, e.g., utilizing a larger future Fermi/GBM data sample as well as data samples of other GRB missions, and also looking for possible systematics.

  20. Development of test scenarios for off-roadway crash countermeasures based on crash statistics

    DOT National Transportation Integrated Search

    2002-09-01

    This report presents the results from an analysis of off-roadway crashes and proposes a set of crash-imminent scenarios to objectively test countermeasure systems for light vehicles (passenger cars, sport utility vehicles, vans, and pickup trucks) ba...

  1. Usefulness of Leukocyte Esterase Test Versus Rapid Strep Test for Diagnosis of Acute Strep Pharyngitis

    PubMed Central

    2015-01-01

    Objective: A study to compare the use of throat swab testing for leukocyte esterase on a test strip (the urine dipstick multistick) with the rapid strep test for rapid diagnosis of Group A beta-hemolytic streptococci in cases of acute pharyngitis in children. Hypothesis: Testing of a throat swab for leukocyte esterase on the test strip currently used for urine testing may detect throat infection and might be as useful as the rapid strep test. Methods: All patients presenting with a complaint of sore throat and fever were examined clinically for erythema of the pharynx and tonsils and for any exudates. Informed consent was obtained from the parents and assent from the subjects. Three swabs were taken from the pharyngo-tonsillar region for culture, rapid strep, and leukocyte esterase (LE) testing. Results: The total number of subjects was 100. Cultures were positive in 9; the rapid strep test was negative in 84 and positive in 16; the LE test was negative in 80 and positive in 20. Statistics: From the data configuration, the rapid strep and LE results do not appear to be independent but are strongly aligned; the two tests show very agreeable results. The calculated chi-squared value exceeds the tabulated value at 1 degree of freedom (P < 0.0001), so the null hypothesis is rejected in favor of the alternative. Conclusions: Leukocyte esterase testing of a throat swab, on the test strip currently used as a urine dipstick, is as useful as the rapid strep test for rapid diagnosis of strep pharyngitis in children. PMID:27335975

  2. Pisces did not have increased heart failure: data-driven comparisons of binary proportions between levels of a categorical variable can result in incorrect statistical significance levels.

    PubMed

    Austin, Peter C; Goldwasser, Meredith A

    2008-03-01

    We examined the impact on statistical inference when a chi-square test is used to compare the proportion of successes in the level of a categorical variable that has the highest observed proportion of successes with the proportion of successes in all other levels of the categorical variable combined. Monte Carlo simulations and a case study examining the association between astrological sign and hospitalization for heart failure. A standard chi-square test results in an inflation of the type I error rate, with the type I error rate increasing as the number of levels of the categorical variable increases. Using a standard chi-square test, the hospitalization rate for Pisces was statistically significantly different from that of the other 11 astrological signs combined (P=0.026). After accounting for the fact that the selection of Pisces was based on its having the highest observed proportion of heart failure hospitalizations, subjects born under the sign of Pisces no longer had a significantly higher rate of heart failure hospitalization compared to the other residents of Ontario (P=0.152). Post hoc comparisons of the proportions of successes across different levels of a categorical variable can result in incorrect inferences.
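
    A sketch of the Monte Carlo point: under a global null, picking the category with the highest observed success proportion and then chi-square testing it against the rest inflates the type I error well above the nominal 5%. The 12 levels mirror the astrological signs; all other settings are arbitrary.

        import numpy as np
        from scipy.stats import chi2_contingency

        rng = np.random.default_rng(7)
        k, n_per, p, n_sims, alpha = 12, 200, 0.1, 2000, 0.05

        rejections = 0
        for _ in range(n_sims):
            successes = rng.binomial(n_per, p, size=k)   # no true differences
            i = successes.argmax()                       # data-driven selection
            rest = successes.sum() - successes[i]
            table = [[successes[i], n_per - successes[i]],
                     [rest, (k - 1) * n_per - rest]]
            if chi2_contingency(table)[1] < alpha:
                rejections += 1

        print(rejections / n_sims)   # far above 0.05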

  3. A comparative evaluation of microleakage of three different newer direct composite resins using a self etching primer in class V cavities: An in vitro study

    PubMed Central

    Hegde, Mithra N; Vyapaka, Pallavi; Shetty, Shishir

    2009-01-01

    Aims/Objectives: The aim of this in vitro study was to measure and compare microleakage in three different newer direct composite resins using a self-etch adhesive bonding system in class V cavities by the fluorescent dye penetration technique. Materials and Methods: Class V cavities were prepared on 45 human maxillary premolar teeth. On all specimens, one coat of G-Bond (GC Japan) was applied and light cured. The teeth were then divided equally into 3 groups of 15 samples each. Filtek Z350 (3M ESPE), Ceram X duo (Dentsply Asia) and Synergy D6 (Coltene/Whaledent) resin composites were placed on the samples of Groups I, II and III, respectively, in increments and light cured. After polishing the restorations, the specimens were suspended in Rhodamine 6G fluorescent dye for 48 h. The teeth were then sectioned longitudinally and observed for the extent of microleakage under the fluorescent microscope. Statistical Analysis Used: The results were subjected to statistical analysis using the Kruskal-Wallis and Mann–Whitney U tests. Results: The results showed no statistically significant difference among the three groups tested. Conclusions: None of the materials tested was able to completely eliminate microleakage in class V cavities. PMID:20543926

  4. The intermediates take it all: asymptotics of higher criticism statistics and a powerful alternative based on equal local levels.

    PubMed

    Gontscharuk, Veronika; Landwehr, Sandra; Finner, Helmut

    2015-01-01

    The higher criticism (HC) statistic, which can be seen as a normalized version of the famous Kolmogorov-Smirnov statistic, has a long history, dating back to the mid-seventies. Originally, HC statistics were used in connection with goodness-of-fit (GOF) tests, but they recently gained some attention in the context of testing the global null hypothesis in high-dimensional data. The continuing interest in HC seems to be inspired by a series of nice asymptotic properties related to this statistic. For example, unlike Kolmogorov-Smirnov tests, GOF tests based on the HC statistic are known to be asymptotically sensitive in the moderate tails; hence the HC statistic is favorably applied for detecting the presence of signals in sparse mixture models. However, some questions around the asymptotic behavior of the HC statistic are still open. We focus on two of them, namely, why a specific intermediate range is crucial for GOF tests based on the HC statistic and why the convergence of the HC distribution to the limiting one is extremely slow. Moreover, the inconsistency between the asymptotic and finite-sample behavior of the HC statistic prompts us to provide a new HC test that has better finite-sample properties than the original HC test while showing the same asymptotics. This test is motivated by the asymptotic behavior of the so-called local levels related to the original HC test. By means of numerical calculations and simulations, we show that the new HC test is typically more powerful than the original HC test in normal mixture models. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
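
    A minimal sketch of the original HC statistic evaluated over sorted P values and maximized over an intermediate range; the equal-local-levels variant proposed in the paper is not reproduced, and the mixture below is synthetic.

        import numpy as np

        def higher_criticism(pvals, alpha0=0.5):
            """HC = max over i <= alpha0*n of sqrt(n)*(i/n - p_(i))/sqrt(p_(i)(1-p_(i)))."""
            p = np.sort(np.asarray(pvals))
            n = len(p)
            i = np.arange(1, n + 1)
            hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p))
            return hc[: int(alpha0 * n)].max()

        rng = np.random.default_rng(8)
        null_p   = rng.uniform(size=1000)
        sparse_p = np.concatenate([rng.uniform(size=990), rng.uniform(0, 1e-4, size=10)])
        print(higher_criticism(null_p), higher_criticism(sparse_p))   # signal raises HC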

  5. Time series, periodograms, and significance

    NASA Astrophysics Data System (ADS)

    Hernandez, G.

    1999-05-01

    The geophysical literature shows a wide and conflicting usage of methods employed to extract meaningful information on coherent oscillations from measurements. This makes it difficult, if not impossible, to relate the findings reported by different authors. Therefore, we have undertaken a critical investigation of the tests and methodology used for determining the presence of statistically significant coherent oscillations in periodograms derived from time series. Statistical significance tests are only valid when performed on the independent frequencies present in a measurement. Both the number of possible independent frequencies in a periodogram and the significance tests are determined by the number of degrees of freedom, which is the number of true independent measurements, present in the time series, rather than the number of sample points in the measurement. The number of degrees of freedom is an intrinsic property of the data, and it must be determined from the serial coherence of the time series. As part of this investigation, a detailed study has been performed which clearly illustrates the deleterious effects that the apparently innocent and commonly used processes of filtering, de-trending, and tapering of data have on periodogram analysis and the consequent difficulties in the interpretation of the statistical significance thus derived. For the sake of clarity, a specific example of actual field measurements containing unevenly-spaced measurements, gaps, etc., as well as synthetic examples, have been used to illustrate the periodogram approach, and pitfalls, leading to the (statistical) significance tests for the presence of coherent oscillations. Among the insights of this investigation are: (1) the concept of a time series being (statistically) band limited by its own serial coherence and thus having a critical sampling rate which defines one of the necessary requirements for the proper statistical design of an experiment; (2) the design of a critical test for the maximum number of significant frequencies which can be used to describe a time series, while retaining intact the variance of the test sample; (3) a demonstration of the unnecessary difficulties that manipulation of the data brings into the statistical significance interpretation of said data; and (4) the resolution and correction of the apparent discrepancy in significance results obtained by the use of the conventional Lomb-Scargle significance test, when compared with the long-standing Schuster-Walker and Fisher tests.
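
    A sketch of a Lomb-Scargle periodogram for unevenly spaced data using scipy; the significance issues discussed above (independent frequencies, serial coherence, effects of pre-processing) are exactly what a naive reading of such a periodogram glosses over. The sampling and frequency grid are illustrative.

        import numpy as np
        from scipy.signal import lombscargle

        rng = np.random.default_rng(9)
        t = np.sort(rng.uniform(0, 100, size=300))        # unevenly spaced sample times
        y = np.sin(2 * np.pi * 0.2 * t) + rng.standard_normal(300)

        freqs = np.linspace(0.01, 1.0, 500) * 2 * np.pi   # angular frequency grid
        power = lombscargle(t, y - y.mean(), freqs)       # pre-center to remove the mean
        print(freqs[power.argmax()] / (2 * np.pi))        # peak near 0.2 cycles/unit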

  6. Public health information and statistics dissemination efforts for Indonesia on the Internet.

    PubMed

    Hanani, Febiana; Kobayashi, Takashi; Jo, Eitetsu; Nakajima, Sawako; Oyama, Hiroshi

    2011-01-01

    To elucidate current issues related to health statistics dissemination efforts on the Internet in Indonesia and to propose a new dissemination website as a solution. A cross-sectional survey was conducted. Sources of statistics were identified using link relationships and Google™ search. The menus used to locate statistics, the modes of presentation and means of access to statistics, and the available statistics were assessed for each site. Assessment results were used to derive a design specification; a prototype system was developed and evaluated with a usability test. 49 sources were identified on 18 governmental, 8 international and 5 non-governmental websites. Of the 49 menus identified, 33% used non-intuitive titles and led to inefficient searches; 69% of them were on government websites. Of the 31 websites, only 39% used graphs/charts and 23% used maps for presentation. Further, only 32%, 39% and 19% provided query, export and print features, respectively. While >50% of sources reported morbidity, risk factor and service provision statistics, <40% of sources reported health resource and mortality statistics. The statistics portal website was developed using the Joomla!™ content management system. A usability test demonstrated its potential to improve data accessibility. In this study, the government's efforts to disseminate statistics in Indonesia are supported by non-governmental and international organizations, but the existing information may not be very useful because it is: a) not widely distributed, b) difficult to locate, and c) not effectively communicated. Actions are needed to ensure information usability, and one such action is the development of a statistics portal website.

  7. Vestibular evoked myogenic potentials and digital vectoelectro-nystagmography's study in patients with benign paroxysmal positional vertigo

    PubMed Central

    Maria da Silva Lira-Batista, Marta; Schaffeln Dorigueto, Ricardo; Freitas Ganança, Cristina

    2013-01-01

    Introduction: Benign paroxysmal positional vertigo (BPPV) is a very common vestibular disorder characterized by brief but intense attacks of rotatory vertigo triggered by simple rapid movements of the head. The integrity of the vestibular pathways can be assessed using tests such as digital vectoelectronystagmography (VENG) and vestibular evoked myogenic potentials (VEMP). Aim: This study aimed to determine the VEMP findings with respect to latency, amplitude, and peak-to-peak waveform, and the results of the oculomotor and vestibular components of VENG, in patients with BPPV. Method: Although this otoneurological condition is quite common, little is known of the associated VEMP and VENG changes, making it important to research and describe these results. Results: We examined the records of 4438 patients and selected 35 charts after applying the inclusion and exclusion criteria. Of these, 26 patients were women and 9 were men. The average age at diagnosis was 52.7 years, and the most prevalent physiological cause, accounting for 97.3% of cases, was ductolithiasis. There was a statistically significant association between normal hearing and mild contralateral sensorineural hearing loss. The results of the oculomotor tests were within the normal reference ranges for all subjects. Patients with BPPV exhibited symmetrical function of the semicircular canals in their synergistic pairs (p < 0.001). The caloric test showed statistically normal responses from the lateral canals. The waveforms of all patients were adequate, but the VEMP results for the data-crossing maneuver with positive positioning showed a trend toward a relationship for the left ear Lp13. There was also a trend toward an association between normal reflexes in the caloric test and the inter-peak VEMP of the left ear. Although there are some differences between the average levels of the VENG and VEMP results, these differences were not statistically significant. Conclusion: The results of audiologic assessment, hearing thresholds, positioning maneuvers, and caloric tests have no effect on the quantitative results of VEMP. Additional research is warranted to establish the relationships among VENG, VEMP, and BPPV, especially as concerns the oculomotor tests. PMID:25992006

  8. A multi-analyte serum test for the detection of non-small cell lung cancer

    PubMed Central

    Farlow, E C; Vercillo, M S; Coon, J S; Basu, S; Kim, A W; Faber, L P; Warren, W H; Bonomi, P; Liptay, M J; Borgia, J A

    2010-01-01

    Background: In this study, we appraised a wide assortment of biomarkers previously shown to have diagnostic or prognostic value for non-small cell lung cancer (NSCLC), with the intent of establishing a multi-analyte serum test capable of identifying patients with lung cancer. Methods: Circulating levels of 47 biomarkers were evaluated against patient cohorts consisting of 90 NSCLC and 43 non-cancer controls using commercial immunoassays. Multivariate statistical methods were used on all biomarkers achieving statistical relevance to define an optimised panel of diagnostic biomarkers for NSCLC. The resulting biomarkers were fashioned into a classification algorithm and validated against serum from a second patient cohort. Results: A total of 14 analytes achieved statistical relevance upon evaluation. Multivariate statistical methods then identified a panel of six biomarkers (tumour necrosis factor-α, CYFRA 21-1, interleukin-1ra, matrix metalloproteinase-2, monocyte chemotactic protein-1 and sE-selectin) as being the most efficacious for diagnosing early-stage NSCLC. When tested against a second patient cohort, the panel successfully classified 75 of 88 patients. Conclusions: Here, we report the development of a serum algorithm with high specificity for classifying patients with NSCLC against cohorts of various 'high-risk' individuals. A high rate of false positives was observed within the cohort in which patients had non-neoplastic lung nodules, possibly as a consequence of the inflammatory nature of these conditions. PMID:20859284

  9. The Influence of 16-year-old Students' Gender, Mental Abilities, and Motivation on their Reading and Drawing Submicrorepresentations Achievements

    NASA Astrophysics Data System (ADS)

    Devetak, Iztok; Aleksij Glažar, Saša

    2010-08-01

    Submicrorepresentations (SMRs) are a powerful tool for identifying misconceptions of chemical concepts and for generating proper mental models of chemical phenomena in students' long-term memory during chemical education. The main purpose of the study was to determine which independent variables (gender, formal reasoning abilities, visualization abilities, and intrinsic motivation for learning chemistry) have the greatest influence on students' reading and drawing of SMRs. A total of 386 secondary school students (mean age 16.3 years) participated in the study. The instruments used in the study were: a test of Chemical Knowledge, the Test of Logical Thinking, two tests of visualization abilities (Patterns and Rotations), and a questionnaire on Intrinsic Motivation for Learning Science. The results show moderate but statistically significant correlations between students' intrinsic motivation, formal reasoning abilities, and chemical knowledge at the submicroscopic level based on reading and drawing SMRs. Visualization abilities are not statistically significantly correlated with students' success on items that comprise reading or drawing SMRs. It can also be concluded that there is a statistically significant difference between male and female students in solving problems that include reading or drawing SMRs. Based on these statistical results and a content analysis of the sample problems, several educational strategies can be implemented to help students develop adequate mental models of chemical concepts on all three levels of representation.

  10. Long-range memory and non-Markov statistical effects in human sensorimotor coordination

    NASA Astrophysics Data System (ADS)

    M. Yulmetyev, Renat; Emelyanova, Natalya; Hänggi, Peter; Gafarov, Fail; Prokhorov, Alexander

    2002-12-01

    In this paper, the non-Markov statistical processes and long-range memory effects in human sensorimotor coordination are investigated. The theoretical basis of this study is the statistical theory of non-stationary discrete non-Markov processes in complex systems (Phys. Rev. E 62, 6178 (2000)). Human sensorimotor coordination was studied experimentally by means of a standard dynamical tapping test on a group of 32 young people, with tap numbers up to 400. This test was carried out separately for the right and the left hand according to the degree of domination of each brain hemisphere. The numerical analysis of the experimental results was made with the help of power spectra of the initial time correlation function, the memory functions of low orders, and the first three points of the statistical spectrum of the non-Markovity parameter. Our observations demonstrate that, with regard to the results of the standard dynamic tapping test, it is possible to divide all examinees into five different dynamic types. We have introduced the conflict coefficient to estimate quantitatively the order-disorder effects underlying living systems. The latter reflects the existence of an imbalance between nervous and motor coordination in humans. The suggested classification of neurophysiological activity represents a dynamic generalization of the well-known neuropsychological types and provides a new approach in modern neuropsychology.

  11. ANTIPLAQUE AND ANTIGINGIVITIS EFFECT OF LIPPIA SIDOIDES. A DOUBLE-BLIND CLINICAL STUDY IN HUMANS

    PubMed Central

    Rodrigues, Ítalo Sarto Carvalho; Tavares, Vinícius Nascimento; Pereira, Sérgio Luís da Silva; da Costa, Flávio Nogueira

    2009-01-01

    Objectives: The antiplaque and antigingivitis effect of Lippia sidoides (LS) was evaluated in this in vivo investigation. Material and Methods: Twenty-three subjects participated in a cross-over, double-blind clinical study, using a 21-day partial-mouth experimental model of gingivitis. A tooth shield was constructed for each volunteer, preventing the brushing of the 4 experimental posterior teeth in the lower left quadrant. The subjects were randomly assigned initially to use either the placebo gel (control group) or the test gel containing 10% LS (test group). Results: The clinical results showed statistically significant differences in the plaque index (PLI) (p<0.01) between days 0 and 21 in both groups; however, only the control group showed a statistically significant difference (p<0.01) in the bleeding (IB) and gingival (GI) indices within the experimental period of 21 days. On day 21, the test group presented significantly better results than the control group with regard to the GI (p<0.05). Conclusions: The test gel containing 10% LS was effective in the control of gingivitis. PMID:19936516

  12. A carcinogenic potency database of the standardized results of animal bioassays

    PubMed Central

    Gold, Lois Swirsky; Sawyer, Charles B.; Magaw, Renae; Backman, Georganne M.; De Veciana, Margarita; Levinson, Robert; Hooper, N. Kim; Havender, William R.; Bernstein, Leslie; Peto, Richard; Pike, Malcolm C.; Ames, Bruce N.

    1984-01-01

    The preceding paper described our numerical index of carcinogenic potency, the TD50 and the statistical procedures adopted for estimating it from experimental data. This paper presents the Carcinogenic Potency Database, which includes results of about 3000 long-term, chronic experiments of 770 test compounds. Part II is a discussion of the sources of our data, the rationale for the inclusion of particular experiments and particular target sites, and the conventions adopted in summarizing the literature. Part III is a guide to the plot of results presented in Part IV. A number of appendices are provided to facilitate use of the database. The plot includes information about chronic cancer tests in mammals, such as dose and other aspects of experimental protocol, histopathology and tumor incidence, TD50 and its statistical significance, dose response, author's opinion and literature reference. The plot readily permits comparisons of carcinogenic potency and many other aspects of cancer tests; it also provides quantitative information about negative tests. The range of carcinogenic potency is over 10 million-fold. PMID:6525996

  13. Temporal changes and variability in temperature series over Peninsular Malaysia

    NASA Astrophysics Data System (ADS)

    Suhaila, Jamaludin

    2015-02-01

    With the current concern over climate change, descriptions of how temperature series change over time are very useful. Annual mean temperature has been analyzed for several stations over Peninsular Malaysia. Non-parametric statistical techniques such as the Mann-Kendall test and Theil-Sen slope estimation are used primarily for assessing the significance and detection of trends, while the nonparametric Pettitt's test and the sequential Mann-Kendall test are adopted to detect any abrupt climate change. Statistically significant increasing trends in annual mean temperature are detected for almost all studied stations, with the magnitude of the significant trends varying from 0.02°C to 0.05°C per year. The results show that the climate over Peninsular Malaysia is getting warmer. In addition, the results of the abrupt-change analyses using Pettitt's and the sequential Mann-Kendall test reveal onsets of trends that can be related to El Niño episodes that occurred in Malaysia. In general, the analysis results can help local stakeholders and water managers to understand the risks and vulnerabilities related to climate change in terms of mean events in the region.
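
    A sketch implementing the Mann-Kendall trend test (without tie correction) alongside scipy's Theil-Sen slope estimator; the annual series below is synthetic, with a built-in trend of about 0.03°C per year, in the range reported.

        import numpy as np
        from scipy import stats

        def mann_kendall(x):
            """Mann-Kendall Z statistic and two-sided p-value (no tie correction)."""
            x = np.asarray(x)
            n = len(x)
            s = np.sum([np.sign(x[j] - x[i])
                        for i in range(n - 1) for j in range(i + 1, n)])
            var_s = n * (n - 1) * (2 * n + 5) / 18.0
            z = (s - np.sign(s)) / np.sqrt(var_s) if s != 0 else 0.0
            return z, 2 * stats.norm.sf(abs(z))

        rng = np.random.default_rng(10)
        years = np.arange(1970, 2015)
        temps = 26.5 + 0.03 * (years - 1970) + 0.2 * rng.standard_normal(len(years))

        print(mann_kendall(temps))                   # significant increasing trend
        print(stats.theilslopes(temps, years)[0])    # slope, roughly 0.03 degC per year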

  14. Comparison of a non-stationary voxelation-corrected cluster-size test with TFCE for group-Level MRI inference.

    PubMed

    Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong

    2017-03-01

    Two powerful methods for statistical inference on MRI brain images have been proposed recently: a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory, and threshold-free cluster enhancement (TFCE), which calculates the level of local support for a cluster and then uses permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended over other statistical methods for group-level fMRI analysis. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom, and signal-to-noise ratios. Our results suggest that both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data, and both are superior to the current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing, and by also being applicable to single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data. Hum Brain Mapp 38:1269-1280, 2017. © 2016 Wiley Periodicals, Inc.
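
    TFCE itself is compact enough to sketch. The fragment below is a hedged illustration rather than the authors' implementation: it applies the standard TFCE integral (extent^E times height^H, with the usual E = 0.5, H = 2 defaults) to a 1-D statistic map; real use operates on 3-D images with connected-component labeling:

    ```python
    import numpy as np

    def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
        """TFCE(v) = sum over thresholds h of extent(h)**E * h**H * dh,
        where extent(h) is the size of the supra-threshold run containing v."""
        out = np.zeros_like(stat, dtype=float)
        for h in np.arange(dh, stat.max() + dh, dh):
            above = stat >= h
            i = 0
            while i < len(stat):                 # label contiguous runs of True
                if above[i]:
                    j = i
                    while j < len(stat) and above[j]:
                        j += 1
                    out[i:j] += (j - i) ** E * h ** H * dh
                    i = j
                else:
                    i += 1
        return out

    print(tfce_1d(np.array([0.2, 1.1, 2.3, 2.0, 0.1, 0.4, 3.0, 2.8])).round(2))
    ```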

  15. Detection of semi-volatile organic compounds in permeable ...

    EPA Pesticide Factsheets

    Abstract The Edison Environmental Center (EEC) has a research and demonstration permeable parking lot comprised of three different permeable systems: permeable asphalt, porous concrete, and interlocking concrete permeable pavers. Water quality and quantity analysis has been ongoing since January 2010. This paper describes a subset of the water quality analysis: analysis of semivolatile organic compounds (SVOCs) to determine whether hydrocarbons were present in water infiltrated through the permeable surfaces. SVOCs were analyzed in samples collected on 11 dates over a 3-year period, from 2/8/2010 to 4/1/2013. Results are broadly divided into three categories: 42 chemicals were never detected; 12 chemicals (11 chemical tests) were detected at a rate of 10% or less; and 22 chemicals were detected at a frequency of 10% or greater (ranging from 10% to 66.5%). Fundamental and exploratory statistical analyses were performed on these latter results by grouping results by surface type. The statistical analyses were limited due to the low frequency of detections and the dilution of samples, which impacted detection limits. The infiltrate data through the three permeable surfaces were analyzed as non-parametric data by the Kaplan-Meier estimation method for fundamental statistics; there were some statistically observable differences in concentration between pavement types when using the Tarone-Ware comparison hypothesis test. Additionally, Spearman rank-order non-parametric…
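
    The Kaplan-Meier treatment of nondetects can be sketched with Helsel's "flipping" trick, which turns left-censored concentrations into right-censored survival times so the ordinary product-limit estimator applies. The values below are illustrative, not EEC data, and ties between detects and nondetects are ignored for brevity:

    ```python
    import numpy as np

    conc = np.array([0.5, 0.5, 1.2, 3.4, 0.5, 2.1])      # value or detection limit
    nondetect = np.array([True, True, False, False, True, False])

    flip = conc.max() + 1.0
    t = flip - conc                  # flipped, right-censored "times"
    event = ~nondetect               # detects are the observed events

    order = np.argsort(t)
    t, event = t[order], event[order]
    at_risk = len(t) - np.arange(len(t))          # risk set just before each time
    surv = np.cumprod(np.where(event, (at_risk - 1) / at_risk, 1.0))
    # surv[i] estimates P(flipped value > t[i]); flipping back, 1 - surv is an
    # estimate of the concentration CDF that uses the nondetects properly.
    print(np.column_stack([flip - t, surv]))
    ```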

  16. Examining the Stationarity Assumption for Statistically Downscaled Climate Projections of Precipitation

    NASA Astrophysics Data System (ADS)

    Wootten, A.; Dixon, K. W.; Lanzante, J. R.; Mcpherson, R. A.

    2017-12-01

    Empirical statistical downscaling (ESD) approaches attempt to refine global climate model (GCM) information via statistical relationships between observations and GCM simulations. The aim of such downscaling efforts is to create added-value climate projections by adding finer spatial detail and reducing biases. The results of statistical downscaling exercises are often used in impact assessments under the assumption that past performance provides an indicator of future results. Given prior research describing the danger of this assumption with regard to temperature, this study expands the perfect-model experimental design from previous case studies to test the stationarity assumption with respect to precipitation. Assuming stationarity implies that the performance of an ESD method on future projections is similar to its performance on the historical training data. Case study results from four quantile-mapping-based ESD methods demonstrate violations of the stationarity assumption for both the central tendency and extremes of precipitation. These violations vary geographically and seasonally. For the four ESD methods tested, the greatest challenges for downscaling daily total precipitation projections occur in regions with limited precipitation and for precipitation extremes along Southeast coastal regions. We conclude with a discussion of future expansion of the perfect-model experimental design and the implications for improving ESD methods and for providing guidance on the use of ESD techniques in impact assessments and decision support.
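
    As a concrete reference point, the core of an empirical quantile-mapping ESD method fits in a few lines. This is a hedged sketch of the general technique, not any of the four methods tested; all arrays are synthetic daily precipitation:

    ```python
    import numpy as np

    def quantile_map(gcm_hist, obs, gcm_future):
        """Map each future GCM value through the historical GCM-vs-obs quantiles."""
        probs = np.searchsorted(np.sort(gcm_hist), gcm_future) / len(gcm_hist)
        probs = np.clip(probs, 0.0, 1.0)
        return np.quantile(obs, probs)    # same quantile, observed distribution

    rng = np.random.default_rng(0)
    gcm_hist = rng.gamma(2.0, 3.0, 5000)      # wet-biased model climate
    obs = rng.gamma(2.0, 2.0, 5000)           # drier observed climate
    gcm_future = rng.gamma(2.0, 3.3, 1000)    # shifted future run
    downscaled = quantile_map(gcm_hist, obs, gcm_future)
    ```

    The stationarity assumption enters exactly here: the mapping is trained on gcm_hist versus obs and then applied unchanged to gcm_future.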

  17. Statistical analysis of flight times for space shuttle ferry flights

    NASA Technical Reports Server (NTRS)

    Graves, M. E.; Perlmutter, M.

    1974-01-01

    Markov chain and Monte Carlo analysis techniques are applied to the simulated Space Shuttle Orbiter Ferry flights to obtain statistical distributions of flight time duration between Edwards Air Force Base and Kennedy Space Center. The two methods are compared, and are found to be in excellent agreement. The flights are subjected to certain operational and meteorological requirements, or constraints, which cause eastbound and westbound trips to yield different results. Persistence of events theory is applied to the occurrence of inclement conditions to find their effect upon the statistical flight time distribution. In a sensitivity test, some of the constraints are varied to observe the corresponding changes in the results.

  18. When the Test of Mediation is More Powerful than the Test of the Total Effect

    PubMed Central

    O'Rourke, Holly P.; MacKinnon, David P.

    2014-01-01

    Although previous research has studied power in mediation models, the extent to which the inclusion of a mediator will increase power has not been investigated. First, a study compared analytical power of the mediated effect to the total effect in a single mediator model to identify the situations in which the inclusion of one mediator increased statistical power. Results from the first study indicated that including a mediator increased statistical power in small samples with large coefficients and in large samples with small coefficients, and when coefficients were non-zero and equal across models. Next, a study identified conditions where power was greater for the test of the total mediated effect compared to the test of the total effect in the parallel two mediator model. Results indicated that including two mediators increased power in small samples with large coefficients and in large samples with small coefficients, the same pattern of results found in the first study. Finally, a study assessed analytical power for a sequential (three-path) two mediator model and compared power to detect the three-path mediated effect to power to detect both the test of the total effect and the test of the mediated effect for the single mediator model. Results indicated that the three-path mediated effect had more power than the mediated effect from the single mediator model and the test of the total effect. Practical implications of these results for researchers are then discussed. PMID:24903690
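
    The comparison can be reproduced in miniature by Monte Carlo. The sketch below, with illustrative coefficients rather than the study's design, contrasts the power of the total-effect test with a joint-significance test of the a and b paths in a single-mediator model:

    ```python
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n, a, b, reps, alpha = 100, 0.3, 0.3, 1000, 0.05
    hit_total = hit_med = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        m = a * x + rng.normal(size=n)
        y = b * m + rng.normal(size=n)         # direct effect of X set to zero
        p_c = sm.OLS(y, sm.add_constant(x)).fit().pvalues[1]    # total effect
        p_a = sm.OLS(m, sm.add_constant(x)).fit().pvalues[1]    # a path
        Xm = sm.add_constant(np.column_stack([x, m]))
        p_b = sm.OLS(y, Xm).fit().pvalues[2]                    # b path given X
        hit_total += p_c < alpha
        hit_med += (p_a < alpha) and (p_b < alpha)
    print(hit_total / reps, hit_med / reps)    # mediation test often wins here
    ```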

  19. On the assessment of the added value of new predictive biomarkers.

    PubMed

    Chen, Weijie; Samuelson, Frank W; Gallas, Brandon D; Kang, Le; Sahiner, Berkman; Petrick, Nicholas

    2013-07-29

    The surge in biomarker development calls for research on statistical evaluation methodology to rigorously assess emerging biomarkers and classification models. Recently, several authors reported the puzzling observation that, in assessing the added value of new biomarkers to existing ones in a logistic regression model, statistical significance of new predictor variables does not necessarily translate into a statistically significant increase in the area under the ROC curve (AUC). Vickers et al. concluded that this inconsistency is because AUC "has vastly inferior statistical properties," i.e., it is extremely conservative. This statement is based on simulations that misuse the DeLong et al. method. Our purpose is to provide a fair comparison of the likelihood ratio (LR) test and the Wald test versus diagnostic accuracy (AUC) tests. We present a test to compare ideal AUCs of nested linear discriminant functions via an F test. We compare it with the LR test and the Wald test for the logistic regression model. The null hypotheses of these three tests are equivalent; however, the F test is an exact test whereas the LR test and the Wald test are asymptotic tests. Our simulation shows that the F test has the nominal type I error even with a small sample size. Our results also indicate that the LR test and the Wald test have inflated type I errors when the sample size is small, while the type I error converges to the nominal value asymptotically with increasing sample size as expected. We further show that the DeLong et al. method tests a different hypothesis and has the nominal type I error when it is used within its designed scope. Finally, we summarize the pros and cons of all four methods we consider in this paper. We show that there is nothing inherently less powerful or disagreeable about ROC analysis for showing the usefulness of new biomarkers or characterizing the performance of classification models. Each statistical method for assessing biomarkers and classification models has its own strengths and weaknesses. Investigators need to choose methods based on the assessment purpose, the biomarker development phase at which the assessment is being performed, the available patient data, and the validity of assumptions behind the methodologies.
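
    For orientation, the LR test for the added value of a biomarker in nested logistic models looks like this; a hedged sketch on simulated data, not the authors' simulation code:

    ```python
    import numpy as np
    import statsmodels.api as sm
    from scipy import stats

    rng = np.random.default_rng(1)
    n = 200
    x1 = rng.normal(size=n)                    # existing biomarker
    x2 = rng.normal(size=n)                    # candidate biomarker
    y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 * x1 + 0.5 * x2))))

    m0 = sm.Logit(y, sm.add_constant(x1)).fit(disp=0)
    m1 = sm.Logit(y, sm.add_constant(np.column_stack([x1, x2]))).fit(disp=0)
    lr = 2 * (m1.llf - m0.llf)                 # LR statistic, ~ chi2(1) under H0
    print(stats.chi2.sf(lr, df=1))
    ```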

  20. Reliability and validity of a nutrition and physical activity environmental self-assessment for child care

    PubMed Central

    Benjamin, Sara E; Neelon, Brian; Ball, Sarah C; Bangdiwala, Shrikant I; Ammerman, Alice S; Ward, Dianne S

    2007-01-01

    Background Few assessment instruments have examined the nutrition and physical activity environments in child care, and none are self-administered. Given the emerging focus on child care settings as a target for intervention, a valid and reliable measure of the nutrition and physical activity environment is needed. Methods To measure inter-rater reliability, 59 child care center directors and 109 staff completed the self-assessment concurrently, but independently. Three weeks later, a repeat self-assessment was completed by a sub-sample of 38 directors to assess test-retest reliability. To assess criterion validity, a researcher-administered environmental assessment was conducted at 69 centers and was compared to a self-assessment completed by the director. A weighted kappa test statistic and percent agreement were calculated to assess agreement for each question on the self-assessment. Results For inter-rater reliability, kappa statistics ranged from 0.20 to 1.00 across all questions. Test-retest reliability of the self-assessment yielded kappa statistics that ranged from 0.07 to 1.00. The inter-quartile kappa statistic ranges for inter-rater and test-retest reliability were 0.45 to 0.63 and 0.27 to 0.45, respectively. When percent agreement was calculated, questions ranged from 52.6% to 100% for inter-rater reliability and 34.3% to 100% for test-retest reliability. Kappa statistics for validity ranged from -0.01 to 0.79, with an inter-quartile range of 0.08 to 0.34. Percent agreement for validity ranged from 12.9% to 93.7%. Conclusion This study provides estimates of criterion validity, inter-rater reliability and test-retest reliability for an environmental nutrition and physical activity self-assessment instrument for child care. Results indicate that the self-assessment is a stable and reasonably accurate instrument for use with child care interventions. We therefore recommend the Nutrition and Physical Activity Self-Assessment for Child Care (NAP SACC) instrument to researchers and practitioners interested in conducting healthy weight intervention in child care. However, a more robust, less subjective measure would be more appropriate for researchers seeking an outcome measure to assess intervention impact. PMID:17615078
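
    A weighted kappa of the kind used per question is a one-liner with scikit-learn; the ordinal responses below are hypothetical:

    ```python
    from sklearn.metrics import cohen_kappa_score

    director = [3, 2, 4, 4, 1, 3, 2]     # director's answers to one item
    staff    = [3, 3, 4, 2, 1, 3, 1]     # staff member's answers
    print(cohen_kappa_score(director, staff, weights="linear"))
    ```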

  1. The Deployment Life Study: Longitudinal Analysis of Military Families Across the Deployment Cycle

    DTIC Science & Technology

    2016-01-01

    psychological and physical aggression than they reported prior to the deployment. 1 H. Fischer, A Guide to U.S. Military Casualty Statistics ...analyses include a large number of statistical tests and thus the results presented in this report should be viewed in terms of patterns, rather...Military Children and Families," The Future of Children, Vol. 23, No. 2, 2013, pp. 13–39. Fischer, H., A Guide to U.S. Military Casualty Statistics

  2. Bayesian networks and statistical analysis application to analyze the diagnostic test accuracy

    NASA Astrophysics Data System (ADS)

    Orzechowski, P.; Makal, Jaroslaw; Onisko, A.

    2005-02-01

    The paper describes a computer-aided BPH diagnosis system based on a Bayesian network. First results are compared with those of an established statistical method. Different statistical methods have been used successfully in medicine for years. However, the undoubted advantages of probabilistic methods make them useful in newly created systems, which are frequent in medicine but often lack full and competent domain knowledge. The article presents the advantages of the computer-aided BPH diagnosis system for urologists in clinical practice.

  3. The Global Error Assessment (GEA) model for the selection of differentially expressed genes in microarray data.

    PubMed

    Mansourian, Robert; Mutch, David M; Antille, Nicolas; Aubert, Jerome; Fogel, Paul; Le Goff, Jean-Marc; Moulin, Julie; Petrov, Anton; Rytz, Andreas; Voegel, Johannes J; Roberts, Matthew-Alan

    2004-11-01

    Microarray technology has become a powerful research tool in many fields of study; however, the cost of microarrays often results in the use of a low number of replicates (k). Under circumstances where k is low, it becomes difficult to perform standard statistical tests to extract the most biologically significant experimental results. Other, more advanced statistical tests have been developed; however, their use and interpretation often remain difficult to implement in routine biological research. The present work outlines a method that achieves sufficient statistical power for selecting differentially expressed genes under conditions of low k, while remaining an intuitive and computationally efficient procedure. The present study describes a Global Error Assessment (GEA) methodology to select differentially expressed genes in microarray datasets, developed using an in vitro experiment that compared control and interferon-gamma-treated skin cells. In this experiment, up to nine replicates were used to confidently estimate error, thereby enabling methods of different statistical power to be compared. Genes of similar absolute expression are binned, so as to enable a highly accurate local estimate of the mean squared error within conditions. The model then relates the variability of gene expression in each bin to absolute expression levels and uses this in a test derived from the classical ANOVA. The GEA selection method is compared with both the classical and permutational ANOVA tests, and demonstrates increased stability, robustness and confidence in gene selection. A subset of the selected genes was validated by real-time reverse transcription-polymerase chain reaction (RT-PCR). All these results suggest that the GEA methodology is (i) suitable for selection of differentially expressed genes in microarray data, (ii) intuitive and computationally efficient and (iii) especially advantageous under conditions of low k. The GEA code for R is freely available upon request to the authors.
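
    The binning idea is easy to sketch. Below is a hedged, simplified illustration of a GEA-style local error estimate (the published method is in R and differs in detail): genes are binned by absolute expression, the mean squared error is pooled within each bin, and that pooled error replaces the per-gene variance in a t-like statistic.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    genes, k = 1000, 3                        # genes, replicates per condition
    a = rng.normal(8, 2, (genes, k))          # log-expression, condition A
    b = rng.normal(8, 2, (genes, k))          # condition B

    mean_expr = (a.mean(1) + b.mean(1)) / 2
    mse = (a.var(1, ddof=1) + b.var(1, ddof=1)) / 2
    edges = np.quantile(mean_expr, np.linspace(0, 1, 21)[1:-1])
    bins = np.digitize(mean_expr, edges)                  # 20 expression bins
    pooled = np.array([mse[bins == i].mean() for i in range(bins.max() + 1)])
    t_like = (a.mean(1) - b.mean(1)) / np.sqrt(pooled[bins] * 2 / k)
    ```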

  4. Random fractional ultrapulsed CO2 resurfacing of photodamaged facial skin: long-term evaluation.

    PubMed

    Tretti Clementoni, Matteo; Galimberti, Michela; Tourlaki, Athanasia; Catenacci, Maximilian; Lavagno, Rosalia; Bencini, Pier Luca

    2013-02-01

    Although numerous papers have recently been published on ablative fractional resurfacing, the literature lacks information on very long-term results. The aim of this retrospective study was to evaluate the efficacy, adverse side effects, and long-term results of a random fractional ultrapulsed CO2 laser on a large population with photodamaged facial skin. Three hundred twelve patients with facial photodamaged skin were enrolled and underwent a single full-face treatment. Six aspects of photodamaged skin were recorded using a 5-point scale at 3, 6, and 24 months after the treatment. The results were compared with a non-parametric statistical test, Wilcoxon's exact test. Three hundred one patients completed the study. All analyzed features showed a statistically significant improvement 3 months after the procedure. Three months later, all features except pigmentation once again showed a statistically significant improvement. Results after 24 months were similar to those assessed 18 months before. No long-term or other serious complications were observed. Given the significant number of patients analyzed, the long-term results demonstrate not only that fractional ultrapulsed CO2 resurfacing can achieve good results on photodamaged facial skin but also that these results can be considered stable 2 years after the procedure.

  5. Examining publication bias—a simulation-based evaluation of statistical tests on publication bias

    PubMed Central

    2017-01-01

    Background Publication bias is a form of scientific misconduct. It threatens the validity of research results and the credibility of science. Although several tests of publication bias exist, no in-depth evaluations are available that examine which test performs best for different research settings. Methods Four tests of publication bias, Egger's test (FAT), p-uniform, the test of excess significance (TES), and the caliper test (CT), were evaluated in a Monte Carlo simulation. Two types of publication bias and three degrees of bias (0%, 50%, 100%) were simulated. The type of publication bias was defined either as file-drawer, meaning the repeated analysis of new datasets, or p-hacking, meaning the inclusion of covariates in order to obtain a significant result. In addition, the underlying effect (β = 0, 0.5, 1, 1.5), effect heterogeneity, the number of observations in the simulated primary studies (N = 100, 500), and the number of observations for the publication bias tests (K = 100, 1,000) were varied. Results All tests evaluated were able to identify publication bias in both the file-drawer and p-hacking conditions. The false positive rates were, with the exception of the 15%- and 20%-caliper tests, unbiased. The FAT had the largest statistical power in the file-drawer conditions, whereas under p-hacking the TES was, except under effect heterogeneity, slightly better. The CTs were, however, inferior to the other tests under effect homogeneity and had decent statistical power only in conditions with 1,000 primary studies. Discussion The FAT is recommended as a test for publication bias in standard meta-analyses with no or only small effect heterogeneity. If two-sided publication bias is suspected, and also under p-hacking, the TES is the first alternative to the FAT. The 5%-caliper test is recommended under conditions of effect heterogeneity and a large number of primary studies, which may be found when publication bias is examined in a discipline-wide setting where primary studies cover different research problems. PMID:29204324
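
    Of the four tests, Egger's regression (the FAT) is the most compact to illustrate. A hedged sketch with made-up study effects: regress the standardized effects on precision and test whether the intercept departs from zero, which indicates funnel-plot asymmetry:

    ```python
    import numpy as np
    import statsmodels.api as sm

    effects = np.array([0.42, 0.51, 0.30, 0.65, 0.12, 0.55])   # illustrative
    se = np.array([0.20, 0.25, 0.10, 0.30, 0.08, 0.28])

    X = sm.add_constant(1.0 / se)             # intercept + precision
    fit = sm.OLS(effects / se, X).fit()
    print(fit.params[0], fit.pvalues[0])      # intercept term = asymmetry test
    ```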

  6. fMRI reliability: influences of task and experimental design.

    PubMed

    Bennett, Craig M; Miller, Michael B

    2013-12-01

    As scientists, it is imperative that we understand not only the power of our research tools to yield results, but also their ability to obtain similar results over time. This study is an investigation into how common decisions made during the design and analysis of a functional magnetic resonance imaging (fMRI) study can influence the reliability of the statistical results. To that end, we gathered back-to-back test-retest fMRI data during an experiment involving multiple cognitive tasks (episodic recognition and two-back working memory) and multiple fMRI experimental designs (block, event-related genetic sequence, and event-related m-sequence). Using these data, we were able to investigate the relative influences of task, design, statistical contrast (task vs. rest, target vs. nontarget), and statistical thresholding (unthresholded, thresholded) on fMRI reliability, as measured by the intraclass correlation (ICC) coefficient. We also utilized data from a second study to investigate test-retest reliability after an extended, six-month interval. We found that all of the factors above were statistically significant, but that they had varying levels of influence on the observed ICC values. We also found that these factors could interact, increasing or decreasing the relative reliability of certain Task × Design combinations. The results suggest that fMRI reliability is a complex construct whose value may be increased or decreased by specific combinations of factors.
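
    For readers wanting the underlying quantity, an ICC for test-retest agreement can be computed directly from the two-way mean squares. A minimal sketch of ICC(3,1) on hypothetical per-voxel values measured twice:

    ```python
    import numpy as np

    x = np.array([[2.1, 2.3], [1.8, 1.7], [2.6, 2.4], [3.0, 3.2], [2.2, 2.0]])
    n, k = x.shape                                    # targets, sessions
    ms_rows = k * x.mean(1).var(ddof=1)               # between-target mean square
    ms_err = ((x - x.mean(1, keepdims=True) - x.mean(0) + x.mean()) ** 2
              ).sum() / ((n - 1) * (k - 1))           # residual mean square
    icc31 = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    print(icc31)
    ```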

  7. An evaluation of grease-type ball bearing lubricants operation in various environments

    NASA Technical Reports Server (NTRS)

    Mcmurtrey, E. L.

    1983-01-01

    Because many future spacecraft or space stations will require mechanisms to operate for long periods of time in environments that are adverse to most bearing lubricants, a series of tests is continuing to evaluate 38 grease-type lubricants in R-4 size bearings in five different environments over a 1-year period. Four repetitions of each test are made to provide statistical samples. These tests have also been used to select four lubricants for 5-year tests in selected environments, with five repetitions of each test for statistical samples. At the present time, 142 test sets have been completed and 30 test sets are underway. The three 5-year tests, namely (1) continuous operation and (2) start-stop operation, both in vacuum at ambient temperature, and (3) continuous vacuum operation at 93.3°C, are now completed. To date, in both the 1-year and 5-year tests, the best results in all environments have been obtained with a high-viscosity-index perfluoroalkylpolyether (PFPE) grease.

  8. Providing peak river flow statistics and forecasting in the Niger River basin

    NASA Astrophysics Data System (ADS)

    Andersson, Jafet C. M.; Ali, Abdou; Arheimer, Berit; Gustafsson, David; Minoungou, Bernard

    2017-08-01

    Flooding is a growing concern in West Africa. Improved quantification of discharge extremes and their associated uncertainties is needed for infrastructure design, and operational forecasting is needed to provide timely warnings. In this study, we use discharge observations, a hydrological model (Niger-HYPE) and extreme value analysis to estimate peak river flow statistics (e.g. the discharge magnitude with a 100-year return period) across the Niger River basin. To test the model's capacity to predict peak flows, we compared 30-year maximum discharge and peak flow statistics derived from the model with those derived from nine observation stations. The results indicate that the model simulates peak discharge reasonably well (on average +20%). However, the peak flow statistics have a large uncertainty range, which ought to be considered in infrastructure design. We then applied the methodology to derive basin-wide maps of peak flow statistics and their associated uncertainty. The results indicate that the method is applicable across the hydrologically active part of the river basin, and that the uncertainty varies substantially depending on location. Subsequently, we used the most recent bias-corrected climate projections to analyze potential changes in peak flow statistics in a changed climate. The results are generally ambiguous, with consistent changes in only a very few areas. To test the forecasting capacity, we ran Niger-HYPE with a combination of meteorological data sets for the 2008 high-flow season and compared the output with observations. The results indicate reasonable forecasting capacity (on average 17% deviation), but additional years should also be evaluated. We finish by presenting a strategy and pilot project to develop an operational flood monitoring and forecasting system based on in-situ data, earth observations, modelling, and extreme value statistics. In this way we aim to build capacity to ultimately improve resilience to floods, protecting lives and infrastructure in the region.
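
    The extreme value step is conventional: fit a generalized extreme value (GEV) distribution to annual maxima and invert it at the desired exceedance probability. A hedged sketch with illustrative discharges, not Niger-HYPE output:

    ```python
    import numpy as np
    from scipy import stats

    annual_max = np.array([850, 920, 760, 1100, 980, 870, 1250, 800,
                           940, 1010, 890, 1180, 970, 830, 1060])   # m3/s
    shape, loc, scale = stats.genextreme.fit(annual_max)
    q100 = stats.genextreme.isf(1 / 100, shape, loc, scale)   # 100-year level
    print(f"100-year flow is roughly {q100:.0f} m3/s")
    ```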

  9. The results of STEM education methods for enhancing critical thinking and problem solving skill in physics the 10th grade level

    NASA Astrophysics Data System (ADS)

    Soros, P.; Ponkham, K.; Ekkapim, S.

    2018-01-01

    This research aimed to: 1) compare critical thinking and problem-solving skills before and after learning with the STEM Education plan, 2) compare student achievement before and after learning about force and the laws of motion with the STEM Education plan, and 3) assess satisfaction with learning through STEM Education. The sample consisted of 37 grade-10 students at Borabu School, Borabu District, Mahasarakham Province, semester 2, academic year 2016. The tools used in this study consisted of: 1) a STEM Education plan on force and the laws of motion for grade-10 students, one scheme totaling 14 hours; 2) a 30-item test of critical thinking and problem-solving skills with five-option and two-option multiple-choice items; 3) a 30-item multiple-choice achievement test on force and the laws of motion with four options; and 4) a 20-item satisfaction questionnaire with a 5-point rating scale. The statistics used in data analysis were percentage, mean, standard deviation, and the dependent t-test. The results showed that 1) students learning with the STEM Education plan scored higher on critical thinking and problem-solving skills on the post-test than on the pre-test, statistically significant at the .01 level; 2) students learning with the STEM Education plan had higher achievement scores on the post-test than on the pre-test, statistically significant at the .01 level; and 3) the students' satisfaction with learning through the STEM Education plan was at a high level (mean = 4.51, S.D. = 0.56).
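
    The dependent t-test used above is the standard paired comparison of pre- and post-test scores; a minimal sketch with hypothetical scores:

    ```python
    from scipy import stats

    pre = [12, 15, 14, 10, 18, 13]
    post = [18, 20, 17, 15, 24, 19]
    print(stats.ttest_rel(post, pre))    # paired (dependent) t-test
    ```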

  10. Mode-Stirred Method Implementation for HIRF Susceptibility Testing and Results Comparison with Anechoic Method

    NASA Technical Reports Server (NTRS)

    Nguyen, Truong X.; Ely, Jay J.; Koppen, Sandra V.

    2001-01-01

    This paper describes the implementation of the mode-stirred method for susceptibility testing according to the current DO-160D standard. Test results on an Engine Data Processor using the implemented procedure, and comparisons with the standard anechoic test results, are presented. The comparison shows experimentally that the susceptibility thresholds found with the mode-stirred method are consistently higher than with the anechoic method. This is consistent with the recent statistical analysis finding by NIST that the current calibration procedure overstates field strength by a fixed amount. Once the test results are adjusted for this value, the comparisons with the anechoic results are excellent. The results also show that the test method has excellent chamber-to-chamber repeatability. Several areas for improvement to the current procedure are also identified and implemented.

  11. A power comparison of generalized additive models and the spatial scan statistic in a case-control setting

    PubMed Central

    2010-01-01

    Background A common, important problem in spatial epidemiology is measuring and identifying variation in disease risk across a study region. In application of statistical methods, the problem has two parts. First, spatial variation in risk must be detected across the study region and, second, areas of increased or decreased risk must be correctly identified. The location of such areas may give clues to environmental sources of exposure and disease etiology. One statistical method applicable in spatial epidemiologic settings is a generalized additive model (GAM), which can be applied with a bivariate LOESS smoother to account for geographic location as a possible predictor of disease status. A natural hypothesis when applying this method is whether residential location of subjects is associated with the outcome, i.e., is the smoothing term necessary? Permutation tests are a reasonable hypothesis testing method and provide adequate power under a simple alternative hypothesis. These tests have yet to be compared to other spatial statistics. Results This research uses simulated point data generated under three alternative hypotheses to evaluate the properties of the permutation methods and compare them to the popular spatial scan statistic in a case-control setting. Case 1 was a single circular cluster centered in a circular study region. The spatial scan statistic had the highest power, though the GAM method estimates did not fall far behind. Case 2 was a single point source located at the center of a circular cluster, and Case 3 was a line source at the center of the horizontal axis of a square study region. Each had linearly decreasing log-odds with distance from the source. The GAM methods outperformed the scan statistic in Cases 2 and 3. Comparing sensitivity, measured as the proportion of the exposure source correctly identified as high or low risk, the GAM methods outperformed the scan statistic in all three cases. Conclusions The GAM permutation testing methods provide a regression-based alternative to the spatial scan statistic. Across all hypotheses examined in this research, the GAM methods had competing or greater power estimates and sensitivities exceeding those of the spatial scan statistic. PMID:20642827

  12. A Powerful Test for Comparing Multiple Regression Functions.

    PubMed

    Maity, Arnab

    2012-09-01

    In this article, we address the important problem of comparison of two or more population regression functions. Recently, Pardo-Fernández, Van Keilegom and González-Manteiga (2007) developed test statistics for simple nonparametric regression models: Y(ij) = θ(j)(Z(ij)) + σ(j)(Z(ij))∊(ij), based on the empirical distributions of the errors in each population j = 1, …, J. In this paper, we propose a test for equality of the θ(j)(·) based on the concept of generalized likelihood ratio type statistics. We also generalize our test to other nonparametric regression setups, e.g., nonparametric logistic regression, where the log-likelihood for population j is any general smooth function [Formula: see text]. We describe a resampling procedure to obtain the critical values of the test. In addition, we present a simulation study to evaluate the performance of the proposed test and compare our results to those in Pardo-Fernández et al. (2007).

  13. Quality control analysis : part I : asphaltic concrete.

    DOT National Transportation Integrated Search

    1964-11-01

    This report deals with the statistical evaluation of results from several hot-mix plants to determine the pattern of variability with respect to bituminous hot-mix characteristics. : Individual test results when subjected to frequency distribution i...

  14. Measuring the Sensitivity of Single-locus “Neutrality Tests” Using a Direct Perturbation Approach

    PubMed Central

    Garrigan, Daniel; Lewontin, Richard; Wakeley, John

    2010-01-01

    A large number of statistical tests have been proposed to detect natural selection based on a sample of variation at a single genetic locus. These tests measure the deviation of the allelic frequency distribution observed within populations from the distribution expected under a set of assumptions that includes both neutral evolution and equilibrium population demography. The present study considers a new way to assess the statistical properties of these tests of selection, by their behavior in response to direct perturbations of the steady-state allelic frequency distribution, unconstrained by any particular nonequilibrium demographic scenario. Results from Monte Carlo computer simulations indicate that most tests of selection are more sensitive to perturbations of the allele frequency distribution that increase the variance in allele frequencies than to perturbations that decrease the variance. Simulations also demonstrate that it requires, on average, 4N generations (N is the diploid effective population size) for tests of selection to relax to their theoretical, steady-state distributions following different perturbations of the allele frequency distribution to its extremes. This relatively long relaxation time highlights the fact that these tests are not robust to violations of the other assumptions of the null model besides neutrality. Lastly, genetic variation arising under an example of a regularly cycling demographic scenario is simulated. Tests of selection performed on this last set of simulated data confirm the confounding nature of these tests for the inference of natural selection, under a demographic scenario that likely holds for many species. The utility of using empirical, genomic distributions of test statistics, instead of the theoretical steady-state distribution, is discussed as an alternative for improving the statistical inference of natural selection. PMID:19744997

  15. ArraySolver: an algorithm for colour-coded graphical display and Wilcoxon signed-rank statistics for comparing microarray gene expression data.

    PubMed

    Khan, Haseeb Ahmad

    2004-01-01

    The massive surge in the production of microarray data poses a great challenge for proper analysis and interpretation. In recent years numerous computational tools have been developed to extract meaningful interpretations of microarray gene expression data. However, a convenient tool for two-group comparison of microarray data is still lacking, and users have to rely on commercial statistical packages that might be costly and require special skills, in addition to extra time and effort for transferring data from one platform to another. Various statistical methods, including the t-test, analysis of variance, Pearson test and Mann-Whitney U test, have been reported for comparing microarray data, whereas the utilization of the Wilcoxon signed-rank test, which is an appropriate test for two-group comparison of gene expression data, has largely been neglected in microarray studies. The aim of this investigation was to build an integrated tool, ArraySolver, for colour-coded graphical display and comparison of gene expression data using the Wilcoxon signed-rank test. The results of software validation showed similar outputs from ArraySolver and SPSS for large datasets, whereas the former program appeared to be more accurate for 25 or fewer pairs (n ≤ 25), suggesting its potential application in analysing molecular signatures, which usually contain small numbers of genes. The main advantages of ArraySolver are easy data selection, a convenient report format, accurate statistics and the familiar Excel platform.

  17. Echo Statistics of Aggregations of Scatterers in a Random Waveguide: Application to Biologic Sonar Clutter

    DTIC Science & Technology

    2012-09-01

    used in this paper to compare probability density functions, the Lilliefors test and the Kullback-Leibler distance. The Lilliefors test is a goodness ... of interest in this study are the Rayleigh distribution and the exponential distribution. The Lilliefors test is used to test goodness-of-fit for...Lilliefors test for goodness of fit with an exponential distribution. These results suggest that,
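
    Both quantities are available in standard Python packages. A hedged sketch, with synthetic envelope data standing in for the echo statistics: a Lilliefors goodness-of-fit test against an exponential with estimated scale, and a discrete Kullback-Leibler distance between two binned densities:

    ```python
    import numpy as np
    from scipy import stats
    from statsmodels.stats.diagnostic import lilliefors

    x = stats.expon.rvs(scale=2.0, size=200, random_state=0)
    print(lilliefors(x, dist="exp"))          # (KS statistic, p-value)

    p_hist = np.array([0.5, 0.3, 0.2])        # binned density 1
    q_hist = np.array([0.4, 0.4, 0.2])        # binned density 2
    print(stats.entropy(p_hist, q_hist))      # KL distance D(p || q)
    ```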

  18. Evaluation of nitrous oxide as a substitute for sulfur hexafluoride to reduce global warming impacts of ANSI/HPS N13.1 gaseous uniformity testing

    NASA Astrophysics Data System (ADS)

    Yu, Xiao-Ying; Barnett, J. Matthew; Amidan, Brett G.; Recknagle, Kurtis P.; Flaherty, Julia E.; Antonio, Ernest J.; Glissmeyer, John A.

    2018-03-01

    The ANSI/HPS N13.1-2011 standard requires gaseous tracer uniformity testing for sampling associated with stacks used in radioactive air emissions. Sulfur hexafluoride (SF6), a greenhouse gas with a high global warming potential, has long been the gas tracer used in such testing. To reduce the impact of gas tracer tests on the environment, nitrous oxide (N2O) was evaluated as a potential replacement for SF6. The physical evaluation included the development of a test plan to record the percent coefficient of variance and the percent maximum deviation between the two gases while considering variables such as fan configuration, injection position, and flow rate. Statistical power was calculated to determine how many sample sets were needed, and computational fluid dynamics modeling was utilized to estimate overall mixing in stacks. Results show there are no significant differences between the behaviors of the two gases, and SF6 modeling corroborated the N2O test results. Although, in principle, all tracer gases should behave in an identical manner for measuring mixing within a stack, the series of physical tests guided by statistics was performed to demonstrate the equivalence of N2O testing to SF6 testing in the context of stack qualification tests. The results demonstrate that N2O is a viable choice, leading to a fourfold reduction in global warming impacts for future similar compliance-driven testing.
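
    The two uniformity metrics named in the standard are simple to compute; a minimal sketch on hypothetical tracer concentrations sampled across a stack traverse:

    ```python
    import numpy as np

    c = np.array([101.0, 98.5, 103.2, 99.1, 100.7, 97.8])   # tracer readings
    pct_cov = 100.0 * c.std(ddof=1) / c.mean()              # % coefficient of variance
    pct_max_dev = 100.0 * np.abs(c - c.mean()).max() / c.mean()
    print(pct_cov, pct_max_dev)
    ```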

  19. Forecasting volatility with neural regression: a contribution to model adequacy.

    PubMed

    Refenes, A N; Holt, W T

    2001-01-01

    Neural nets' usefulness for forecasting is limited by problems of overfitting and the lack of rigorous procedures for model identification, selection and adequacy testing. This paper describes a methodology for neural model misspecification testing. We introduce a generalization of the Durbin-Watson statistic for neural regression and discuss the general issues of misspecification testing using residual analysis. We derive a generalized influence matrix for neural estimators which enables us to evaluate the distribution of the statistic. We deploy Monte Carlo simulation to compare the power of the test for neural and linear regressors. While residual testing is not a sufficient condition for model adequacy, it is nevertheless a necessary condition to demonstrate that the model is a good approximation to the data generating process, particularly as neural-network estimation procedures are susceptible to partial convergence. The work is also an important step toward developing rigorous procedures for neural model identification, selection and adequacy testing which have started to appear in the literature. We demonstrate its applicability in the nontrivial problem of forecasting implied volatility innovations using high-frequency stock index options. Each step of the model building process is validated using statistical tests to verify variable significance and model adequacy with the results confirming the presence of nonlinear relationships in implied volatility innovations.
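
    The classical statistic being generalized is easy to state: DW = sum (e_t - e_{t-1})^2 / sum e_t^2, with values near 2 indicating no first-order residual autocorrelation. A minimal sketch on illustrative residuals:

    ```python
    import numpy as np
    from statsmodels.stats.stattools import durbin_watson

    resid = np.array([0.3, -0.1, 0.2, -0.4, 0.1, 0.05, -0.2])
    print(durbin_watson(resid))   # near 2 => little first-order autocorrelation
    ```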

  20. A new test of multivariate nonlinear causality

    PubMed Central

    Bai, Zhidong; Jiang, Dandan; Lv, Zhihui; Wong, Wing-Keung; Zheng, Shurong

    2018-01-01

    The multivariate nonlinear Granger causality test developed by Bai et al. (2010) (Mathematics and Computers in Simulation. 2010; 81: 5-17) plays an important role in detecting dynamic interrelationships between two groups of variables. Following the idea of the Hiemstra-Jones (HJ) test proposed by Hiemstra and Jones (1994) (Journal of Finance. 1994; 49(5): 1639-1664), they attempted to establish a central limit theorem (CLT) for their test statistic by applying the asymptotic properties of multivariate U-statistics. However, Bai et al. (2016) (arXiv: 1701.03992) revisited the HJ test and found that the test statistic given by HJ is not a function of U-statistics, which implies that neither the CLT proposed by Hiemstra and Jones (1994) nor the one extended by Bai et al. (2010) is valid for statistical inference. In this paper, we re-estimate the probabilities and re-establish the CLT of the new test statistic. Numerical simulation shows that our new estimates are consistent and that our new test has decent size and power. PMID:29304085

  2. Ensuring Positiveness of the Scaled Difference Chi-square Test Statistic.

    PubMed

    Satorra, Albert; Bentler, Peter M

    2010-06-01

    A scaled difference test statistic [Formula: see text] that can be computed by hand from the standard output of structural equation modeling (SEM) software was proposed in Satorra and Bentler (2001). The statistic [Formula: see text] is asymptotically equivalent to the scaled difference test statistic T̄(d) introduced in Satorra (2000), which requires more involved computations beyond the standard output of SEM software. The test statistic [Formula: see text] has been widely used in practice, but in some applications it is negative due to negativity of its associated scaling correction. Using the implicit function theorem, this note develops an improved scaling correction leading to a new scaled difference statistic that avoids negative chi-square values.
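
    The hand calculation referred to is the Satorra-Bentler (2001) recipe: given ML chi-squares T0 and T1, scaling corrections c0 and c1, and degrees of freedom d0 and d1 for the nested and comparison models, the scaled difference is (T0 - T1)/cd with cd = (d0*c0 - d1*c1)/(d0 - d1). A sketch with illustrative values shows where negativity can arise:

    ```python
    def scaled_difference(T0, c0, d0, T1, c1, d1):
        cd = (d0 * c0 - d1 * c1) / (d0 - d1)   # difference scaling correction
        return (T0 - T1) / cd                  # negative whenever cd < 0

    print(scaled_difference(T0=120.4, c0=1.30, d0=48, T1=95.2, c1=1.25, d1=44))
    ```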

  3. Research design and statistical methods in Pakistan Journal of Medical Sciences (PJMS).

    PubMed

    Akhtar, Sohail; Shah, Syed Wadood Ali; Rafiq, M; Khan, Ajmal

    2016-01-01

    This article compares the study designs and statistical methods used in the 2005, 2010 and 2015 volumes of the Pakistan Journal of Medical Sciences (PJMS). Only original articles of PJMS were considered for the analysis. The articles were carefully reviewed for statistical methods and designs, and then recorded accordingly. The frequency of each statistical method and research design was estimated and compared with previous years. A total of 429 articles were evaluated (n=74 in 2005, n=179 in 2010, n=176 in 2015), of which 171 (40%) were cross-sectional and 116 (27%) were prospective study designs. A variety of statistical methods was found in the analysis. The most frequent methods included descriptive statistics (n=315, 73.4%), chi-square/Fisher's exact tests (n=205, 47.8%) and Student's t-test (n=186, 43.4%). There was a significant increase in the use of several statistical methods over the time period: the t-test, chi-square/Fisher's exact test, logistic regression, epidemiological statistics, and non-parametric tests. This study shows that a diverse variety of statistical methods has been used in the research articles of PJMS and that their frequency increased from 2005 to 2015, with descriptive statistics the most frequent method of statistical analysis and the cross-sectional design the most common study design.

  4. Evaluation of the performance of statistical tests used in making cleanup decisions at Superfund sites. Part 1: Choosing an appropriate statistical test

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Berman, D.W.; Allen, B.C.; Van Landingham, C.B.

    1998-12-31

    The decision rules commonly employed to determine the need for cleanup are evaluated both to identify conditions under which they lead to erroneous conclusions and to quantify the rate at which such errors occur. Their performance is also compared with that of other applicable decision rules. The authors based the evaluation of decision rules on simulations. Results are presented as power curves. These curves demonstrate that the degree of statistical control achieved is independent of the form of the null hypothesis. The loss of statistical control that occurs when a decision rule is applied to a data set that does not satisfy the rule's validity criteria is also clearly demonstrated. Some of the rules evaluated do not offer the formal statistical control that is an inherent design feature of other rules. Nevertheless, results indicate that such informal decision rules may provide superior overall control of error rates when their application is restricted to data exhibiting particular characteristics. The results reported here are limited to decision rules applied to uncensored and lognormally distributed data. To optimize decision rules, it is necessary to evaluate their behavior when applied to data exhibiting a range of characteristics that bracket those common to field data. The performance of decision rules applied to data sets exhibiting a broader range of characteristics is reported in the second paper of this study.

  5. [Effect sizes, statistical power and sample sizes in "the Japanese Journal of Psychology"].

    PubMed

    Suzukawa, Yumi; Toyoda, Hideki

    2012-04-01

    This study analyzed the statistical power of research studies published in the Japanese Journal of Psychology in 2008 and 2009. Sample effect sizes and sample statistical powers were calculated for each statistical test and analyzed with respect to the analytical methods and the fields of the studies. The results show that in fields such as perception, cognition or learning, the effect sizes were relatively large although the sample sizes were small. At the same time, because of the small sample sizes, some meaningful effects could not be detected. In the other fields, because of the large sample sizes, meaningless effects could be detected. This implies that researchers who could not obtain large enough effect sizes would use larger samples to obtain significant results.
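
    Post-hoc power of the kind tabulated in such surveys can be computed with statsmodels; a minimal sketch for a two-sample t-test with a medium sample effect size:

    ```python
    from statsmodels.stats.power import TTestIndPower

    power = TTestIndPower().power(effect_size=0.5, nobs1=20, alpha=0.05)
    print(power)    # with n = 20 per group, a medium effect is underpowered
    ```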

  6. [The application of the prospective space-time statistic in early warning of infectious disease].

    PubMed

    Yin, Fei; Li, Xiao-Song; Feng, Zi-Jian; Ma, Jia-Qi

    2007-06-01

    To investigate the application of the prospective space-time scan statistic in the early detection of infectious disease outbreaks. The prospective space-time scan statistic was tested by mimicking daily prospective analyses of bacillary dysentery data from Chengdu city in 2005 (3212 cases in 102 towns and villages), and the results were compared with those of the purely temporal scan statistic. The prospective space-time scan statistic could give specific signals in both space and time. The results for June indicated that the prospective space-time scan statistic could promptly detect the outbreak that started from a local site, and the early-warning signal was strong (P = 0.007), whereas the purely temporal scan statistic detected the outbreak two days later and its signal was weaker (P = 0.039). The prospective space-time scan statistic makes full use of the spatial and temporal information in infectious disease data and can promptly and effectively detect outbreaks that start from local sites. It could be an important tool for local and national CDCs in setting up early-detection surveillance systems.

  7. Results of the Verification of the Statistical Distribution Model of Microseismicity Emission Characteristics

    NASA Astrophysics Data System (ADS)

    Cianciara, Aleksander

    2016-09-01

    The paper presents the results of research aimed at verifying the hypothesis that the Weibull distribution is an appropriate statistical model of microseismic emission characteristics, namely the energy of phenomena and the inter-event time. It is understood that the emission under consideration is induced by natural rock mass fracturing. Because the recorded emission contains noise, it is subjected to appropriate filtering. The study was conducted using statistical verification of the null hypothesis that the Weibull distribution fits the empirical cumulative distribution function. As the model describing the cumulative distribution function is given in analytical form, its verification may be performed using the Kolmogorov-Smirnov goodness-of-fit test. Interpretations by means of probabilistic methods require specifying a correct model of the statistical distribution of the data, because these methods do not use the measurement data directly but rather their statistical distributions, as in methods based on hazard analysis or those using maximum-value statistics.
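
    The verification step translates directly into code: fit a Weibull to the (filtered) observations and run a Kolmogorov-Smirnov test against the fitted CDF. A hedged sketch on synthetic inter-event times (note that estimating parameters from the same sample makes the tabulated KS p-value approximate):

    ```python
    from scipy import stats

    times = stats.weibull_min.rvs(1.4, scale=60.0, size=300, random_state=0)
    c, loc, scale = stats.weibull_min.fit(times, floc=0)   # location fixed at 0
    print(stats.kstest(times, "weibull_min", args=(c, loc, scale)))
    ```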

  8. The Antinociceptive Effects of Hydroalcoholic Extract of Borago Officinalis Flower in Male Rats Using Formalin Test.

    PubMed

    Shahraki, Mohammad Reza; Ahmadimoghadm, Mahdieh; Shahraki, Ahmad Reza

    2015-10-01

    Borago officinalis flower (borage) is known as a sedative in herbal medicine; the aim of the present study was to evaluate the antinociceptive effect of borage hydroalcoholic extract in male rats using the formalin test. Fifty-six adult male albino Wistar rats were randomly divided into seven groups: control groups A (intact), B (saline), and C (positive control), plus test groups D, E, F, and G (n=8). Groups D, E, and F received 6.25, 12.5, and 25 mg/kg of Borago officinalis flower hydroalcoholic extract before the test, respectively, while group G received 25 mg/kg of borage extract plus aspirin before the test. Biphasic pain was induced by injection of 1% formalin. The data were analyzed with SPSS ver. 17 using the Kruskal-Wallis and Mann-Whitney tests; results are expressed as mean±SD, with P<0.05 considered statistically significant. The results revealed that acute and chronic pain behavior scores in test groups D, E, F, and G decreased significantly compared with groups A and B, but did not differ from group C. Moreover, the chronic pain behavior score in group G was significantly lower than in all other groups. The results indicate that Borago officinalis hydroalcoholic extract affects acute and chronic pain behavior responses in the formalin test in male rats.

  9. 2005 8th Annual Systems Engineering Conference. Volume 4, Thursday

    DTIC Science & Technology

    2005-10-27

    requirements, allocation, and utilization statistics Operations Decisions Acquisition Decisions Resource Management — Integrated Requirements/ Allocation ...Quality Improvement Consultants, Inc. "Automated Software Testing Increases Test Quality and Coverage Resulting in Improved Software Reliability," Mr...Steven Ligon, SAIC The Return of Discipline, Ms. Jacqueline Townsend, Air Force Materiel Command Track 4 - Net Centric Operations: Testing Net-Centric

  10. Comparability of Computer- and Paper-Administered Multiple-Choice Tests for K-12 Populations: A Synthesis

    ERIC Educational Resources Information Center

    Kingston, Neal M.

    2009-01-01

    There have been many studies of the comparability of computer-administered and paper-administered tests. Not surprisingly (given the variety of measurement and statistical sampling issues that can affect any one study) the results of such studies have not always been consistent. Moreover, the quality of computer-based test administration systems…

  11. Developing a Test for Assessing Elementary Students' Comprehension of Science Texts

    ERIC Educational Resources Information Center

    Wang, Jing-Ru; Chen, Shin-Feng; Tsay, Reuy-Fen; Chou, Ching-Ting; Lin, Sheau-Wen; Kao, Huey-Lien

    2012-01-01

    This study reports on the process of developing a test to assess students' reading comprehension of scientific materials and on the statistical results of the verification study. A combination of classic test theory and item response theory approaches was used to analyze the assessment data from a verification study. Data analysis indicates the…

  12. Enrichment analysis in high-throughput genomics - accounting for dependency in the NULL.

    PubMed

    Gold, David L; Coombes, Kevin R; Wang, Jing; Mallick, Bani

    2007-03-01

    Translating the overwhelming amount of data generated in high-throughput genomics experiments into biologically meaningful evidence, which may for example point to a series of biomarkers or hint at a relevant pathway, is a matter of great interest in bioinformatics these days. Genes showing similar experimental profiles, it is hypothesized, share biological mechanisms that, if understood, could provide clues to the molecular processes leading to pathological events. It is the topic of further study to learn if or how a priori information about known genes may serve to explain coexpression. One popular method of knowledge discovery in high-throughput genomics experiments, enrichment analysis (EA), seeks to infer whether an interesting collection of genes is 'enriched' for a particular set of a priori Gene Ontology Consortium (GO) classes. For the purposes of statistical testing, the conventional methods offered in EA software implicitly assume independence between the GO classes. Genes may be annotated to more than one biological classification, and therefore the resulting test statistics of enrichment between GO classes can be highly dependent if the overlapping gene sets are relatively large. There is a need to formally determine whether conventional EA results are robust to the independence assumption. We derive the exact null distribution for testing enrichment of GO classes by relaxing the independence assumption using well-known statistical theory. In applications with publicly available data sets, our test results are similar to those of the conventional approach that assumes independence. We argue that the independence assumption is not detrimental.
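
    The conventional, independence-assuming EA test that the paper relaxes is the hypergeometric tail probability; a minimal sketch with hypothetical counts for one GO class:

    ```python
    from scipy import stats

    N, K = 20000, 300    # genes on the array; genes annotated to the class
    n, k = 150, 12       # genes in the interesting list; overlap with class
    print(stats.hypergeom.sf(k - 1, N, K, n))   # P(overlap >= k) under H0
    ```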

  13. Impact of syncope on quality of life: validation of a measure in patients undergoing tilt testing.

    PubMed

    Nave-Leal, Elisabete; Oliveira, Mário; Pais-Ribeiro, José; Santos, Sofia; Oliveira, Eunice; Alves, Teresa; Cruz Ferreira, Rui

    2015-03-01

    Recurrent syncope has a significant impact on quality of life. The development of measurement scales to assess this impact that are easy to use in clinical settings is crucial. The objective of the present study is a preliminary validation of the Impact of Syncope on Quality of Life questionnaire for the Portuguese population. The instrument underwent a process of translation, validation, analysis of cultural appropriateness and cognitive debriefing. A population of 39 patients with a history of recurrent syncope (>1 year) who underwent tilt testing, aged 52.1 ± 16.4 years (21-83), 43.5% male, most in active employment (n=18) or retired (n=13), constituted a convenience sample. The resulting Portuguese version is similar to the original, with 12 items in a single aggregate score, and underwent statistical validation, with assessment of reliability, validity and stability over time. With regard to reliability, the internal consistency of the scale is 0.9. Assessment of convergent and discriminant validity showed statistically significant results (p<0.01). Regarding stability over time, a test-retest of this instrument at six months after tilt testing with 22 patients of the sample who had not undergone any clinical intervention found no statistically significant changes in quality of life. The results indicate that this instrument is of value for assessing quality of life in patients with recurrent syncope in Portugal. Copyright © 2014 Sociedade Portuguesa de Cardiologia. Published by Elsevier España. All rights reserved.

  14. Incorporation of operator knowledge for improved HMDS GPR classification

    NASA Astrophysics Data System (ADS)

    Kennedy, Levi; McClelland, Jessee R.; Walters, Joshua R.

    2012-06-01

    The Husky Mine Detection System (HMDS) detects and alerts operators to potential threats observed in ground-penetrating radar (GPR) data. In the current system architecture, the classifiers have been trained using available data from multiple training sites. Changes in target types, clutter types, and operational conditions may result in statistical differences between the training data and the testing data for the underlying features used by the classifier, potentially resulting in an increased false alarm rate or a lower probability of detection for the system. In the current mode of operation, the automated detection system alerts the human operator when a target-like object is detected. The operator then uses data visualization software, contextual information, and human intuition to decide whether the alarm presented is an actual target or a false alarm. When the statistics of the training data and the testing data are mismatched, the automated detection system can overwhelm the analyst with an excessive number of false alarms. This is evident in the performance of, and the data collected from, deployed systems. This work demonstrates that analyst feedback can be successfully used to re-train a classifier to account for variable testing data statistics not originally captured in the initial training data.
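
    A minimal sketch of the feedback loop described above, assuming an incremental linear classifier (scikit-learn's SGDClassifier) and hypothetical GPR-derived features; the actual HMDS classifier and feature set are not specified here:

```python
# Operator-adjudicated alarms become new labeled examples that are folded
# back into the classifier with an incremental update (partial_fit).
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 8))             # GPR features (hypothetical)
y_train = (X_train[:, 0] > 0.5).astype(int)     # 1 = target, 0 = clutter

clf = SGDClassifier(loss="log_loss", random_state=0)
clf.fit(X_train, y_train)

# At a deployed site the data statistics drift; the operator labels alarms.
X_feedback = rng.normal(loc=0.3, size=(20, 8))     # features of presented alarms
y_feedback = (X_feedback[:, 0] > 0.8).astype(int)  # operator adjudications
clf.partial_fit(X_feedback, y_feedback)            # incremental re-training
```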

  15. ArrayVigil: a methodology for statistical comparison of gene signatures using segregated-one-tailed (SOT) Wilcoxon's signed-rank test.

    PubMed

    Khan, Haseeb Ahmad

    2005-01-28

    Owing to their versatile diagnostic and prognostic fidelity, molecular signatures or fingerprints are anticipated to be the most powerful tools for cancer management in the near future. Notwithstanding the experimental advancements in microarray technology, methods for analyzing either whole arrays or gene signatures have not been firmly established. Recently, an algorithm, ArraySolver, was reported by Khan for two-group comparison of microarray gene expression data using the two-tailed Wilcoxon signed-rank test. Most molecular signatures are composed of two sets of genes (hybrid signatures) wherein up-regulation of one set and down-regulation of the other collectively define the purpose of the signature. Since the direction of a selected gene's expression (positive or negative) with respect to a particular disease condition is known, application of one-tailed statistics could be a more relevant choice. A novel method, ArrayVigil, is described for comparing hybrid signatures using a segregated-one-tailed (SOT) Wilcoxon signed-rank test, and the results are compared with integrated-two-tailed (ITT) procedures (SPSS and ArraySolver). ArrayVigil resulted in lower P values than those obtained from ITT statistics when comparing real data from four signatures.
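
    The SOT idea can be sketched as follows: split the hybrid signature into its up- and down-regulated gene sets and apply a one-tailed Wilcoxon signed-rank test to each set in its expected direction. A minimal illustration with simulated values, not the ArrayVigil code:

```python
# Segregated-one-tailed (SOT) testing of a hybrid signature: each subset is
# tested one-tailed in the direction its genes are expected to move.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(1)
case_up, ctrl_up = rng.normal(1.0, 1, 25), rng.normal(0, 1, 25)   # up-regulated set
case_dn, ctrl_dn = rng.normal(-1.0, 1, 25), rng.normal(0, 1, 25)  # down-regulated set

p_up = wilcoxon(case_up, ctrl_up, alternative="greater").pvalue   # expect increase
p_dn = wilcoxon(case_dn, ctrl_dn, alternative="less").pvalue      # expect decrease
print(f"one-tailed P, up set: {p_up:.4f}; one-tailed P, down set: {p_dn:.4f}")
```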

  16. Vestibular evoked myogenic potentials and digital vectoelectronystagmography findings in patients with benign paroxysmal positional vertigo.

    PubMed

    Maria da Silva Lira-Batista, Marta; Schaffeln Dorigueto, Ricardo; Freitas Ganança, Cristina

    2013-04-01

    Benign Paroxysmal Positional Vertigo (BPPV) is a very common vestibular disorder characterized by brief but intense attacks of rotatory vertigo triggered by simple rapid movements of the head. The integrity of the vestibular pathways can be assessed using tests such as digital vectoelectronystagmography (VENG) and vestibular evoked myogenic potentials (VEMP). This study aimed to determine the VEMP findings with respect to latency, amplitude, and peak-to-peak waveform, and the results of the oculomotor and vestibular components of VENG, in patients with BPPV. Although this otoneurological condition is quite common, little is known of the associated VEMP and VENG changes, making it important to research and describe these results. We examined the records of 4438 patients and selected 35 charts after applying the inclusion and exclusion criteria. Of these, 26 patients were women and 9 were men. The average age at diagnosis was 52.7 years, and the most prevalent physiological cause, accounting for 97.3% of cases, was ductolithiasis. There was a statistically significant association between normal hearing and mild contralateral sensorineural hearing loss. The results of the oculomotor tests were within the normal reference ranges for all subjects. Patients with BPPV exhibited symmetrical function of the semicircular canals in their synergistic pairs (p < 0.001). The caloric test showed statistically normal responses from the lateral canals. The waveforms of all patients were adequate, but the VEMP results for the data-crossing maneuver with positive positioning showed a trend toward a relationship for the left ear Lp13. There was also a trend towards an association between normal reflexes in the caloric test and the inter-peak VEMP of the left ear. It can be concluded that although there are some differences between the average levels of the VENG and VEMP results, these differences were not statistically significant. In conclusion, the results of audiologic assessment, hearing thresholds, positioning maneuvers, and caloric tests have no effect on the quantitative results of VEMP. Additional research is warranted to establish the relationships among VENG, VEMP, and BPPV, especially as concerns the oculomotor tests.

  17. The use and misuse of statistical methodologies in pharmacology research.

    PubMed

    Marino, Michael J

    2014-01-01

    Descriptive, exploratory, and inferential statistics are necessary components of hypothesis-driven biomedical research. Despite the ubiquitous need for these tools, the emphasis on statistical methods in pharmacology has become dominated by inferential methods, often chosen more for the availability of user-friendly software than for any understanding of the data set or the critical assumptions of the statistical tests. Such frank misuse of statistical methodology, and the quest to reach the mystical α<0.05 criterion, has hampered research via the publication of incorrect analyses driven by rudimentary statistical training. Perhaps more critically, a poor understanding of statistical tools limits the conclusions that may be drawn from a study by divorcing investigators from their own data. The net result is a decrease in the quality of, and confidence in, research findings, fueling recent controversies over the reproducibility of high-profile findings and effects that appear to diminish over time. The recent development of "omics" approaches, leading to the production of massive higher-dimensional data sets, has amplified these issues, making it clear that new approaches are needed to appropriately and effectively mine this type of data. Unfortunately, statistical education in the field has not kept pace. This commentary provides a foundation for an intuitive understanding of statistics that fosters an exploratory approach and an appreciation for the assumptions of various statistical tests, in the hope of increasing the correct use of statistics, the application of exploratory data analysis, and the use of statistical study design, with the goal of increasing reproducibility and confidence in the literature. Copyright © 2013. Published by Elsevier Inc.

  18. Extractive-spectrophotometric determination of disopyramide and irbesartan in their pharmaceutical formulation

    NASA Astrophysics Data System (ADS)

    Abdellatef, Hisham E.

    2007-04-01

    Picric acid, bromocresol green, bromothymol blue, cobalt thiocyanate and molybdenum(V) thiocyanate have been tested as spectrophotometric reagents for the determination of disopyramide and irbesartan. Reaction conditions have been optimized to obtain coloured complexes of higher sensitivity and longer stability. The absorbance of the ion-pair complexes formed was found to increase linearly with increasing concentrations of disopyramide and irbesartan, as corroborated by correlation coefficient values. The developed methods have been successfully applied to the determination of disopyramide and irbesartan in bulk drugs and pharmaceutical formulations. The common excipients and additives did not interfere with their determination. The results obtained by the proposed methods have been statistically compared by means of the Student's t-test and the variance-ratio F-test. The validity was assessed by applying the standard addition technique. The results were compared statistically with the official or reference methods, showing good agreement with high precision and accuracy.
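
    The method-comparison statistics mentioned above can be illustrated as follows; the percent-recovery values are made up for the example, standing in for the real determinations:

```python
# Student's t-test compares mean recoveries of the proposed and reference
# methods (accuracy); a variance-ratio F-test compares their precisions.
import numpy as np
from scipy import stats

proposed  = np.array([99.2, 100.4, 98.7, 101.1, 99.8, 100.6])  # % recovery (illustrative)
reference = np.array([99.5, 100.1, 99.0, 100.8, 99.6])

t_stat, t_p = stats.ttest_ind(proposed, reference)

f_stat = np.var(proposed, ddof=1) / np.var(reference, ddof=1)
dfn, dfd = len(proposed) - 1, len(reference) - 1
f_p = 2 * min(stats.f.sf(f_stat, dfn, dfd), stats.f.cdf(f_stat, dfn, dfd))

print(f"t = {t_stat:.3f} (p = {t_p:.3f}); F = {f_stat:.3f} (p = {f_p:.3f})")
```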

  19. Transient evoked oto-acoustic emission screening in newborns in Bogotá, Colombia: a retrospective study.

    PubMed

    Rojas, Jorge A; Bernal, Jaime E; García, Mary A; Zarante, Ignacio; Ramírez, Natalia; Bernal, Constanza; Gelvez, Nancy; Tamayo, Marta L

    2014-10-01

    The aim of this study was to investigate the characteristics and performance of transient evoked oto-acoustic emission (TEOAE) hearing screening in newborns in Colombia, and to analyze the variables and factors affecting the results. An observational, descriptive and retrospective study with bivariate analysis was performed. The study population consisted of 56,822 newborns evaluated at the private institution, PREGEN. TEOAE testing was carried out as a pediatric hearing screening test from December 2003 to March 2012. The database from PREGEN was reviewed, and the protocol for evaluation included the same screening test performed twice. Demographic characteristics were recorded and the newborn's background was evaluated. Basic statistics for the qualitative and quantitative variables were obtained, and statistical analysis was performed using the chi-square test. Of the 56,822 records examined, 0.28% were classed as abnormal, which corresponds to a prevalence of 1 in 350. In the screened newborns, 0.08% had a major abnormality or other clinical condition diagnosed, and 0.29% reported a family history of hearing loss. A prevalence of 6.7 in 10,000 was obtained for microtia, which is similar to the 6.4 in 10,000 previously reported in Colombia (database of the Latin-American Collaborative Study of Congenital Malformations - ECLAMC). Statistical analysis demonstrated an association between presenting with a major anomaly and a higher frequency of abnormal results on both TEOAE tests. Newborns in Colombia do not currently undergo screening for the early detection of hearing impairment. The results from this study suggest TEOAE screening tests, when performed twice, are able to detect hearing abnormalities in newborns. This highlights the need to improve the long-term evaluation and monitoring of patients in Colombia through diagnostic tests, and to provide tests that are both sensitive and specific. Furthermore, the use of TEOAE screening is justified by the favorable cost-benefit ratio demonstrated in many countries worldwide. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  20. Use of Tests of Statistical Significance and Other Analytic Choices in a School Psychology Journal: Review of Practices and Suggested Alternatives.

    ERIC Educational Resources Information Center

    Snyder, Patricia A.; Thompson, Bruce

    The use of tests of statistical significance was explored, first by reviewing some criticisms of contemporary practice in the use of statistical tests as reflected in a series of articles in the "American Psychologist" and in the appointment of a "Task Force on Statistical Inference" by the American Psychological Association…

  1. A General Class of Test Statistics for Van Valen’s Red Queen Hypothesis

    PubMed Central

    Wiltshire, Jelani; Huffer, Fred W.; Parker, William C.

    2014-01-01

    Van Valen’s Red Queen hypothesis states that within a homogeneous taxonomic group the age is statistically independent of the rate of extinction. The case of the Red Queen hypothesis being addressed here is when the homogeneous taxonomic group is a group of similar species. Since Van Valen’s work, various statistical approaches have been used to address the relationship between taxon age and the rate of extinction. We propose a general class of test statistics that can be used to test for the effect of age on the rate of extinction. These test statistics allow for a varying background rate of extinction and attempt to remove the effects of other covariates when assessing the effect of age on extinction. No model is assumed for the covariate effects. Instead we control for covariate effects by pairing or grouping together similar species. Simulations are used to compare the power of the statistics. We apply the test statistics to data on Foram extinctions and find that age has a positive effect on the rate of extinction. A derivation of the null distribution of one of the test statistics is provided in the supplementary material. PMID:24910489

  3. Effect of structural parameters on burning behavior of polyester fabrics having flame retardancy property

    NASA Astrophysics Data System (ADS)

    Çeven, E. K.; Günaydın, G. K.

    2017-10-01

    The aim of this study is to fill a gap in the literature by investigating the effect of yarn and fabric structural parameters on the burning behavior of polyester fabrics. According to the experimental design, three different fabric types, three different weft densities and two different weave types were selected, and a total of eighteen different polyester drapery fabrics were produced. All statistical procedures were conducted using the SPSS statistical software package. The results of the analysis of variance (ANOVA) tests indicated that there were statistically significant differences (at the 5% significance level) between the mass loss ratios (%) in the weft direction and in the warp direction of the different fabrics, calculated after the flammability test. The Student-Newman-Keuls (SNK) results for mass loss ratios (%) in both the weft and warp directions revealed that the mass loss ratios (%) of fabrics containing Trevira CS type polyester were lower than those of polyester fabrics subjected to washing treatment and flame retardancy treatment.

  4. Data-adaptive test statistics for microarray data.

    PubMed

    Mukherjee, Sach; Roberts, Stephen J; van der Laan, Mark J

    2005-09-01

    An important task in microarray data analysis is the selection of genes that are differentially expressed between different tissue samples, such as healthy and diseased. However, microarray data contain an enormous number of dimensions (genes) and very few samples (arrays), a mismatch which poses fundamental statistical problems for the selection process that have defied easy resolution. In this paper, we present a novel approach to the selection of differentially expressed genes in which test statistics are learned from data using a simple notion of reproducibility in selection results as the learning criterion. Reproducibility, as we define it, can be computed without any knowledge of the 'ground-truth', but takes advantage of certain properties of microarray data to provide an asymptotically valid guide to expected loss under the true data-generating distribution. We are therefore able to indirectly minimize expected loss, and obtain results substantially more robust than conventional methods. We apply our method to simulated and oligonucleotide array data. By request to the corresponding author.

  5. Statistical Method to Overcome Overfitting Issue in Rational Function Models

    NASA Astrophysics Data System (ADS)

    Alizadeh Moghaddam, S. H.; Mokhtarzade, M.; Alizadeh Naeini, A.; Alizadeh Moghaddam, S. A.

    2017-09-01

    Rational function models (RFMs) are known as one of the most appealing models, extensively applied in geometric correction of satellite images and map production. Overfitting is a common issue in the case of terrain-dependent RFMs that degrades the accuracy of RFM-derived geospatial products. This issue, resulting from the high number of RFM parameters, leads to ill-posedness of the RFMs. To tackle this problem, in this study a fast and robust statistical approach is proposed and compared to the Tikhonov regularization (TR) method, a frequently used solution to RFM overfitting. In the proposed method, a statistical significance test is applied to search for the RFM parameters that are resistant to the overfitting issue. The performance of the proposed method was evaluated on two real data sets of Cartosat-1 satellite images. The obtained results demonstrate the efficiency of the proposed method in terms of the achievable level of accuracy. Indeed, this technique shows an improvement of 50-80% over TR.

  6. Quantifying variation in speciation and extinction rates with clade data.

    PubMed

    Paradis, Emmanuel; Tedesco, Pablo A; Hugueny, Bernard

    2013-12-01

    High-level phylogenies are very common in evolutionary analyses, although they are often treated as incomplete data. Here, we provide statistical tools to analyze what we name "clade data," which are the ages of clades together with their numbers of species. We develop a general approach for the statistical modeling of variation in speciation and extinction rates, including temporal variation, unknown variation, and linear and nonlinear modeling. We show how this approach can be generalized to a wide range of situations, including testing the effects of life-history traits and environmental variables on diversification rates. We report the results of an extensive simulation study to assess the performance of some statistical tests presented here as well as of the estimators of speciation and extinction rates. These latter results suggest the possibility of correctly estimating extinction rates in the absence of fossils. An example with data on fish is presented. © 2013 The Author(s). Evolution © 2013 The Society for the Study of Evolution.

  7. Dental Composite Restorations and Neuropsychological Development in Children: Treatment Level Analysis from a Randomized Clinical Trial

    PubMed Central

    Maserejian, Nancy N.; Trachtenberg, Felicia L.; Hauser, Russ; McKinlay, Sonja; Shrader, Peter; Bellinger, David C.

    2012-01-01

    Background Resin-based dental restorations may intra-orally release their components and bisphenol A. Gestational bisphenol A exposure has been associated with poorer executive functioning in children. Objectives To examine whether exposure to resin-based composite restorations is associated with neuropsychological development in children. Methods Secondary analysis of treatment level data from the New England Children’s Amalgam Trial, a 2-group randomized safety trial conducted from 1997–2006. Children (N=534) aged 6–10 y with >2 posterior tooth caries were randomized to treatment with amalgam or resin-based composites (bisphenol-A-diglycidyl-dimethacrylate-composite for permanent teeth; urethane dimethacrylate-based polyacid-modified compomer for primary teeth). Neuropsychological function at 4- and 5-year follow-up (N=444) was measured by a battery of tests of executive function, intelligence, memory, visual-spatial skills, verbal fluency, and problem-solving. Multivariable generalized linear regression models were used to examine the association between composite exposure levels and changes in neuropsychological test scores from baseline to follow-up. For comparison, data on children randomized to amalgam treatment were similarly analyzed. Results With greater exposure to either dental composite material, results were generally consistent in the direction of slightly poorer changes in tests of intelligence, achievement or memory, but there were no statistically significant associations. For the four primary measures of executive function, scores were slightly worse with greater total composite exposure, but statistically significant only for the test of Letter Fluency (10-surface-years β= −0.8, SE=0.4, P=0.035), and the subtest of color naming (β= −1.5, SE=0.5, P=0.004) in the Stroop Color-Word Interference Test. Multivariate analysis of variance confirmed that the negative associations between composite level and executive function were not statistically significant (MANOVA P=0.18). Results for greater amalgam exposure were mostly nonsignificant in the opposite direction of slightly improved scores over follow-up. Conclusions Dental composite restorations had statistically insignificant associations of small magnitude with impairments in neuropsychological test change scores over 4- or 5-years of follow-up in this trial. PMID:22906860

  8. An evaluation of grease type ball bearing lubricants operating in various environments

    NASA Technical Reports Server (NTRS)

    Mcmurtrey, E. L.

    1984-01-01

    Because many future spacecraft or space stations will require mechanisms to operate for long periods of time in environments which are adverse to most bearing lubricants, a series of tests has been completed to evaluate 38 grease-type lubricants in R-4 size bearings in five different environments for a 1-year period. Four repetitions of each test were made to provide statistical samples. These tests were also used to select four lubricants for 5-year tests in selected environments, with five repetitions of each test for statistical samples. In this program, 172 test sets have been completed. Three 5-year tests have been completed: (1) continuous operation in vacuum at ambient temperature, (2) start-stop operation in vacuum at ambient temperature, and (3) continuous operation in vacuum at 93.3 C. In both the 1-year and 5-year tests, the best results in all environments have been obtained with a high-viscosity-index perfluoroalkylpolyether (PFPE) grease.

  9. The Application of Probabilistic Methods to the Mistuning Problem

    NASA Technical Reports Server (NTRS)

    Griffin, J. H.; Rossi, M. R.; Feiner, D. M.

    2004-01-01

    FMM is a reduced order model for efficiently calculating the forced response of a mistuned bladed disk. FMM ID is a companion program which determines the mistuning in a particular rotor. Together, these methods provide a way to acquire data on the mistuning in a population of bladed disks, and then simulate the forced response of the fleet. This process is tested experimentally, and the simulated results are compared with laboratory measurements of a fleet of test rotors. The method is shown to work quite well. It is found that accuracy of the results depends on two factors: the quality of the statistical model used to characterize mistuning, and how sensitive the system is to errors in the statistical modeling.

  10. 75 FR 4323 - Additional Quantitative Fit-testing Protocols for the Respiratory Protection Standard

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-01-27

    ... respirators (500 and 1000 for protocols 1 and 2, respectively). However, OSHA could not evaluate the results... the values of these descriptive statistics for revised PortaCount® QNFT protocols 1 (at RFFs of 100 and 500) and 2 (at RFFs of 200 and 1000). Table 2--Descriptive Statistics for RFFs of 100 and 200...

  11. Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English

    ERIC Educational Resources Information Center

    Kastner, Itamar; Adriaans, Frans

    2018-01-01

    Statistical learning is often taken to lie at the heart of many cognitive tasks, including the acquisition of language. One particular task in which probabilistic models have achieved considerable success is the segmentation of speech into words. However, these models have mostly been tested against English data, and as a result little is known…

  12. OSO 8 observational limits to the acoustic coronal heating mechanism

    NASA Technical Reports Server (NTRS)

    Bruner, E. C., Jr.

    1981-01-01

    An improved analysis of time-resolved line profiles of the C IV resonance line at 1548 A has been used to test the acoustic wave hypothesis of solar coronal heating. It is shown that the observed motions and brightness fluctuations are consistent with the existence of acoustic waves. Specific account is taken of the effect of photon statistics on the observed velocities, and a test is devised to determine whether the motions represent propagating or evanescent waves. It is found that on the average about as much energy is carried upward as downward, such that the net acoustic flux density is statistically consistent with zero. The statistical uncertainty in this null result is three orders of magnitude lower than the flux level needed to heat the corona.

  13. Testing for nonlinearity in time series: The method of surrogate data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Theiler, J.; Galdrikian, B.; Longtin, A.

    1991-01-01

    We describe a statistical approach for identifying nonlinearity in time series; in particular, we want to avoid claims of chaos when simpler models (such as linearly correlated noise) can explain the data. The method requires a careful statement of the null hypothesis which characterizes a candidate linear process, the generation of an ensemble of "surrogate" data sets which are similar to the original time series but consistent with the null hypothesis, and the computation of a discriminating statistic for the original and for each of the surrogate data sets. The idea is to test the original time series against the null hypothesis by checking whether the discriminating statistic computed for the original time series differs significantly from the statistics computed for each of the surrogate sets. We present algorithms for generating surrogate data under various null hypotheses, and we show the results of numerical experiments on artificial data using correlation dimension, Lyapunov exponent, and forecasting error as discriminating statistics. Finally, we consider a number of experimental time series -- including sunspots, electroencephalogram (EEG) signals, and fluid convection -- and evaluate the statistical significance of the evidence for nonlinear structure in each case. 56 refs., 8 figs.
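
    A minimal sketch of the procedure under the simplest null hypothesis (linearly correlated Gaussian noise): surrogates are generated by Fourier phase randomization, which preserves the power spectrum, and the statistic for the original series is compared with the surrogate ensemble. The discriminating statistic below (time-reversal asymmetry) is one common choice, not necessarily the paper's:

```python
# Surrogate-data test: phase-randomized surrogates share the original series'
# power spectrum but destroy any nonlinear structure.
import numpy as np

def phase_randomized_surrogate(x, rng):
    spec = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(spec))
    phases[0] = 0                      # keep the zero-frequency term real
    phases[-1] = 0                     # keep the Nyquist term real (even length)
    return np.fft.irfft(np.abs(spec) * np.exp(1j * phases), n=len(x))

def stat(x):                           # illustrative statistic: time asymmetry
    return np.mean((x[1:] - x[:-1]) ** 3)

rng = np.random.default_rng(2)
x = np.cumsum(rng.normal(size=1024))   # example series
surr = [stat(phase_randomized_surrogate(x, rng)) for _ in range(200)]
print("original:", stat(x), "surrogate 95% range:", np.percentile(surr, [2.5, 97.5]))
```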

  14. Validation of 2 commercial Neospora caninum antibody enzyme linked immunosorbent assays

    PubMed Central

    Wu, John T.Y.; Dreger, Sally; Chow, Eva Y.W.; Bowlby, Evelyn E.

    2002-01-01

    Abstract This is a validation study of 2 commercially available enzyme linked immunosorbent assays (ELISA) for the detection of antibodies against Neospora caninum in bovine serum. The results of the reference sera (n = 30) and field sera from an infected beef herd (n = 150) were tested by both ELISAs and the results were compared statistically. When the immunoblotting results of the reference bovine sera were compared to the ELISA results, the same identity score (96.67%) and kappa values (K) (0.93) were obtained for both ELISAs. The sensitivity and specificity values for the IDEXX test were 100% and 93.33% respectively. For the Biovet test 93.33% and 100% were obtained. The corresponding positive (PV+) and negative predictive (PV−) values for the 2 assays were 93.75% and 100% (IDEXX), and 100% and 93.75% (Biovet). In the 2nd study, competitive inhibition ELISA (c-ELISA) results on bovine sera from an infected herd were compared to the 2 sets of ELISA results. The identity scores of the 2 ELISAs were 98% (IDEXX) and 97.33% (Biovet). The K values calculated were 0.96 (IDEXX) and 0.95 (Biovet). For the IDEXX test the sensitivity and specificity were 97.56% and 98.53%, whereas for the Biovet assay 95.12% and 100% were recorded, respectively. The corresponding PV+ and PV− values were 98.77% and 97.1% (IDEXX), and 100% and 94.44% (Biovet). Our validation results showed that the 2 ELISAs worked equally well and there was no statistically significant difference between the performance of the 2 tests. Both tests showed high reproducibility, repeatability and substantial agreement with results from 2 other laboratories. A quality assurance based on the requirement of the ISO/IEC 17025 standards has been adopted throughout this project for test validation procedures. PMID:12418782
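
    For reference, the agreement statistics reported above (sensitivity, specificity, predictive values, kappa) all follow from a 2×2 table of test versus reference results; the counts below are illustrative, not the study's data:

```python
# Agreement statistics from a 2x2 table: rows = ELISA result,
# columns = reference method result. Counts are hypothetical.
tp, fp, fn, tn = 28, 0, 2, 120

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)                  # positive predictive value
npv = tn / (tn + fn)                  # negative predictive value

n = tp + fp + fn + tn
po = (tp + tn) / n                    # observed agreement
pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2  # chance agreement
kappa = (po - pe) / (1 - pe)
print(sensitivity, specificity, ppv, npv, round(kappa, 3))
```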

  15. A Note on Comparing the Power of Test Statistics at Low Significance Levels.

    PubMed

    Morris, Nathan; Elston, Robert

    2011-01-01

    It is an obvious fact that the power of a test statistic is dependent upon the significance (alpha) level at which the test is performed. It is perhaps a less obvious fact that the relative performance of two statistics in terms of power is also a function of the alpha level. Through numerous personal discussions, we have noted that even some competent statisticians have the mistaken intuition that relative power comparisons at traditional levels such as α = 0.05 will be roughly similar to relative power comparisons at very low levels, such as the level α = 5 × 10⁻⁸, which is commonly used in genome-wide association studies. In this brief note, we demonstrate that this notion is in fact quite wrong, especially with respect to comparing tests with differing degrees of freedom. In fact, at very low alpha levels the cost of additional degrees of freedom is often comparatively low. Thus we recommend that statisticians exercise caution when interpreting the results of power comparison studies which use alpha levels that will not be used in practice.
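
    The point is easy to check numerically. The sketch below compares the power of 1-df and 2-df chi-square tests at α = 0.05 and α = 5 × 10⁻⁸ using the noncentral chi-square distribution; the noncentrality parameter is arbitrary:

```python
# Power of chi-square tests with differing degrees of freedom at two alpha
# levels; the extra-df penalty shrinks as alpha becomes very small.
from scipy.stats import chi2, ncx2

lam = 10.0                                   # noncentrality (illustrative)
for alpha in (0.05, 5e-8):
    for df in (1, 2):
        crit = chi2.isf(alpha, df)           # critical value at this alpha
        power = ncx2.sf(crit, df, lam)       # P(reject | noncentrality lam)
        print(f"alpha={alpha:g}, df={df}: power={power:.3f}")
```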

  16. NASA DOE POD NDE Capabilities Data Book

    NASA Technical Reports Server (NTRS)

    Generazio, Edward R.

    2015-01-01

    This data book contains the Directed Design of Experiments for Validating Probability of Detection (POD) Capability of NDE Systems (DOEPOD) analyses of the nondestructive inspection data presented in the NTIAC, Nondestructive Evaluation (NDE) Capabilities Data Book, 3rd ed., NTIAC DB-97-02. DOEPOD is designed as a decision support system to validate that an inspection system, its personnel, and its protocol demonstrate 0.90 POD with 95% confidence at the critical flaw size, a90/95. The test methodology used in DOEPOD is based on the field of statistical sequential analysis founded by Abraham Wald. Sequential analysis is a method of statistical inference whose characteristic feature is that the number of observations required by the procedure is not determined in advance of the experiment. The decision to terminate the experiment depends, at each stage, on the results of the observations previously made. A merit of the sequential method, as applied to testing statistical hypotheses, is that test procedures can be constructed which require, on average, a substantially smaller number of observations than equally reliable test procedures based on a predetermined number of observations.
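
    For illustration, Wald's sequential probability ratio test, the foundation on which DOEPOD builds, can be sketched for a detection probability as below; the hypotheses and error rates are illustrative and are not the DOEPOD settings:

```python
# Wald SPRT for a Bernoulli detection probability: observations accumulate
# until the log-likelihood ratio crosses one of the two decision boundaries.
import math

p0, p1 = 0.90, 0.70           # H0: POD = 0.90 vs H1: POD = 0.70 (illustrative)
alpha, beta = 0.05, 0.05
upper = math.log((1 - beta) / alpha)   # cross above: accept H1 (reject H0)
lower = math.log(beta / (1 - alpha))   # cross below: accept H0

def sprt(outcomes):
    """outcomes: iterable of 1 (flaw detected) / 0 (flaw missed)."""
    llr = 0.0
    for n, x in enumerate(outcomes, 1):
        llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
        if llr >= upper:
            return n, "reject H0: POD below requirement"
        if llr <= lower:
            return n, "accept H0: POD meets requirement"
    return len(outcomes), "continue testing"

print(sprt([1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]))
```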

  17. Detecting trends in raptor counts: power and type I error rates of various statistical tests

    USGS Publications Warehouse

    Hatfield, J.S.; Gould, W.R.; Hoover, B.A.; Fuller, M.R.; Lindquist, E.L.

    1996-01-01

    We conducted simulations that estimated power and type I error rates of statistical tests for detecting trends in raptor population count data collected from a single monitoring site. Results of the simulations were used to help analyze count data of bald eagles (Haliaeetus leucocephalus) from 7 national forests in Michigan, Minnesota, and Wisconsin during 1980-1989. Seven statistical tests were evaluated, including simple linear regression on the log scale and linear regression with a permutation test. Using 1,000 replications each, we simulated n = 10 and n = 50 years of count data and trends ranging from -5 to 5% change/year. We evaluated the tests at 3 critical levels (alpha = 0.01, 0.05, and 0.10) for both upper- and lower-tailed tests. Exponential count data were simulated by adding sampling error with a coefficient of variation of 40% from either a log-normal or autocorrelated log-normal distribution. Not surprisingly, tests performed with 50 years of data were much more powerful than tests with 10 years of data. Positive autocorrelation inflated alpha-levels upward from their nominal levels, making the tests less conservative and more likely to reject the null hypothesis of no trend. Of the tests studied, Cox and Stuart's test and Pollard's test clearly had lower power than the others. Surprisingly, the linear regression t-test, Collins' linear regression permutation test, and the nonparametric Lehmann's and Mann's tests all had similar power in our simulations. Analyses of the count data suggested that bald eagles had increasing trends on at least 2 of the 7 national forests during 1980-1989.
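
    One cell of such a simulation can be re-created roughly as below: log-normal sampling error with a 40% coefficient of variation is added to an exponential trend, and an upper-tailed regression t-test on the log counts is applied. Details of the published simulations are simplified:

```python
# Monte Carlo power estimate for detecting a 5%/year trend with n = 10 years
# of counts and 40% CV log-normal sampling error (one simulation cell).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_years, trend, cv, alpha, n_rep = 10, 0.05, 0.40, 0.05, 1000
sigma = np.sqrt(np.log(1 + cv**2))      # log-normal sigma giving a 40% CV
years = np.arange(n_years)

rejections = 0
for _ in range(n_rep):
    counts = 100 * (1 + trend) ** years * rng.lognormal(0, sigma, n_years)
    res = stats.linregress(years, np.log(counts))
    if res.slope > 0 and res.pvalue / 2 < alpha:   # upper-tailed t-test
        rejections += 1
print(f"estimated power: {rejections / n_rep:.2f}")
```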

  18. Ecological Momentary Assessments and Automated Time Series Analysis to Promote Tailored Health Care: A Proof-of-Principle Study

    PubMed Central

    Emerencia, Ando C; Bos, Elisabeth H; Rosmalen, Judith GM; Riese, Harriëtte; Aiello, Marco; Sytema, Sjoerd; de Jonge, Peter

    2015-01-01

    Background Health promotion can be tailored by combining ecological momentary assessments (EMA) with time series analysis. This combined method allows for studying the temporal order of dynamic relationships among variables, which may provide concrete indications for intervention. However, application of this method in health care practice is hampered because analyses are conducted manually and advanced statistical expertise is required. Objective This study aims to show how this limitation can be overcome by introducing automated vector autoregressive modeling (VAR) of EMA data and to evaluate its feasibility through comparisons with results of previously published manual analyses. Methods We developed a Web-based open source application, called AutoVAR, which automates time series analyses of EMA data and provides output that is intended to be interpretable by nonexperts. The statistical technique we used was VAR. AutoVAR tests and evaluates all possible VAR models within a given combinatorial search space and summarizes their results, thereby replacing the researcher’s tasks of conducting the analysis, making an informed selection of models, and choosing the best model. We compared the output of AutoVAR to the output of a previously published manual analysis (n=4). Results An illustrative example consisting of 4 analyses was provided. Compared to the manual output, the AutoVAR output presents similar model characteristics and statistical results in terms of the Akaike information criterion, the Bayesian information criterion, and the test statistic of the Granger causality test. Conclusions Results suggest that automated analysis and interpretation of time series is feasible. Compared to a manual procedure, the automated procedure is more robust and can save days of time. These findings may pave the way for using time series analysis for health promotion on a larger scale. AutoVAR was evaluated using the results of a previously conducted manual analysis. Analysis of additional datasets is needed in order to validate and refine the application for general use. PMID:26254160
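
    The kind of automated workflow AutoVAR wraps can be sketched with statsmodels (this is not AutoVAR itself): fit candidate lag orders, select a model by information criterion, and run a Granger causality test. The two EMA-like variables and their relationship are simulated:

```python
# VAR model selection by AIC followed by a Granger causality test,
# using simulated data in which "mood" leads "activity" by one step.
import numpy as np
import pandas as pd
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(4)
n = 200
mood = rng.normal(size=n)
activity = np.r_[0.0, 0.6 * mood[:-1]] + rng.normal(size=n)
data = pd.DataFrame({"mood": mood, "activity": activity})

model = VAR(data)
results = model.fit(maxlags=5, ic="aic")   # lag order chosen by AIC
print("AIC:", results.aic, "BIC:", results.bic)

gc = results.test_causality("activity", ["mood"], kind="f")
print(gc.summary())                        # does mood Granger-cause activity?
```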

  19. Finding differentially expressed genes in high dimensional data: Rank based test statistic via a distance measure.

    PubMed

    Mathur, Sunil; Sadana, Ajit

    2015-12-01

    We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume the distribution of the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, the Wilcoxon signed-rank test, and significance analysis of microarrays (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set. © The Author(s) 2011.

  20. 40 CFR 1054.315 - How do I know when my engine family fails the production-line testing requirements?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... and CO emissions: Ci = max[0, Ci−1 + Xi − (STD + 0.25 × σ)], where: Ci = the current CumSum statistic..., Xi = the current emission test result for an individual engine, and STD = emission standard (or family...
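
    The recursion quoted above is straightforward to compute; the sketch below transcribes it directly, with hypothetical values for the standard and σ (both elided in the excerpt):

```python
# CumSum statistic from the regulation:
# C_i = max(0, C_{i-1} + X_i - (STD + 0.25 * sigma)).
STD, sigma = 10.0, 1.2        # hypothetical emission standard and sigma

def cumsum_statistic(results):
    c = 0.0
    for x in results:         # x: emission test result for one engine
        c = max(0.0, c + x - (STD + 0.25 * sigma))
        yield c

for c in cumsum_statistic([9.8, 10.5, 11.2, 9.9]):
    print(round(c, 3))
```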

  1. Testing variance components by two jackknife methods

    USDA-ARS?s Scientific Manuscript database

    The jackknife method, a resampling technique, has been widely used for statistical tests for years. The pseudo-value-based jackknife method (defined as the pseudo jackknife method) is commonly used to reduce the bias of an estimate; however, sometimes it can result in large variation for an estimate a...

  2. The Chimera of Validity

    ERIC Educational Resources Information Center

    Baker, Eva L.

    2013-01-01

    Background/Context: Education policy over the past 40 years has focused on the importance of accountability in school improvement. Although much of the scholarly discourse around testing and assessment is technical and statistical, understanding of validity by a non-specialist audience is essential as long as test results drive our educational…

  3. Signal Processing Methods for Liquid Rocket Engine Combustion Stability Assessments

    NASA Technical Reports Server (NTRS)

    Kenny, R. Jeremy; Lee, Erik; Hulka, James R.; Casiano, Matthew

    2011-01-01

    The J2X Gas Generator engine design specifications include dynamic, spontaneous, and broadband combustion stability requirements. These requirements are verified empirically based on high-frequency chamber pressure measurements and analyses. Dynamic stability is determined from the dynamic pressure response to an artificial perturbation of the combustion chamber pressure (bomb testing), and spontaneous and broadband stability are determined from the dynamic pressure responses during steady operation starting at specified power levels. J2X Workhorse Gas Generator testing included bomb tests with multiple hardware configurations and operating conditions, including a configuration used explicitly for the engine verification test series. This work covers signal processing techniques developed at Marshall Space Flight Center (MSFC) to help assess engine design stability requirements. Dynamic stability assessments were performed following both the CPIA 655 guidelines and a statistical approach developed in-house at MSFC. The statistical approach was developed to better verify when the dynamic pressure amplitudes corresponding to a particular frequency returned to pre-bomb characteristics. This was accomplished by first determining the statistical characteristics of the pre-bomb dynamic levels. The pre-bomb statistical characterization provided 95% coverage bounds; these bounds were used as a quantitative measure to determine when the post-bomb signal returned to pre-bomb conditions. The time for post-bomb levels to acceptably return to pre-bomb levels was compared to the dominant frequency-dependent time recommended by CPIA 655. Results for multiple test configurations, including stable and unstable configurations, were reviewed. Spontaneous stability was assessed using two processes: (1) characterization of the ratio of the peak response amplitudes to the excited chamber acoustic mode amplitudes, and (2) characterization of the variability of the peak response's frequency over the test duration. This characterization process assists in evaluating the discreteness of a signal as well as the stability of the chamber response. Broadband stability was assessed using a running root-mean-square evaluation. These techniques were also employed, in a comparative analysis, on available Fastrac data, and those results are presented here.

  4. An Analytic Solution to the Computation of Power and Sample Size for Genetic Association Studies under a Pleiotropic Mode of Inheritance.

    PubMed

    Gordon, Derek; Londono, Douglas; Patel, Payal; Kim, Wonkuk; Finch, Stephen J; Heiman, Gary A

    2016-01-01

    Our motivation here is to calculate the power of 3 statistical tests used when there are genetic traits that operate under a pleiotropic mode of inheritance and when qualitative phenotypes are defined by use of thresholds for the multiple quantitative phenotypes. Specifically, we formulate a multivariate function that provides the probability that an individual has a vector of specific quantitative trait values conditional on having a risk locus genotype, and we apply thresholds to define qualitative phenotypes (affected, unaffected) and compute penetrances and conditional genotype frequencies based on the multivariate function. We extend the analytic power and minimum-sample-size-necessary (MSSN) formulas for 2 categorical data-based tests (genotype, linear trend test [LTT]) of genetic association to the pleiotropic model. We further compare the MSSN of the genotype test and the LTT with that of a multivariate ANOVA (Pillai). We approximate the MSSN for statistics by linear models using a factorial design and ANOVA. With ANOVA decomposition, we determine which factors most significantly change the power/MSSN for all statistics. Finally, we determine which test statistics have the smallest MSSN. In this work, MSSN calculations are for 2 traits (bivariate distributions) only (for illustrative purposes). We note that the calculations may be extended to address any number of traits. Our key findings are that the genotype test usually has lower MSSN requirements than the LTT. More inclusive thresholds (top/bottom 25% vs. top/bottom 10%) have higher sample size requirements. The Pillai test has a much larger MSSN than both the genotype test and the LTT, as a result of sample selection. With these formulas, researchers can specify how many subjects they must collect to localize genes for pleiotropic phenotypes. © 2017 S. Karger AG, Basel.

  5. Linearised and non-linearised isotherm models optimization analysis by error functions and statistical means

    PubMed Central

    2014-01-01

    In adsorption studies, describing the sorption process and identifying the best-fitting isotherm model are key steps in testing the theoretical hypothesis. Hence, numerous statistical analyses have been used to compare experimental equilibrium adsorption values with predicted equilibrium values. In the present study, several statistical error analyses were carried out to evaluate adsorption isotherm model fitness, including the Pearson correlation, the coefficient of determination and the chi-square test. An ANOVA test was carried out to evaluate the significance of the various error functions, and the coefficient of dispersion was evaluated for linearised and non-linearised models. The adsorption of phenol onto natural soil (local name: Kalathur soil) was carried out in batch mode at 30 ± 2 °C. To obtain a holistic view of the analysis, the isotherm parameters were estimated and compared between linear and non-linear isotherm models. The results revealed which of the above-mentioned error and statistical functions best determined the best-fitting isotherm. PMID:25018878

  6. Effect of Internet-Based Cognitive Apprenticeship Model (i-CAM) on Statistics Learning among Postgraduate Students.

    PubMed

    Saadati, Farzaneh; Ahmad Tarmizi, Rohani; Mohd Ayub, Ahmad Fauzi; Abu Bakar, Kamariah

    2015-01-01

    Because students' ability to use statistics, which is mathematical in nature, is one of the concerns of educators, embedding within an e-learning system the pedagogical characteristics of learning is 'value added' because it facilitates the conventional method of learning mathematics. Many researchers emphasize the effectiveness of cognitive apprenticeship in learning and problem solving in the workplace. In a cognitive apprenticeship learning model, skills are learned within a community of practitioners through observation of modelling and then practice plus coaching. This study utilized an internet-based Cognitive Apprenticeship Model (i-CAM) in three phases and evaluated its effectiveness for improving statistics problem-solving performance among postgraduate students. The results showed that, when compared to the conventional mathematics learning model, the i-CAM could significantly promote students' problem-solving performance at the end of each phase. In addition, the differences in students' test scores remained statistically significant after controlling for the pre-test scores. The findings conveyed in this paper confirm the considerable value of i-CAM in the improvement of statistics learning for non-specialized postgraduate students.

  7. Characterizing the Joint Effect of Diverse Test-Statistic Correlation Structures and Effect Size on False Discovery Rates in a Multiple-Comparison Study of Many Outcome Measures

    NASA Technical Reports Server (NTRS)

    Feiveson, Alan H.; Ploutz-Snyder, Robert; Fiedler, James

    2011-01-01

    In their 2009 Annals of Statistics paper, Gavrilov, Benjamini, and Sarkar report the results of a simulation assessing the robustness of their adaptive step-down procedure (GBS) for controlling the false discovery rate (FDR) when normally distributed test statistics are serially correlated. In this study we extend the investigation to the case of multiple comparisons involving correlated non-central t-statistics, in particular when several treatments or time periods are being compared to a control in a repeated-measures design with many dependent outcome measures. In addition, we consider several dependence structures other than serial correlation and illustrate how the FDR depends on the interaction between effect size and the type of correlation structure as indexed by Foerstner's distance metric from the identity. The relationship between R, the correlation matrix of the original dependent variables, and the correlation matrix of the associated t-statistics is also studied. In general, the latter depends not only on R, but also on sample size and the signed effect sizes for the multiple comparisons.

  8. Semiquantitative determination of mesophilic, aerobic microorganisms in cocoa products using the Soleris NF-TVC method.

    PubMed

    Montei, Carolyn; McDougal, Susan; Mozola, Mark; Rice, Jennifer

    2014-01-01

    The Soleris Non-fermenting Total Viable Count method was previously validated for a wide variety of food products, including cocoa powder. A matrix extension study was conducted to validate the method for use with cocoa butter and cocoa liquor. Test samples included naturally contaminated cocoa liquor and cocoa butter inoculated with natural microbial flora derived from cocoa liquor. A probability of detection statistical model was used to compare Soleris results at multiple test thresholds (dilutions) with aerobic plate counts determined using the AOAC Official Method 966.23 dilution plating method. Results of the two methods were not statistically different at any dilution level in any of the three trials conducted. The Soleris method offers the advantage of results within 24 h, compared to the 48 h required by standard dilution plating methods.

  9. Interrelationship of mechanical and corrosion-mechanical characteristics of type 12KhN4MF steel

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Voronin, V.P.; Goncharov, A.F.; Maslov, V.A.

    1985-11-01

    Investigations presented include a comparative evaluation of the corrosion-mechanical characteristics of specimens of high-strength chrome-nickel-molybdenum steel, taking into consideration the different methods of melting of the original metal. A comparison of the corrosion-mechanical test results with the results of acceptance tests is presented. A study of the fracture surfaces and the specimen material using fractographic, macroscopic, and microscopic analyses is given. The systematization of the corrosion-mechanical test results using methods of mathematical statistics is presented.

  10. Conditional statistical inference with multistage testing designs.

    PubMed

    Zwitser, Robert J; Maris, Gunter

    2015-03-01

    In this paper it is demonstrated how statistical inference from multistage test designs can be made based on the conditional likelihood. Special attention is given to parameter estimation, as well as the evaluation of model fit. Two reasons are provided why the fit of simple measurement models is expected to be better in adaptive designs, compared to linear designs: more parameters are available for the same number of observations; and undesirable response behavior, like slipping and guessing, might be avoided owing to a better match between item difficulty and examinee proficiency. The results are illustrated with simulated data, as well as with real data.

  11. Modified Distribution-Free Goodness-of-Fit Test Statistic.

    PubMed

    Chun, So Yeon; Browne, Michael W; Shapiro, Alexander

    2018-03-01

    Covariance structure analysis and its structural equation modeling extensions have become one of the most widely used methodologies in social sciences such as psychology, education, and economics. An important issue in such analysis is to assess the goodness of fit of a model under analysis. One of the most popular test statistics used in covariance structure analysis is the asymptotically distribution-free (ADF) test statistic introduced by Browne (Br J Math Stat Psychol 37:62-83, 1984). The ADF statistic can be used to test models without any specific distribution assumption (e.g., multivariate normal distribution) of the observed data. Despite its advantage, it has been shown in various empirical studies that unless sample sizes are extremely large, this ADF statistic could perform very poorly in practice. In this paper, we provide a theoretical explanation for this phenomenon and further propose a modified test statistic that improves the performance in samples of realistic size. The proposed statistic deals with the possible ill-conditioning of the involved large-scale covariance matrices.

  12. Analysis of Acoustic Emission Parameters from Corrosion of AST Bottom Plate in Field Testing

    NASA Astrophysics Data System (ADS)

    Jomdecha, C.; Jirarungsatian, C.; Suwansin, W.

    Field testing of aboveground storage tanks (AST) to monitor corrosion of the bottom plate is presented in this chapter. AE testing data from ten ASTs of different sizes, materials, and products were used to monitor the bottom plate condition. AE sensors of 30 and 150 kHz, in arrays of up to 24 channels including guard sensors, were used to monitor corrosion activity. Acoustic emission (AE) parameters were analyzed to explore the AE parameter patterns of occurring corrosion, compared with laboratory results. Amplitude, count, duration, and energy were the main parameters analyzed. A pattern recognition technique with statistical analysis was implemented to eliminate electrical and environmental noise. The results showed specific AE patterns of corrosion activity consistent with the empirical results. In addition, a plane location algorithm was used to locate the significant AE events from corrosion. Both the parameter patterns and the AE event locations can be used to interpret and locate corrosion activity. Finally, a basic statistical grading technique was used to evaluate the bottom plate condition of the AST.

  13. Is It Necessary to Make Anchor Tests Mini-Versions of the Tests Being Equated or Can Some Restrictions Be Relaxed?

    ERIC Educational Resources Information Center

    Sinharay, Sandip; Holland, Paul W.

    2007-01-01

    It is a widely held belief that anchor tests should be miniature versions (i.e., "minitests"), with respect to content and statistical characteristics, of the tests being equated. This article examines the foundations for this belief regarding statistical characteristics. It examines the requirement of statistical representativeness of…

  14. Which statistics should tropical biologists learn?

    PubMed

    Loaiza Velásquez, Natalia; González Lutz, María Isabel; Monge-Nájera, Julián

    2011-09-01

    Tropical biologists study the richest and most endangered biodiversity in the planet, and in these times of climate change and mega-extinctions, the need for efficient, good quality research is more pressing than in the past. However, the statistical component in research published by tropical authors sometimes suffers from poor quality in data collection; mediocre or bad experimental design and a rigid and outdated view of data analysis. To suggest improvements in their statistical education, we listed all the statistical tests and other quantitative analyses used in two leading tropical journals, the Revista de Biología Tropical and Biotropica, during a year. The 12 most frequent tests in the articles were: Analysis of Variance (ANOVA), Chi-Square Test, Student's T Test, Linear Regression, Pearson's Correlation Coefficient, Mann-Whitney U Test, Kruskal-Wallis Test, Shannon's Diversity Index, Tukey's Test, Cluster Analysis, Spearman's Rank Correlation Test and Principal Component Analysis. We conclude that statistical education for tropical biologists must abandon the old syllabus based on the mathematical side of statistics and concentrate on the correct selection of these and other procedures and tests, on their biological interpretation and on the use of reliable and friendly freeware. We think that their time will be better spent understanding and protecting tropical ecosystems than trying to learn the mathematical foundations of statistics: in most cases, a well designed one-semester course should be enough for their basic requirements.

  15. Statistical complexity measure of pseudorandom bit generators

    NASA Astrophysics Data System (ADS)

    González, C. M.; Larrondo, H. A.; Rosso, O. A.

    2005-08-01

    Pseudorandom number generators (PRNGs) are extensively used in Monte Carlo simulations, gambling machines, and cryptography as substitutes for ideal random number generators (RNGs). Each application imposes different statistical requirements on PRNGs. As L’Ecuyer clearly states, “the main goal for Monte Carlo methods is to reproduce the statistical properties on which these methods are based, whereas for gambling machines and cryptology, observing the sequence of output values for some time should provide no practical advantage for predicting the forthcoming numbers better than by just guessing at random”. In accordance with these different applications, several statistical test suites have been developed to analyze the sequences generated by PRNGs. In a recent paper, a new statistical complexity measure [Phys. Lett. A 311 (2003) 126] was defined. Here we propose this measure as a randomness quantifier for PRNGs. The test is applied to three well-known and widely tested PRNGs from the literature, all based on mathematical algorithms. A further PRNG, based on the 3D Lorenz chaotic dynamical system, is also analyzed; PRNGs based on chaos may be considered models for physical noise sources, and important new results on them have recently been reported. All the design steps of this PRNG are described, and each stage increases the PRNG's randomness using a different strategy. It is shown that the MPR statistical complexity measure is capable of quantifying this randomness improvement. The PRNG based on the chaotic 3D Lorenz dynamical system is also evaluated with traditional digital signal processing tools for comparison.
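
    The MPR measure is the product of a normalized Shannon entropy and a Jensen-Shannon disequilibrium. The sketch below is a generic implementation of that product over Bandt-Pompe ordinal-pattern probabilities (a common, but not the only, choice of underlying distribution); it is an illustrative reconstruction, not the authors' code. For an ideal RNG the normalized entropy approaches 1 and the disequilibrium approaches 0, so the complexity should be near 0.

```python
import math
import random
from itertools import permutations

def ordinal_probs(series, d=4):
    """Probabilities of Bandt-Pompe ordinal patterns of embedding dimension d."""
    counts = {p: 0 for p in permutations(range(d))}
    for i in range(len(series) - d + 1):
        window = series[i:i + d]
        counts[tuple(sorted(range(d), key=lambda k: window[k]))] += 1
    total = sum(counts.values())
    return [c / total for c in counts.values()]

def shannon(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def mpr_complexity(p):
    """C = H_norm * Q_J: normalized entropy times Jensen-Shannon disequilibrium."""
    n = len(p)
    h_norm = shannon(p) / math.log(n)
    u = [1.0 / n] * n                                # uniform reference distribution
    m = [(pi + ui) / 2 for pi, ui in zip(p, u)]
    js = shannon(m) - 0.5 * shannon(p) - 0.5 * shannon(u)
    # Normalization constant so that the disequilibrium Q_J lies in [0, 1]
    q0 = -2.0 / (((n + 1) / n) * math.log(n + 1) - 2 * math.log(2 * n) + math.log(n))
    return h_norm * q0 * js

random.seed(1)
noise = [random.random() for _ in range(20000)]
print(f"C_MPR of uniform noise: {mpr_complexity(ordinal_probs(noise)):.4f}")  # near 0
```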

  16. Analysis of the Level of Dysphagia, Anxiety, and Nutritional Status Before and After Speech Therapy in Patients with Stroke

    PubMed Central

    Drozdz, Daniela; Mancopes, Renata; Silva, Ana Maria Toniolo; Reppold, Caroline

    2014-01-01

    Introduction: Evidence-based rehabilitation in oropharyngeal dysphagia implies a demonstrable relationship between interventions and their results. Objective: To analyze the level of dysphagia, oral intake, anxiety level, and nutritional status of patients diagnosed with stroke, before and after speech therapy. Method: Clinical assessment of dysphagia using part of the Protocol of Risk Assessment for Dysphagia (PARD), together with the Functional Oral Intake Scale (FOIS) for dysphagia in stroke patients, the Beck Anxiety Inventory (BAI), and the Mini Nutritional Assessment (MNA®). The sample consisted of 12 patients, with a mean age of 64.6 years, with medical diagnoses of hemorrhagic or ischemic stroke and without cognitive disorders. All tests were applied before and after speech therapy (15 sessions). Statistical analysis was performed using the chi-square test or Fisher's exact test, McNemar's test, Bowker's symmetry test, and Wilcoxon's test. Results: In the pre-speech-therapy assessments, 33.3% of patients had mild to moderate dysphagia, 88.2% did not receive food orally, 47.1% showed malnutrition, and 35.3% had a mild anxiety level. After the therapy sessions, 33.3% of patients had mild dysphagia, 16.7% were malnourished, and 50% had a minimal level of anxiety. Conclusion: There was statistically significant improvement in the level of dysphagia (p = 0.017) and oral intake (p = 0.003) after speech therapy. Although not statistically significant, there was considerable progress in the level of anxiety and nutritional status. PMID:25992086
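
    A minimal sketch of the kind of paired pre/post comparison reported here, using Wilcoxon's signed-rank test from SciPy. The severity scores below are invented for illustration and are not the study's data.

```python
from scipy import stats

# Hypothetical ordinal dysphagia severity scores (0 = normal ... 4 = severe)
# for the same 12 patients before and after 15 speech-therapy sessions.
pre  = [3, 4, 2, 3, 4, 3, 2, 4, 3, 3, 4, 2]
post = [2, 3, 1, 2, 3, 2, 2, 3, 2, 2, 3, 1]

# Wilcoxon's signed-rank test is appropriate for paired ordinal measurements;
# tied pairs (no change) are dropped by the default zero_method.
stat, p = stats.wilcoxon(pre, post)
print(f"Wilcoxon signed-rank: W={stat:.1f}, p={p:.4f}")
```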

  17. Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

    PubMed

    Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

    2017-04-01

    Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis-testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine how their performance varies with the algorithm's assumptions or with underlying data structures. Here, we use simulation studies to estimate the statistical error rates of the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and structured (grouped) data with increasing group overlap, overdispersion typical of ecological data, and correlation among descriptors within groups. Using the simulated data, we measured the resolution and correspondence of the clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of the methods and the interpretation of results.
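
    A compact, simplified sketch of the resemblance-profile idea: compare the observed sorted dissimilarity profile against profiles from permuted data (a SIMPROF/DISPROF-style pi statistic), then partition with UPGMA via average linkage. This is an illustrative reimplementation under simplifying assumptions (Euclidean distances, independent column permutations), not the authors' DISPROF code.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Two hypothetical groups of samples (rows) by descriptors (columns).
data = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])

def profile(x):
    """Sorted vector of pairwise dissimilarities (the resemblance profile)."""
    return np.sort(pdist(x))

def permuted_profile(x):
    """Shuffle each descriptor independently to destroy multivariate structure."""
    return profile(np.column_stack([rng.permutation(c) for c in x.T]))

# Expected profile under the null of no structure, then the pi statistic.
expected = np.mean([permuted_profile(data) for _ in range(200)], axis=0)
pi_obs = np.abs(profile(data) - expected).sum()
null = [np.abs(permuted_profile(data) - expected).sum() for _ in range(200)]
p = (1 + sum(v >= pi_obs for v in null)) / (1 + len(null))
print(f"DISPROF-style test: pi={pi_obs:.2f}, p={p:.3f}")

# When structure is detected, UPGMA (average linkage) provides the clustering.
labels = fcluster(linkage(data, method="average"), t=2, criterion="maxclust")
print("cluster sizes:", np.bincount(labels)[1:])
```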

  18. Statistical evidence for common ancestry: Application to primates.

    PubMed

    Baum, David A; Ané, Cécile; Larget, Bret; Solís-Lemus, Claudia; Ho, Lam Si Tung; Boone, Peggy; Drummond, Chloe P; Bontrager, Martin; Hunter, Steven J; Saucier, William

    2016-06-01

    Since Darwin, biologists have come to recognize that the theory of descent from common ancestry (CA) is very well supported by diverse lines of evidence. However, while the qualitative evidence is overwhelming, we also need formal methods for quantifying the evidential support for CA over the alternative hypothesis of separate ancestry (SA). In this article, we explore a diversity of statistical methods using data from the primates. We focus on two alternatives to CA, species SA (the separate origin of each named species) and family SA (the separate origin of each family). We implemented statistical tests based on morphological, molecular, and biogeographic data and developed two new methods: one that tests for phylogenetic autocorrelation while correcting for variation due to confounding ecological traits and a method for examining whether fossil taxa have fewer derived differences than living taxa. We overwhelmingly rejected both species and family SA with infinitesimal P values. We compare these results with those from two companion papers, which also found tremendously strong support for the CA of all primates, and discuss future directions and general philosophical issues that pertain to statistical testing of historical hypotheses such as CA. © 2016 The Author(s). Evolution © 2016 The Society for the Study of Evolution.

  19. Higher certainty of the laser-induced damage threshold test with a redistributing data treatment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jensen, Lars; Mrohs, Marius; Gyamfi, Mark

    2015-10-15

    As a consequence of its statistical nature, the measurement of the laser-induced damage threshold always carries the risk of over- or underestimating the real threshold value. For the established S-on-1 (and 1-on-1) measurement procedures outlined in the corresponding ISO standard 21254, the results depend on the number of data points and their distribution over the fluence scale. Given the limited space on a test sample, as well as the requirements on test site separation and beam sizes, the amount of data available from one test is restricted. This paper reports on a way to treat damage test data in order to reduce the statistical error and therefore the measurement uncertainty. Three simple assumptions allow one data point to be assigned to multiple data bins, thereby virtually increasing the available data base.
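
    The paper's three assumptions are not spelled out in the abstract; the sketch below illustrates one plausible redistribution scheme based on monotonicity assumptions (a site that damaged at some fluence would also have damaged at any higher fluence; a surviving site would also have survived at any lower fluence), so that each test site contributes to several fluence bins. All values are invented.

```python
import numpy as np

# Hypothetical 1-on-1 test data: (fluence in J/cm^2, damaged?) per test site.
tests = [(1.0, False), (1.5, False), (2.0, True), (2.2, False),
         (2.5, True), (3.0, True), (3.5, True)]

bins = np.arange(0.5, 4.5, 0.5)            # fluence bin edges
damaged = np.zeros(len(bins) - 1)
total = np.zeros(len(bins) - 1)

for fluence, dmg in tests:
    for i, (lo, hi) in enumerate(zip(bins[:-1], bins[1:])):
        mid = (lo + hi) / 2
        # Illustrative monotonicity assumptions: a damaged site counts as
        # damaged in all bins at or above its fluence; a surviving site
        # counts as survived in all bins at or below its fluence.
        if dmg and mid >= fluence:
            damaged[i] += 1
            total[i] += 1
        elif not dmg and mid <= fluence:
            total[i] += 1

# Damage probability per bin, now backed by more (redistributed) data points.
prob = np.divide(damaged, total, out=np.full_like(damaged, np.nan), where=total > 0)
for (lo, hi), pr, n in zip(zip(bins[:-1], bins[1:]), prob, total):
    print(f"{lo:.1f}-{hi:.1f} J/cm^2: P(damage)={pr:.2f} (n={int(n)})")
```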

  20. One-minute heart rate recovery after cycloergometer exercise testing as a predictor of mortality in a large cohort of exercise test candidates: substantial differences with the treadmill-derived parameter.

    PubMed

    Gaibazzi, Nicola; Petrucci, Nicola; Ziacchi, Vigilio

    2004-03-01

    Previous work showed a strong inverse association between 1-min heart rate recovery (HRR) after treadmill exercise and all-cause mortality. The aim of this study was to determine whether those results could be replicated in a broad population of real-world exercise ECG candidates at our center using a standard bicycle exercise test. Between 1991 and 1997, 1420 consecutive patients underwent ECG exercise testing according to our standard cycloergometer protocol. Three pre-specified cut-point values of 1-min HRR, derived from previous studies in the medical literature, were tested to see whether they could identify a group at higher risk of all-cause mortality; furthermore, we tested the possible association between 1-min HRR as a continuous variable and mortality using logistic regression. Neither method showed a statistically significant association between 1-min HRR and all-cause mortality, although a weak, non-significant trend toward an inverse association could not be excluded. We could not validate the clear-cut results of some previous studies performed with the treadmill exercise test. Our results may only "not exclude" a mild inverse association between 1-min HRR measured after cycloergometer exercise testing and all-cause mortality; the 1-min HRR measured after cycloergometer exercise testing was not clinically useful as a prognostic marker.
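
    A minimal sketch of the continuous-variable analysis described: logistic regression of all-cause mortality on 1-min HRR using the statsmodels library. The data are simulated and, by construction, carry only a weak inverse association, echoing the study's finding; none of the numbers come from the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 1420
# Hypothetical data: 1-min HRR (beats/min) and all-cause mortality (0/1),
# simulated with a deliberately weak inverse association.
hrr = rng.normal(25, 8, n).clip(0)
logit_true = -2.2 - 0.01 * (hrr - hrr.mean())
death = rng.binomial(1, 1 / (1 + np.exp(-logit_true)))

# Logistic regression of mortality on 1-min HRR as a continuous predictor.
model = sm.Logit(death, sm.add_constant(hrr)).fit(disp=0)
print(model.summary2().tables[1])  # coefficient for HRR and its p-value
```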
