Chou, C P; Bentler, P M; Satorra, A
1991-11-01
Research studying robustness of maximum likelihood (ML) statistics in covariance structure analysis has concluded that test statistics and standard errors are biased under severe non-normality. An estimation procedure known as asymptotic distribution free (ADF), making no distributional assumption, has been suggested to avoid these biases. Corrections to the normal theory statistics to yield more adequate performance have also been proposed. This study compares the performance of a scaled test statistic and robust standard errors for two models under several non-normal conditions and also compares these with the results from ML and ADF methods. Both ML and ADF test statistics performed rather well in one model and considerably worse in the other. In general, the scaled test statistic seemed to behave better than the ML test statistic and the ADF statistic performed the worst. The robust and ADF standard errors yielded more appropriate estimates of sampling variability than the ML standard errors, which were usually downward biased, in both models under most of the non-normal conditions. ML test statistics and standard errors were found to be quite robust to the violation of the normality assumption when data had either symmetric and platykurtic distributions, or non-symmetric and zero kurtotic distributions.
Standard Errors and Confidence Intervals of Norm Statistics for Educational and Psychological Tests.
Oosterhuis, Hannah E M; van der Ark, L Andries; Sijtsma, Klaas
2016-11-14
Norm statistics allow for the interpretation of scores on psychological and educational tests, by relating the test score of an individual test taker to the test scores of individuals belonging to the same gender, age, or education groups, et cetera. Given the uncertainty due to sampling error, one would expect researchers to report standard errors for norm statistics. In practice, standard errors are seldom reported; they are either unavailable or derived under strong distributional assumptions that may not be realistic for test scores. We derived standard errors for four norm statistics (standard deviation, percentile ranks, stanine boundaries and Z-scores) under the mild assumption that the test scores are multinomially distributed. A simulation study showed that the standard errors were unbiased and that corresponding Wald-based confidence intervals had good coverage. Finally, we discuss the possibilities for applying the standard errors in practical test use in education and psychology. The procedure is provided via the R function check.norms, which is available in the mokken package.
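The delta-method standard errors derived in the paper are not reproduced here; as an illustration of the multinomial assumption it rests on, the sketch below obtains a bootstrap standard error for one norm statistic (a percentile rank) by resampling score frequencies from a multinomial distribution. All frequencies are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed frequencies of test scores 0..10 (assumed data).
scores = np.arange(11)
freq = np.array([2, 5, 9, 14, 20, 22, 15, 8, 3, 1, 1])
n = freq.sum()

def percentile_rank(freq, score):
    """Percentage of test takers scoring below `score`, plus half the ties."""
    below = freq[:score].sum()
    ties = freq[score]
    return 100.0 * (below + 0.5 * ties) / freq.sum()

# Multinomial bootstrap: resample the score frequencies, recompute the statistic.
p_hat = freq / n
boot = np.array([percentile_rank(rng.multinomial(n, p_hat), 6)
                 for _ in range(2000)])
se = boot.std(ddof=1)
print(f"percentile rank of score 6: {percentile_rank(freq, 6):.1f}, bootstrap SE: {se:.2f}")
```

The same resampling scheme applies to the other norm statistics (standard deviation, stanine boundaries, Z-scores), since each is a function of the multinomial cell proportions.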
Ensuring Positiveness of the Scaled Difference Chi-square Test Statistic.
Satorra, Albert; Bentler, Peter M
2010-06-01
A scaled difference test statistic [Formula: see text] that can be computed from standard software of structural equation models (SEM) by hand calculations was proposed in Satorra and Bentler (2001). The statistic [Formula: see text] is asymptotically equivalent to the scaled difference test statistic T̄(d) introduced in Satorra (2000), which requires more involved computations beyond standard output of SEM software. The test statistic [Formula: see text] has been widely used in practice, but in some applications it is negative due to negativity of its associated scaling correction. Using the implicit function theorem, this note develops an improved scaling correction leading to a new scaled difference statistic T̄(d) that avoids negative chi-square values.
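The hand calculation from standard SEM output described above can be sketched directly. This implements the 2001 scaled difference formula, not the improved correction this note derives; all input values are hypothetical.

```python
from scipy.stats import chi2

def scaled_difference_test(T0, T1, df0, df1, c0, c1):
    """Satorra-Bentler (2001) scaled difference test from standard SEM output.

    T0, T1  : ML chi-square statistics of the restricted and full models
    df0, df1: their degrees of freedom (df0 > df1)
    c0, c1  : their scaling correction factors
    """
    # Scaling correction for the difference; in finite samples it can be
    # negative, which is exactly the problem this note's new correction avoids.
    cd = (df0 * c0 - df1 * c1) / (df0 - df1)
    Td = (T0 - T1) / cd
    return Td, df0 - df1, chi2.sf(Td, df0 - df1)

Td, df, p = scaled_difference_test(T0=120.0, T1=100.0, df0=10, df1=8, c0=1.2, c1=1.1)
print(Td, df, p)  # Td = 12.5 on 2 df
```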
Test 6, Test 7, and Gas Standard Analysis Results
NASA Technical Reports Server (NTRS)
Perez, Horacio, III
2007-01-01
This viewgraph presentation shows results of analyses on odor, toxic off-gassing, and gas standards. The topics include: 1) Statistical Analysis Definitions; 2) Odor Analysis Results, NASA Standard 6001 Test 6; 3) Toxic Off-gassing Analysis Results, NASA Standard 6001 Test 7; and 4) Gas Standard Results, NASA Standard 6001 Test 7.
An entropy-based statistic for genomewide association studies.
Zhao, Jinying; Boerwinkle, Eric; Xiong, Momiao
2005-07-01
Efficient genotyping methods and the availability of a large collection of single-nucleotide polymorphisms provide valuable tools for genetic studies of human disease. The standard chi2 statistic for case-control studies, which uses a linear function of allele frequencies, has limited power when the number of marker loci is large. We introduce a novel test statistic for genetic association studies that uses Shannon entropy and a nonlinear function of allele frequencies to amplify the differences in allele and haplotype frequencies to maintain statistical power with large numbers of marker loci. We investigate the relationship between the entropy-based test statistic and the standard chi2 statistic and show that, in most cases, the power of the entropy-based statistic is greater than that of the standard chi2 statistic. The distribution of the entropy-based statistic and the type I error rates are validated using simulation studies. Finally, we apply the new entropy-based test statistic to two real data sets, one for the COMT gene and schizophrenia and one for the MMP-2 gene and esophageal carcinoma, to evaluate the performance of the new method for genetic association studies. The results show that the entropy-based statistic obtained smaller P values than did the standard chi2 statistic.
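The abstract does not give the exact form of the entropy-based statistic. As an illustration of an entropy-flavored alternative to the Pearson chi-square, the sketch below contrasts the standard chi-square with the likelihood-ratio (G) statistic, which is Shannon-entropy based, on hypothetical allele counts; the paper's statistic uses a different nonlinear function of allele frequencies.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical allele counts (A vs a) for cases and controls (assumed data).
table = np.array([[130, 70],    # cases
                  [100, 100]])  # controls

# Standard Pearson chi-square: a quadratic contrast of observed vs expected.
chi2_stat, p_chi2, dof, _ = chi2_contingency(table, correction=False)

# Likelihood-ratio (G) statistic: 2 * sum O*ln(O/E), an entropy-based test.
g_stat, p_g, _, _ = chi2_contingency(table, correction=False,
                                     lambda_="log-likelihood")
print(chi2_stat, g_stat, dof)
```

For tables this well filled the two statistics are nearly identical; the power differences the paper reports arise with many marker loci and sparse haplotype tables.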
Detection of Person Misfit in Computerized Adaptive Tests with Polytomous Items.
ERIC Educational Resources Information Center
van Krimpen-Stoop, Edith M. L. A.; Meijer, Rob R.
2002-01-01
Compared the nominal and empirical null distributions of the standardized log-likelihood statistic for polytomous items for paper-and-pencil (P&P) and computerized adaptive tests (CATs). Results show that the empirical distribution of the statistic differed from the assumed standard normal distribution for both P&P tests and CATs. Also…
A standard for test reliability in group research.
Ellis, Jules L
2013-03-01
Many authors adhere to the rule that test reliabilities should be at least .70 or .80 in group research. This article introduces a new standard according to which reliabilities can be evaluated. This standard is based on the costs or time of the experiment and of administering the test. For example, if test administration costs are 7 % of the total experimental costs, the efficient value of the reliability is .93. If the actual reliability of a test is equal to this efficient reliability, the test size maximizes the statistical power of the experiment, given the costs. As a standard in experimental research, it is proposed that the reliability of the dependent variable be close to the efficient reliability. Adhering to this standard will enhance the statistical power and reduce the costs of experiments.
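The abstract's single numeric example (administration costs of 7% of the total giving an efficient reliability of .93) is consistent with the simple form 1 − c. The sketch below encodes that inferred relation; it is an assumption drawn from the example, not the article's derivation.

```python
def efficient_reliability(cost_fraction):
    """Efficient reliability implied by the abstract's example: if test
    administration is a fraction c of total experimental cost, 1 - c
    reproduces the quoted figure (c = 0.07 -> 0.93).  This simple form is
    inferred from that single example, not from the article's derivation."""
    return 1.0 - cost_fraction

print(f"{efficient_reliability(0.07):.2f}")  # 0.93
```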
Ensuring Positiveness of the Scaled Difference Chi-Square Test Statistic
ERIC Educational Resources Information Center
Satorra, Albert; Bentler, Peter M.
2010-01-01
A scaled difference test statistic T[tilde][subscript d] that can be computed from standard software of structural equation models (SEM) by hand calculations was proposed in Satorra and Bentler (Psychometrika 66:507-514, 2001). The statistic T[tilde][subscript d] is asymptotically equivalent to the scaled difference test statistic T[bar][subscript…
Analysis of statistical misconception in terms of statistical reasoning
NASA Astrophysics Data System (ADS)
Maryati, I.; Priatna, N.
2018-05-01
Reasoning skill is needed by everyone in the era of globalization, because every person must be able to manage and use information that can now be obtained easily from all over the world. Statistical reasoning skill is the ability to collect, group, process, and interpret information and to draw conclusions from it. This skill can be developed at various levels of education. However, the skill remains low because many people, students included, assume that statistics is merely the ability to count and use formulas. Students also still hold negative attitudes toward courses related to research. The purpose of this research is to analyze students' misconceptions in a descriptive statistics course with respect to statistical reasoning skill. The observation was done by analyzing the results of a misconception test and a statistical reasoning skill test, and by examining the effect of students' misconceptions on statistical reasoning skill. The sample was 32 students of a mathematics education department who had taken the descriptive statistics course. The mean score on the misconception test was 49.7 (standard deviation 10.6), whereas the mean score on the statistical reasoning skill test was 51.8 (standard deviation 8.5). If 65 is taken as the minimum score for achieving the standard competence of a course, the students' mean scores fall below that standard. The misconception results indicate which subtopics should receive particular attention. Based on the assessment, students' misconceptions occurred in: 1) writing mathematical sentences and symbols correctly, 2) understanding basic definitions, and 3) determining which concept to use in solving a problem. For statistical reasoning skill, the assessment measured reasoning about: 1) data, 2) representation, 3) statistical format, 4) probability, 5) samples, and 6) association.
Langley Wind Tunnel Data Quality Assurance-Check Standard Results
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.; Grubb, John P.; Krieger, William B.; Cler, Daniel L.
2000-01-01
A framework for statistical evaluation, control, and improvement of wind tunnel measurement processes is presented. The methodology is adapted from elements of the Measurement Assurance Plans developed by the National Bureau of Standards (now the National Institute of Standards and Technology) for standards and calibration laboratories. The present methodology is based on the notions of statistical quality control (SQC) together with check standard testing and a small number of customer repeat-run sets. The results of check standard and customer repeat-run sets are analyzed using the statistical control chart methods of Walter A. Shewhart, long familiar to the SQC community. Control chart results are presented for various measurement processes in five facilities at Langley Research Center. The processes include test section calibration, force and moment measurements with a balance, and instrument calibration.
75 FR 53925 - Sea Turtle Conservation; Shrimp and Summer Flounder Trawling Requirements
Federal Register 2010, 2011, 2012, 2013, 2014
2010-09-02
... because of the statistical probability the candidate TED may not achieve the standard (i.e., control TED... the test with 4 turtle captures because of the statistical probability the candidate TED may not... because of the statistical probability the candidate TED may not achieve the standard (i.e., [[Page 53930...
40 CFR 1065.12 - Approval of alternate procedures.
Code of Federal Regulations, 2010 CFR
2010-07-01
... engine meets all applicable emission standards according to specified procedures. (iii) Use statistical.... (e) We may give you specific directions regarding methods for statistical analysis, or we may approve... statistical tests. Perform the tests as follows: (1) Repeat measurements for all applicable duty cycles at...
Brown, Geoffrey W.; Sandstrom, Mary M.; Preston, Daniel N.; ...
2014-11-17
In this study, the Integrated Data Collection Analysis (IDCA) program has conducted a proficiency test for small-scale safety and thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results from this test for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Class 5 Type II standard. The material was tested as a well-characterized standard several times during the proficiency test to assess differences among participants and the range of results that may arise for well-behaved explosive materials.
Testing the statistical compatibility of independent data sets
NASA Astrophysics Data System (ADS)
Maltoni, M.; Schwetz, T.
2003-08-01
We discuss a goodness-of-fit method which tests the compatibility between statistically independent data sets. The method gives sensible results even in cases where the χ2 minima of the individual data sets are very low or when several parameters are fitted to a large number of data points. In particular, it avoids the problem that a possible disagreement between data sets becomes diluted by data points which are insensitive to the crucial parameters. A formal derivation of the probability distribution function for the proposed test statistics is given, based on standard theorems of statistics. The application of the method is illustrated on data from neutrino oscillation experiments, and its complementarity to the standard goodness-of-fit is discussed.
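A minimal sketch of the idea, assuming the familiar "parameter goodness-of-fit" construction in which the joint chi-square minimum is compared with the sum of the individual minima; the degrees of freedom count only the parameters shared across data sets. All input values are hypothetical.

```python
from scipy.stats import chi2

def parameter_goodness_of_fit(chi2_global_min, chi2_local_mins,
                              n_params_joint, n_params_local):
    """Compatibility test for independent data sets (a sketch of the idea in
    the abstract): the excess of the joint chi-square minimum over the sum of
    the individual minima, referred to a chi-square distribution whose dof is
    the number of parameters fitted separately minus those fitted jointly."""
    stat = chi2_global_min - sum(chi2_local_mins)
    dof = sum(n_params_local) - n_params_joint
    return stat, dof, chi2.sf(stat, dof)

# Two hypothetical experiments, each fitting the same 2 parameters:
stat, dof, p = parameter_goodness_of_fit(
    chi2_global_min=18.0, chi2_local_mins=[5.0, 6.0],
    n_params_joint=2, n_params_local=[2, 2])
print(stat, dof, p)  # 7.0 on 2 df
```

Because only the minima enter, insensitive data points cannot dilute a disagreement between the data sets, which is the property the abstract emphasizes.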
Testing for independence in J×K contingency tables with complex sample survey data.
Lipsitz, Stuart R; Fitzmaurice, Garrett M; Sinha, Debajyoti; Hevelone, Nathanael; Giovannucci, Edward; Hu, Jim C
2015-09-01
The test of independence of row and column variables in a (J×K) contingency table is a widely used statistical test in many areas of application. For complex survey samples, use of the standard Pearson chi-squared test is inappropriate due to correlation among units within the same cluster. Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) proposed an approach in which the standard Pearson chi-squared statistic is multiplied by a design effect to adjust for the complex survey design. Unfortunately, this test fails to exist when one of the observed cell counts equals zero. Even with the large samples typical of many complex surveys, zero cell counts can occur for rare events, small domains, or contingency tables with a large number of cells. Here, we propose Wald and score test statistics for independence based on weighted least squares estimating equations. In contrast to the Rao-Scott test statistic, the proposed Wald and score test statistics always exist. In simulations, the score test is found to perform best with respect to type I error. The proposed method is motivated by, and applied to, post surgical complications data from the United States' Nationwide Inpatient Sample (NIS) complex survey of hospitals in 2008. © 2015, The International Biometric Society.
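A sketch of the first-order Rao-Scott idea the abstract refers to, with a hypothetical table and an assumed average design effect; the proposed Wald and score statistics are not reproduced here.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

def rao_scott_first_order(table, design_effect):
    """First-order Rao-Scott correction (sketch): divide the Pearson
    chi-square by an average design effect to account for within-cluster
    correlation in a complex survey.  With zero observed cells the full
    correction cannot be computed, which is the limitation the abstract's
    Wald and score alternatives avoid."""
    x2, _, dof, _ = chi2_contingency(table, correction=False)
    x2_rs = x2 / design_effect
    return x2_rs, dof, chi2.sf(x2_rs, dof)

# Hypothetical 2x2 table and an assumed design effect of 1.5.
x2_rs, dof, p = rao_scott_first_order(np.array([[40, 60], [55, 45]]),
                                      design_effect=1.5)
print(x2_rs, dof, p)
```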
Null but not void: considerations for hypothesis testing.
Shaw, Pamela A; Proschan, Michael A
2013-01-30
Standard statistical theory teaches us that once the null and alternative hypotheses have been defined for a parameter, the choice of the statistical test is clear. Standard theory does not teach us how to choose the null or alternative hypothesis appropriate to the scientific question of interest. Neither does it tell us that in some cases, depending on which alternatives are realistic, we may want to define our null hypothesis differently. Problems in statistical practice are frequently not as pristinely summarized as the classic theory in our textbooks. In this article, we present examples in statistical hypothesis testing in which seemingly simple choices are in fact rich with nuance that, when given full consideration, make the choice of the right hypothesis test much less straightforward. Published 2012. This article is a US Government work and is in the public domain in the USA.
Comparing Simulated and Theoretical Sampling Distributions of the U3 Person-Fit Statistic.
ERIC Educational Resources Information Center
Emons, Wilco H. M.; Meijer, Rob R.; Sijtsma, Klaas
2002-01-01
Studied whether the theoretical sampling distribution of the U3 person-fit statistic is in agreement with the simulated sampling distribution under different item response theory models and varying item and test characteristics. Simulation results suggest that the use of standard normal deviates for the standardized version of the U3 statistic may…
Design of experiments enhanced statistical process control for wind tunnel check standard testing
NASA Astrophysics Data System (ADS)
Phillips, Ben D.
The current wind tunnel check standard testing program at NASA Langley Research Center is focused on increasing data quality, uncertainty quantification and overall control and improvement of wind tunnel measurement processes. The statistical process control (SPC) methodology employed in the check standard testing program allows for the tracking of variations in measurements over time as well as an overall assessment of facility health. While the SPC approach can and does provide researchers with valuable information, it has certain limitations in the areas of process improvement and uncertainty quantification. It is thought by utilizing design of experiments methodology in conjunction with the current SPC practices that one can efficiently and more robustly characterize uncertainties and develop enhanced process improvement procedures. In this research, methodologies were developed to generate regression models for wind tunnel calibration coefficients, balance force coefficients and wind tunnel flow angularities. The coefficients of these regression models were then tracked in statistical process control charts, giving a higher level of understanding of the processes. The methodology outlined is sufficiently generic such that this research can be applicable to any wind tunnel check standard testing program.
Experimental control in software reliability certification
NASA Technical Reports Server (NTRS)
Trammell, Carmen J.; Poore, Jesse H.
1994-01-01
There is growing interest in software 'certification', i.e., confirmation that software has performed satisfactorily under a defined certification protocol. Regulatory agencies, customers, and prospective reusers all want assurance that a defined product standard has been met. In other industries, products are typically certified under protocols in which random samples of the product are drawn, tests characteristic of operational use are applied, analytical or statistical inferences are made, and products meeting a standard are 'certified' as fit for use. A warranty statement is often issued upon satisfactory completion of a certification protocol. This paper outlines specific engineering practices that must be used to preserve the validity of the statistical certification testing protocol. The assumptions associated with a statistical experiment are given, and their implications for statistical testing of software are described.
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale for texture is proposed based on multivariate statistical analysis of instrumental measurements and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with the characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework is verified by establishing a standard reference scale for a texture attribute (hardness) with well-known Chinese foods. More than 100 food products in 16 categories were tested by instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
ERIC Educational Resources Information Center
Lord, Frederic M.; Stocking, Martha
A general computer program is described that will compute asymptotic standard errors and carry out significance tests for an endless variety of (standard and) nonstandard large-sample statistical problems, without requiring the statistician to derive asymptotic standard error formulas. The program assumes that the observations have a multinormal…
Power of tests for comparing trend curves with application to national immunization survey (NIS).
Zhao, Zhen
2011-02-28
To develop statistical tests for comparing trend curves of study outcomes between two socio-demographic strata across consecutive time points, and to compare the statistical power of the proposed tests under different trend-curve data, three statistical tests were proposed. For large sample sizes, with independence assumed among strata and across consecutive time points under normality, Z and Chi-square test statistics were developed; these are functions of the outcome estimates and the standard errors at each of the study time points for the two strata. For small sample sizes under the same assumptions, an F-test statistic was derived as a function of the sample sizes of the two strata and the estimated parameters across the study period. If the two trend curves are approximately parallel, the power of the Z-test is consistently higher than that of both the Chi-square and F-tests. If the two trend curves cross with low interaction, the power of the Z-test is higher than or equal to that of the Chi-square and F-tests; at high interaction, however, the powers of the Chi-square and F-tests exceed that of the Z-test. A measure of the interaction of two trend curves was defined. These tests were applied to the comparison of trend curves of vaccination coverage estimates for standard vaccine series with National Immunization Survey (NIS) 2000-2007 data. Copyright © 2011 John Wiley & Sons, Ltd.
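The abstract does not give the exact formulas. The sketch below shows one plausible construction of pointwise Z statistics from estimates and standard errors, plus a Z-type and a Chi-square-type summary across time points; the data and the summary forms are assumptions for illustration, not the paper's definitions.

```python
import numpy as np
from scipy.stats import norm, chi2

# Hypothetical vaccination coverage estimates (%) and standard errors for two
# strata across 4 consecutive years (assumed data).
est1, se1 = np.array([70., 72., 75., 78.]), np.array([1.2, 1.1, 1.0, 1.1])
est2, se2 = np.array([65., 68., 70., 73.]), np.array([1.3, 1.2, 1.1, 1.2])

# Pointwise Z statistics, assuming independence across strata and time points.
z = (est1 - est2) / np.sqrt(se1**2 + se2**2)

# Z-type summary (sensitive to consistent one-sided differences, i.e. roughly
# parallel curves): sum of pointwise Z values, renormalized to unit variance.
z_overall = z.sum() / np.sqrt(len(z))
p_z = 2 * norm.sf(abs(z_overall))

# Chi-square-type summary (sensitive to differences in either direction,
# e.g. crossing curves): sum of squared pointwise Z values.
x2 = (z**2).sum()
p_x2 = chi2.sf(x2, df=len(z))
print(z_overall, p_z, x2, p_x2)
```

The contrast between the two summaries mirrors the power pattern the abstract reports: the Z-type form wins for parallel curves, the quadratic form for strongly interacting ones.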
Applying a statistical PTB detection procedure to complement the gold standard.
Noor, Norliza Mohd; Yunus, Ashari; Bakar, S A R Abu; Hussin, Amran; Rijal, Omar Mohd
2011-04-01
This paper investigates a novel statistical discrimination procedure to detect PTB when the gold standard requirement is taken into consideration. Archived data were used to establish two groups of patients which are the control and test group. The control group was used to develop the statistical discrimination procedure using four vectors of wavelet coefficients as feature vectors for the detection of pulmonary tuberculosis (PTB), lung cancer (LC), and normal lung (NL). This discrimination procedure was investigated using the test group where the number of sputum positive and sputum negative cases that were correctly classified as PTB cases were noted. The proposed statistical discrimination method is able to detect PTB patients and LC with high true positive fraction. The method is also able to detect PTB patients that are sputum negative and therefore may be used as a complement to the gold standard. Copyright © 2010 Elsevier Ltd. All rights reserved.
Randomization Procedures Applied to Analysis of Ballistic Data
1991-06-01
Taylor, Malcolm S.; Bodt, Barry A.
Technical Report BRL-TR-3245 (AD-A238 389). Keywords: data analysis; computationally intensive statistics; randomization tests; permutation tests; nonparametric statistics.
Analysis of Multiple Contingency Tables by Exact Conditional Tests for Zero Partial Association.
ERIC Educational Resources Information Center
Kreiner, Svend
The tests for zero partial association in a multiple contingency table have gained new importance with the introduction of graphical models. It is shown how these may be performed as exact conditional tests, using as test criteria either the ordinary likelihood ratio, the standard chi-squared statistic, or any other appropriate statistic. A…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1982-01-01
A class of goodness-of-fit estimators is found to provide a useful alternative, in certain situations, to the standard maximum likelihood method, which has some undesirable characteristics for estimation from the three-parameter lognormal distribution. The class of goodness-of-fit tests considered includes the Shapiro-Wilk and Filliben tests, which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Robustness of the procedures is examined and example data sets are analyzed.
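A sketch of the weighted-order-statistic idea: treat the Shapiro-Wilk W of the log-shifted data as an objective and maximize it over the threshold of a three-parameter lognormal. The data are simulated, and the grid search is an assumed, simplified stand-in for a proper optimizer.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulated three-parameter lognormal sample: threshold gamma = 10 (assumed).
gamma_true = 10.0
x = gamma_true + rng.lognormal(mean=1.0, sigma=0.5, size=200)

# Goodness-of-fit estimation (sketch): choose the threshold that maximizes the
# Shapiro-Wilk W of log(x - gamma), i.e. the shift under which the data look
# most lognormal.  W is a weighted linear combination of order statistics.
grid = np.linspace(0.0, x.min() - 1e-6, 400)
w = [stats.shapiro(np.log(x - g)).statistic for g in grid]
gamma_hat = grid[int(np.argmax(w))]
print(gamma_hat)
```

Unlike maximum likelihood, whose likelihood is unbounded as the threshold approaches the sample minimum, this objective stays well behaved near the boundary.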
Yang, Yang; DeGruttola, Victor
2016-01-01
Traditional resampling-based tests for homogeneity in covariance matrices across multiple groups resample residuals, that is, data centered by group means. These residuals do not share the same second moments when the null hypothesis is false, which makes them difficult to use in the setting of multiple testing. An alternative approach is to resample standardized residuals, data centered by group sample means and standardized by group sample covariance matrices. This approach, however, has been observed to inflate type I error when sample size is small or data are generated from heavy-tailed distributions. We propose to improve this approach by using robust estimation for the first and second moments. We discuss two statistics: the Bartlett statistic and a statistic based on eigen-decomposition of sample covariance matrices. Both statistics can be expressed in terms of standardized errors under the null hypothesis. These methods are extended to test homogeneity in correlation matrices. Using simulation studies, we demonstrate that the robust resampling approach provides comparable or superior performance, relative to traditional approaches, for single testing and reasonable performance for multiple testing. The proposed methods are applied to data collected in an HIV vaccine trial to investigate possible determinants, including vaccine status, vaccine-induced immune response level and viral genotype, of unusual correlation pattern between HIV viral load and CD4 count in newly infected patients. PMID:22740584
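The contrast the abstract draws can be sketched on univariate toy data using the Bartlett statistic: a classical test alongside a permutation scheme on standardized residuals (centered by group means, scaled by group SDs). The robust moment estimation the paper proposes is not reproduced here, and the data are simulated.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two hypothetical groups (assumed data); test homogeneity of variances.
g1 = rng.normal(0.0, 1.0, size=40)
g2 = rng.normal(0.5, 1.0, size=50)

# Classical Bartlett test.
stat, p = stats.bartlett(g1, g2)

# Resampling on standardized residuals (sketch of the idea in the abstract):
# centering by group means and scaling by group SDs gives residuals that share
# first and second moments even when the null is false, so group labels can be
# permuted to build a reference distribution for the statistic.
z = np.concatenate([(g1 - g1.mean()) / g1.std(ddof=1),
                    (g2 - g2.mean()) / g2.std(ddof=1)])
n1 = len(g1)
perm_stats = [stats.bartlett(*(lambda s: (s[:n1], s[n1:]))(rng.permutation(z))).statistic
              for _ in range(999)]
p_perm = (1 + sum(s >= stat for s in perm_stats)) / 1000
print(p, p_perm)
```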
NASA Astrophysics Data System (ADS)
Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.
2018-04-01
Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter halos. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the "accurate" regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard ΛCDM + halo model against the clustering of SDSS DR7 galaxies. Specifically, we use the projected correlation function, group multiplicity function and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir halos) matches the clustering of low luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the "standard" halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.
Ivanova, Maria V.; Hallowell, Brooke
2013-01-01
Background: There are a limited number of aphasia language tests in the majority of the world's commonly spoken languages. Furthermore, few aphasia tests in languages other than English have been standardized and normed, and few have supportive psychometric data pertaining to reliability and validity. The lack of standardized assessment tools across many of the world's languages poses serious challenges to clinical practice and research in aphasia. Aims: The current review addresses this lack of assessment tools by providing conceptual and statistical guidance for the development of aphasia assessment tools and establishment of their psychometric properties. Main Contribution: A list of aphasia tests in the 20 most widely spoken languages is included. The pitfalls of translating an existing test into a new language versus creating a new test are outlined. Factors to consider in determining test content are discussed. Further, a description of test items corresponding to different language functions is provided, with special emphasis on implementing important controls in test design. Next, a broad review of principal psychometric properties relevant to aphasia tests is presented, with specific statistical guidance for establishing psychometric properties of standardized assessment tools. Conclusions: This article may be used to help guide future work on developing, standardizing and validating aphasia language tests. The considerations discussed are also applicable to the development of standardized tests of other cognitive functions. PMID:23976813
Ueno, Tamio; Matuda, Junichi; Yamane, Nobuhisa
2013-03-01
To evaluate the occurrence of out-of-acceptable-range results and the accuracy of antimicrobial susceptibility tests, we applied a new statistical tool to the Inter-Laboratory Quality Control Program established by the Kyushu Quality Control Research Group. First, we defined acceptable ranges of minimum inhibitory concentration (MIC) for broth microdilution tests and of inhibitory zone diameter for disk diffusion tests on the basis of Clinical and Laboratory Standards Institute (CLSI) M100-S21. In the analysis, more than two out-of-acceptable-range results in 20 tests were considered not allowable according to the CLSI document. Of the 90 participating laboratories, 46 (51%) experienced one or more out-of-acceptable-range results. Then, a binomial test was applied to each participating laboratory. The results indicated that the occurrences of out-of-acceptable-range results in 11 laboratories were significantly higher than the CLSI recommendation (allowable rate < or = 0.05). Standard deviation indices (SDI) were calculated using the reported results and the mean and standard deviation values for the respective antimicrobial agents tested. In the evaluation of accuracy, the mean value from each laboratory was statistically compared with zero using a Student's t-test. The results revealed that 5 of the 11 above laboratories reported erroneous test results that systematically drifted toward the resistant side. In conclusion, our statistical approach has enabled us to detect significantly higher occurrences and sources of interpretive errors in antimicrobial susceptibility tests; therefore, this approach can provide additional information that can improve the accuracy of test results in clinical microbiology laboratories.
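The two tools described, a binomial test against the allowable rate and the standard deviation index, can be sketched as follows; the counts, means, and SDs are hypothetical.

```python
from scipy.stats import binomtest

# A laboratory reports 3 out-of-acceptable-range results in 20 QC tests;
# the allowable rate, following the CLSI-based criterion, is taken as 0.05.
res = binomtest(k=3, n=20, p=0.05, alternative="greater")
print(f"P(X >= 3 | n=20, p=0.05) = {res.pvalue:.4f}")

def sdi(reported, group_mean, group_sd):
    """Standard deviation index: how many group SDs a reported result lies
    from the inter-laboratory mean.  Systematic drift shows up as SDI values
    consistently on one side of zero."""
    return (reported - group_mean) / group_sd

print(sdi(reported=8.0, group_mean=6.5, group_sd=1.0))  # 1.5
```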
Zhang, Fanghong; Miyaoka, Etsuo; Huang, Fuping; Tanaka, Yutaka
2015-01-01
The problem for establishing noninferiority is discussed between a new treatment and a standard (control) treatment with ordinal categorical data. A measure of treatment effect is used and a method of specifying noninferiority margin for the measure is provided. Two Z-type test statistics are proposed where the estimation of variance is constructed under the shifted null hypothesis using U-statistics. Furthermore, the confidence interval and the sample size formula are given based on the proposed test statistics. The proposed procedure is applied to a dataset from a clinical trial. A simulation study is conducted to compare the performance of the proposed test statistics with that of the existing ones, and the results show that the proposed test statistics are better in terms of the deviation from nominal level and the power.
Descriptive and inferential statistical methods used in burns research.
Al-Benna, Sammy; Al-Ajam, Yazan; Way, Benjamin; Steinstraesser, Lars
2010-05-01
Burns research articles utilise a variety of descriptive and inferential methods to present and analyse data. The aim of this study was to determine the descriptive methods (e.g. mean, median, SD, range, etc.) and survey the use of inferential methods (statistical tests) used in articles in the journal Burns. This study defined its population as all original articles published in the journal Burns in 2007. Letters to the editor, brief reports, reviews, and case reports were excluded. Study characteristics, use of descriptive statistics and the number and types of statistical methods employed were evaluated. Of the 51 articles analysed, 11 (22%) were randomised controlled trials, 18 (35%) were cohort studies, 11 (22%) were case control studies and 11 (22%) were case series. The study design and objectives were defined in all articles. All articles made use of continuous and descriptive data. Inferential statistics were used in 49 (96%) articles. Data dispersion was calculated by standard deviation in 30 (59%). Standard error of the mean was quoted in 19 (37%). The statistical software product was named in 33 (65%). Of the 49 articles that used inferential statistics, the tests were named in 47 (96%). The six most common tests used (Student's t-test (53%), analysis of variance/covariance (33%), chi-squared test (27%), Wilcoxon and Mann-Whitney tests (22%), Fisher's exact test (12%)) accounted for the majority (72%) of statistical methods employed. A specified significance level was named in 43 (88%) and the exact significance levels were reported in 28 (57%). Descriptive analysis and basic statistical techniques account for most of the statistical tests reported. This information should prove useful in deciding which tests should be emphasised in educating burn care professionals. These results highlight the need for burn care professionals to have a sound understanding of basic statistics, which is crucial in interpreting and reporting data.
Advice should be sought from professionals in the fields of biostatistics and epidemiology when using more advanced statistical techniques. Copyright 2009 Elsevier Ltd and ISBI. All rights reserved.
NASA Astrophysics Data System (ADS)
Sinha, Manodeep; Berlind, Andreas A.; McBride, Cameron K.; Scoccimarro, Roman; Piscionere, Jennifer A.; Wibking, Benjamin D.
2018-07-01
Interpreting the small-scale clustering of galaxies with halo models can elucidate the connection between galaxies and dark matter haloes. Unfortunately, the modelling is typically not sufficiently accurate for ruling out models statistically. It is thus difficult to use the information encoded in small scales to test cosmological models or probe subtle features of the galaxy-halo connection. In this paper, we attempt to push halo modelling into the `accurate' regime with a fully numerical mock-based methodology and careful treatment of statistical and systematic errors. With our forward-modelling approach, we can incorporate clustering statistics beyond the traditional two-point statistics. We use this modelling methodology to test the standard Λ cold dark matter (ΛCDM) + halo model against the clustering of Sloan Digital Sky Survey (SDSS) seventh data release (DR7) galaxies. Specifically, we use the projected correlation function, group multiplicity function, and galaxy number density as constraints. We find that while the model fits each statistic separately, it struggles to fit them simultaneously. Adding group statistics leads to a more stringent test of the model and significantly tighter constraints on model parameters. We explore the impact of varying the adopted halo definition and cosmological model and find that changing the cosmology makes a significant difference. The most successful model we tried (Planck cosmology with Mvir haloes) matches the clustering of low-luminosity galaxies, but exhibits a 2.3σ tension with the clustering of luminous galaxies, thus providing evidence that the `standard' halo model needs to be extended. This work opens the door to adding interesting freedom to the halo model and including additional clustering statistics as constraints.
CompareTests is an R package to estimate agreement and diagnostic accuracy statistics for two diagnostic tests when one is conducted on only a subsample of specimens. A standard test is observed on all specimens.
ERIC Educational Resources Information Center
King, Molly Elizabeth
2016-01-01
The purpose of this quantitative, causal-comparative study was to compare the effect elementary music and visual arts lessons had on third through sixth grade standardized mathematics test scores. Inferential statistics were used to compare the differences between test scores of students who took in-school, elementary, music instruction during the…
Mysid (Mysidopsis bahia) life-cycle test: Design comparisons and assessment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lussier, S.M.; Champlin, D.; Kuhn, A.
1996-12-31
This study examines ASTM Standard E1191-90, "Standard Guide for Conducting Life-cycle Toxicity Tests with Saltwater Mysids" (1990), using Mysidopsis bahia, by comparing several test designs to assess growth, reproduction, and survival. The primary objective was to determine the most labor-efficient and statistically powerful test design for the measurement of statistically detectable effects on biologically sensitive endpoints. Five different test designs were evaluated, varying compartment size, number of organisms per compartment and sex ratio. Results showed that while paired organisms in the ASTM design had the highest rate of reproduction among designs tested, no individual design had greater statistical power to detect differences in reproductive effects. Reproduction was not statistically different between organisms paired in the ASTM design and those with randomized sex ratios using larger test compartments. These treatments had numerically higher reproductive success and lower within-tank replicate variance than treatments using smaller compartments where organisms were randomized or had a specific sex ratio. In this study, survival and growth were not statistically different among designs tested. Within-tank replicate variability can be reduced by using many exposure compartments with pairs, or few compartments with many organisms in each. While this improves variance within replicate chambers, it does not strengthen the power of detection among treatments in the test. An increase in the number of true replicates (exposure chambers) to eight will have the effect of reducing the percent detectable difference by a factor of two.
The Importance of Practice in the Development of Statistics.
1983-01-01
NRC Technical Summary Report #2471: The Importance of Practice in the Development of Statistics. …component analysis, bioassay, limits for a ratio, quality control, sampling inspection, non-parametric tests, transformation theory, ARIMA time series models, sequential tests, cumulative sum charts, data analysis plotting techniques, and a resolution of the Bayes-frequentist controversy. It appears
Statistics Using Just One Formula
ERIC Educational Resources Information Center
Rosenthal, Jeffrey S.
2018-01-01
This article advocates that introductory statistics be taught by basing all calculations on a single simple margin-of-error formula and deriving all of the standard introductory statistical concepts (confidence intervals, significance tests, comparisons of means and proportions, etc.) from that one formula. It is argued that this approach will…
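The single-formula idea can be made concrete: the margin of error is a critical value times a standard error, a confidence interval is the estimate plus or minus that margin, and a significance test checks whether the null value falls outside the interval. The counts below are invented for illustration.

```python
from math import sqrt

def margin_of_error(se, z=1.96):
    # the one formula: MOE = z * (standard error); z = 1.96 for 95% confidence
    return z * se

# 95% CI for a proportion: p_hat +/- MOE
n, successes = 400, 236
p_hat = successes / n
se = sqrt(p_hat * (1 - p_hat) / n)
lo, hi = p_hat - margin_of_error(se), p_hat + margin_of_error(se)

# The significance test falls out of the same formula:
# reject H0: p = 0.5 at the 5% level iff 0.5 lies outside the interval.
reject_null = not (lo <= 0.5 <= hi)
```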
Application of survival analysis methodology to the quantitative analysis of LC-MS proteomics data.
Tekwe, Carmen D; Carroll, Raymond J; Dabney, Alan R
2012-08-01
Protein abundance in quantitative proteomics is often based on observed spectral features derived from liquid chromatography mass spectrometry (LC-MS) or LC-MS/MS experiments. Peak intensities are largely non-normal in distribution. Furthermore, LC-MS-based proteomics data frequently have large proportions of missing peak intensities due to censoring mechanisms on low-abundance spectral features. Recognizing that the observed peak intensities detected with the LC-MS method are all positive, skewed and often left-censored, we propose using survival methodology to carry out differential expression analysis of proteins. Various standard statistical techniques, including non-parametric tests such as the Kolmogorov-Smirnov and Wilcoxon-Mann-Whitney rank sum tests, and the parametric survival model and accelerated failure time (AFT) model with log-normal, log-logistic and Weibull distributions, were used to detect any differentially expressed proteins. The statistical operating characteristics of each method are explored using both real and simulated datasets. Survival methods generally have greater statistical power than standard differential expression methods when the proportion of missing protein level data is 5% or more. In particular, the AFT models we consider consistently achieve greater statistical power than standard testing procedures, with the discrepancy widening as the proportion of missing data increases. The testing procedures discussed in this article can all be performed using readily available software such as R. The R codes are provided as supplemental materials. ctekwe@stat.tamu.edu.
Performance of digital RGB reflectance color extraction for plaque lesion
NASA Astrophysics Data System (ADS)
Hashim, Hadzli; Taib, Mohd Nasir; Jailani, Rozita; Sulaiman, Saadiah; Baba, Roshidah
2005-01-01
Several clinical psoriasis lesion groups were studied for digital RGB color feature extraction. Previous works used sample sizes that included all the outliers lying beyond standard-deviation distances from the peak of the histogram. This paper describes the statistical performance of the RGB model with and without these outliers removed. Plaque lesions are compared with other types of psoriasis. The statistical tests are compared across three sample sizes: the original 90 samples, a first reduction that removes outliers beyond a 2 standard deviation distance (2SD), and a second reduction that removes outliers beyond a 1 standard deviation distance (1SD). Quantification of data images through both the normal/direct and the differential variant of the conventional reflectance method is considered. Performance is assessed from error plots with 95% confidence intervals and from the inference T-tests applied. The statistical test outcomes show that the B component of the conventional differential method can be used to distinctively classify plaque from the other psoriasis groups, consistent with the error plot findings, with an improvement in p-value greater than 0.5.
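The outlier-trimming step described, dropping samples beyond 1 or 2 standard deviations of the mean, can be sketched in a few lines. The sample values are invented; this is the generic trim, not the paper's histogram-based procedure.

```python
from statistics import mean, stdev

def remove_outliers(values, k):
    """Keep only values within k sample standard deviations of the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) <= k * s]

samples = [10, 12, 11, 13, 12, 11, 40]     # 40 is an obvious outlier
within_2sd = remove_outliers(samples, 2)   # milder trim (2SD)
within_1sd = remove_outliers(samples, 1)   # aggressive trim (1SD)
```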
Gibson, Todd A; Oller, D Kimbrough; Jarmulowicz, Linda
2018-03-01
Receptive standardized vocabulary scores have been found to be much higher than expressive standardized vocabulary scores in children with Spanish as L1, learning L2 (English) in school (Gibson et al., 2012). Here we present evidence suggesting the receptive-expressive gap may be harder to evaluate than previously thought because widely-used standardized tests may not offer comparable normed scores. Furthermore monolingual Spanish-speaking children tested in Mexico and monolingual English-speaking children in the US showed other, yet different statistically significant discrepancies between receptive and expressive scores. Results suggest comparisons across widely used standardized tests in attempts to assess a receptive-expressive gap are precarious.
An Application of Indian Health Service Standards for Alcoholism Programs.
ERIC Educational Resources Information Center
Burns, Thomas R.
1984-01-01
Discusses Phoenix-area applications of 1981 Indian Health Service standards for alcoholism programs. Results of standard statistical techniques note areas of deficiency through application of a one-tailed z test at .05 level of significance. Factor analysis sheds further light on design of standards. Implications for revisions are suggested.…
Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.
Chen, Yanguang
2016-01-01
In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 regions of China. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
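The order-dependence that motivates the paper can be seen directly from the Durbin-Watson statistic's definition, which sums squared successive differences of the residuals. A minimal sketch with made-up residuals:

```python
def durbin_watson(residuals):
    """DW = sum of squared successive differences / sum of squared residuals.
    Values near 2 suggest no serial correlation; near 0, positive correlation."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Ordered residuals with strong positive serial correlation: DW well below 2.
trended = [1.0, 0.8, 0.6, 0.4, 0.2, -0.2, -0.4, -0.6, -0.8, -1.0]

# The same values in a shuffled order give a very different DW value,
# which is why the statistic is unusable for unordered spatial samples.
shuffled = [0.6, -1.0, 0.2, 0.8, -0.4, 1.0, -0.8, 0.4, -0.2, -0.6]
```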
Low typing endurance in keyboard workers with work-related upper limb disorder
Povlsen, Bo
2011-01-01
Objective To compare results of typing endurance and pain before and after a standardized functional test. Design A standardized, previously published typing test on a standard QWERTY keyboard. Setting An outpatient hospital environment. Participants Sixty-one keyboard- and mouse-operating patients with work-related upper limb disorder (WRULD) and six normal controls. Main outcome measure Pain severity before and after the test, typing endurance and speed were recorded. Results Thirty-two patients could not complete the test before pain reached VAS 5, and this group typed for a mean of only 11 minutes. The control group and the remaining group of 29 patients completed the test. A two-tailed Student's t-test was used for evaluation. Endurance was significantly shorter in the patient group that could not complete the test (P < 0.00001), and pain levels were also higher in this group both before (P = 0.01) and after the test (P = 0.0003). Both patient groups had more pain in the right than the left hand, both before and after typing. Conclusions Low typing endurance correlates statistically with more resting pain in keyboard and mouse operators with work-related upper limb disorder, and with statistically more pain after a standardized typing test. As the right hands had higher pain levels, typing alone may not be the cause of the pain, since the left hand on a QWERTY keyboard makes relatively more keystrokes than the right hand. PMID:21637395
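A two-sample t statistic of the kind used here can be sketched as follows. This uses the Welch form, which does not assume equal group variances (a common variation on the Student's t-test named in the abstract), and the VAS pain scores are invented.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch two-sample t statistic (no equal-variance assumption)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    return (mean(a) - mean(b)) / sqrt(va + vb)

# Hypothetical resting VAS pain scores: completers vs. non-completers
completers = [1.0, 2.0, 1.5, 0.5, 2.5, 1.0]
non_completers = [3.0, 4.5, 3.5, 5.0, 4.0, 3.0]
t = welch_t(completers, non_completers)
# |t| is then compared with the t distribution's critical value
# at the Welch-Satterthwaite degrees of freedom to get a p-value
```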
Decision Support Systems: Applications in Statistics and Hypothesis Testing.
ERIC Educational Resources Information Center
Olsen, Christopher R.; Bozeman, William C.
1988-01-01
Discussion of the selection of appropriate statistical procedures by educators highlights a study conducted to investigate the effectiveness of decision aids in facilitating the use of appropriate statistics. Experimental groups and a control group using a printed flow chart, a computer-based decision aid, and a standard text are described. (11…
Introducing Statistical Inference to Biology Students through Bootstrapping and Randomization
ERIC Educational Resources Information Center
Lock, Robin H.; Lock, Patti Frazer
2008-01-01
Bootstrap methods and randomization tests are increasingly being used as alternatives to standard statistical procedures in biology. They also serve as an effective introduction to the key ideas of statistical inference in introductory courses for biology students. We discuss the use of such simulation based procedures in an integrated curriculum…
A statistical approach to instrument calibration
Robert R. Ziemer; David Strauss
1978-01-01
Summary - It has been found that two instruments will yield different numerical values when used to measure identical points. A statistical approach is presented that can be used to approximate the error associated with the calibration of instruments. Included are standard statistical tests that can be used to determine if a number of successive calibrations of the...
Seol, Hyunsoo
2016-06-01
The purpose of this study was to apply the bootstrap procedure to evaluate how the bootstrapped confidence intervals (CIs) for polytomous Rasch fit statistics might differ according to sample sizes and test lengths in comparison with the rule-of-thumb critical value of misfit. A total of 25 simulated data sets were generated to fit the Rasch model, and then a total of 1,000 replications were conducted to compute the bootstrapped CIs under each of the 25 testing conditions. The results showed that rule-of-thumb critical values for assessing the magnitude of misfit were not applicable, because the infit and outfit mean square error statistics showed different magnitudes of variability over testing conditions and the standardized fit statistics did not exactly follow the standard normal distribution. Further, they also do not share the same critical range for item and person misfit. Based on the results of the study, the bootstrapped CIs can be used to identify misfitting items or persons, as they offer a reasonable alternative solution, especially when the distributions of the infit and outfit statistics are not well known and depend on sample size. © The Author(s) 2016.
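A percentile-bootstrap confidence interval of the kind the study computes can be sketched as follows. The fit-statistic values, function names, and resampling counts are illustrative only, not the study's.

```python
import random
from statistics import mean

def bootstrap_ci(data, stat=mean, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for an arbitrary statistic of the sample."""
    rng = random.Random(seed)
    boots = sorted(stat([rng.choice(data) for _ in data])
                   for _ in range(n_boot))
    lo = boots[int(n_boot * alpha / 2)]
    hi = boots[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

# Hypothetical infit mean-square values for one testing condition:
fit_stats = [0.84, 1.10, 0.95, 1.22, 1.05, 0.88, 1.30, 0.99, 1.07, 0.91]
lo, hi = bootstrap_ci(fit_stats)
# flag an item only if its fit statistic falls outside (lo, hi),
# instead of applying a fixed rule-of-thumb cutoff
```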
ENHANCING TEST SENSITIVITY IN TOXICITY TESTING BY USING A STATISTICAL PERFORMANCE STANDARD
Previous reports have shown that within-test sensitivity can vary markedly among laboratories. Experts have advocated an empirical approach to controlling test variability based on the MSD, control means, and other test acceptability criteria. (The MSD represents the smallest dif...
JAN transistor and diode characterization test program
NASA Technical Reports Server (NTRS)
Takeda, H.
1977-01-01
A statistical summary of electrical characterization was performed on JAN diodes and transistors. Parameters are presented with test conditions, mean, standard deviation, lowest reading, 10% point, 90% point and highest reading.
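The summary reported for each parameter (mean, standard deviation, lowest reading, 10% point, 90% point, highest reading) takes only a few lines to reproduce. The readings are invented, and the percentile rule here is the simple nearest-rank convention, which may differ from the one used in the original report.

```python
from statistics import mean, stdev

def characterize(readings):
    """Summary table row: mean, SD, extremes, and 10%/90% points."""
    xs = sorted(readings)
    n = len(xs)
    return {
        "mean": mean(xs),
        "std_dev": stdev(xs),
        "lowest": xs[0],
        "p10": xs[int(0.10 * (n - 1))],   # nearest-rank 10% point
        "p90": xs[int(0.90 * (n - 1))],   # nearest-rank 90% point
        "highest": xs[-1],
    }

# e.g. hypothetical measured gain readings for one lot at one test condition
summary = characterize([92, 100, 105, 98, 110, 95, 102, 99, 104, 97, 101])
```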
Goodpaster, Aaron M.; Kennedy, Michael A.
2015-01-01
Currently, no standard metrics are used to quantify cluster separation in PCA or PLS-DA scores plots for metabonomics studies or to determine if cluster separation is statistically significant. Lack of such measures makes it virtually impossible to compare independent or inter-laboratory studies and can lead to confusion in the metabonomics literature when authors putatively identify metabolites distinguishing classes of samples based on visual and qualitative inspection of scores plots that exhibit marginal separation. While previous papers have addressed quantification of cluster separation in PCA scores plots, none have advocated routine use of a quantitative measure of separation that is supported by a standard and rigorous assessment of whether or not the cluster separation is statistically significant. Here quantification and statistical significance of separation of group centroids in PCA and PLS-DA scores plots are considered. The Mahalanobis distance is used to quantify the distance between group centroids, and the two-sample Hotelling's T2 test is computed for the data, related to an F-statistic, and then an F-test is applied to determine if the cluster separation is statistically significant. We demonstrate the value of this approach using four datasets containing various degrees of separation, ranging from groups that had no apparent visual cluster separation to groups that had no visual cluster overlap. Widespread adoption of such concrete metrics to quantify and evaluate the statistical significance of PCA and PLS-DA cluster separation would help standardize reporting of metabonomics data. PMID:26246647
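A minimal two-dimensional sketch of the proposed metric: the Mahalanobis distance between group centroids, converted to a two-sample Hotelling's T² and then to an F statistic. It assumes equal group sizes and a given pooled covariance matrix, and all numbers are invented.

```python
def mahalanobis_sq(mu1, mu2, cov):
    """Squared Mahalanobis distance between two 2-D centroids,
    given a pooled 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]  # 2x2 matrix inverse
    dx = [mu1[0] - mu2[0], mu1[1] - mu2[1]]
    return sum(dx[i] * inv[i][j] * dx[j] for i in range(2) for j in range(2))

def hotelling_t2(n1, n2, d2, p=2):
    """Two-sample Hotelling's T^2 from the centroid distance, and the
    related F statistic with (p, n1 + n2 - p - 1) degrees of freedom."""
    t2 = (n1 * n2) / (n1 + n2) * d2
    f = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f

# hypothetical PCA-scores centroids and pooled covariance, 20 samples per group
d2 = mahalanobis_sq((0.0, 0.0), (1.5, 1.0), [[1.0, 0.2], [0.2, 1.0]])
t2, f = hotelling_t2(20, 20, d2)
# compare f against the F(p, n1 + n2 - p - 1) critical value for significance
```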
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kane, V.E.
1979-10-01
The standard maximum likelihood and moment estimation procedures are shown to have some undesirable characteristics for estimating the parameters in a three-parameter lognormal distribution. A class of goodness-of-fit estimators is found which provides a useful alternative to the standard methods. The class of goodness-of-fit tests considered include the Shapiro-Wilk and Shapiro-Francia tests which reduce to a weighted linear combination of the order statistics that can be maximized in estimation problems. The weighted-order statistic estimators are compared to the standard procedures in Monte Carlo simulations. Bias and robustness of the procedures are examined and example data sets analyzed including geochemical datamore » from the National Uranium Resource Evaluation Program.« less
Assessment issues in the testing of children at school entry.
Rock, Donald A; Stenner, A Jackson
2005-01-01
The authors introduce readers to the research documenting racial and ethnic gaps in school readiness. They describe the key tests, including the Peabody Picture Vocabulary Test (PPVT), the Early Childhood Longitudinal Study (ECLS), and several intelligence tests, and describe how they have been administered to several important national samples of children. Next, the authors review the different estimates of the gaps and discuss how to interpret these differences. In interpreting test results, researchers use the statistical term "standard deviation" to compare scores across the tests. On average, the tests find a gap of about 1 standard deviation. The ECLS-K estimate is the lowest, about half a standard deviation. The PPVT estimate is the highest, sometimes more than 1 standard deviation. When researchers adjust those gaps statistically to take into account different outside factors that might affect children's test scores, such as family income or home environment, the gap narrows but does not disappear. Why such different estimates of the gap? The authors consider explanations such as differences in the samples, racial or ethnic bias in the tests, and whether the tests reflect different aspects of school "readiness," and conclude that none is likely to explain the varying estimates. Another possible explanation is the Spearman Hypothesis: that all tests are imperfect measures of a general ability construct, g, and that the more highly a given test correlates with g, the larger the gap will be. But the Spearman Hypothesis, too, leaves questions to be investigated. A gap of 1 standard deviation may not seem large, but the authors show clearly how it results in striking disparities in the performance of black and white students and why it should be of serious concern to policymakers.
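The practical meaning of a gap expressed in standard deviations can be made concrete under a normality assumption (a simplification; real score distributions need not be normal):

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# If two groups' scores are normal with equal SDs and means 1 SD apart,
# this is the share of the lower-scoring group below the higher group's mean:
share_below = normal_cdf(1.0)        # about 0.84
# ...and with a 0.5 SD gap (the ECLS-K estimate):
share_below_half = normal_cdf(0.5)   # about 0.69
```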
Assessment of statistical significance and clinical relevance.
Kieser, Meinhard; Friede, Tim; Gondan, Matthias
2013-05-10
In drug development, it is well accepted that a successful study will demonstrate not only a statistically significant result but also a clinically relevant effect size. Whereas standard hypothesis tests are used to demonstrate the former, it is less clear how the latter should be established. In the first part of this paper, we consider the responder analysis approach and study the performance of locally optimal rank tests when the outcome distribution is a mixture of responder and non-responder distributions. We find that these tests are quite sensitive to their planning assumptions and therefore have no real advantage over standard tests such as the t-test and the Wilcoxon-Mann-Whitney test, which perform well overall and can be recommended for applications. In the second part, we present a new approach to the assessment of clinical relevance based on the so-called relative effect (or probabilistic index) and derive appropriate sample size formulae for the design of studies aiming at demonstrating both a statistically significant and clinically relevant effect. Referring to recent studies in multiple sclerosis, we discuss potential issues in the application of this approach. Copyright © 2012 John Wiley & Sons, Ltd.
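The relative effect (probabilistic index) has a simple empirical estimator: the proportion of treatment-control pairs in which the treatment observation is larger, counting ties as half; 0.5 corresponds to no effect. A sketch with invented ordinal scores:

```python
def relative_effect(x, y):
    """Empirical estimate of P(X < Y) + 0.5 * P(X = Y): the probability
    that a random observation under y exceeds one under x."""
    pairs = [(xi, yi) for xi in x for yi in y]
    wins = sum(1 for xi, yi in pairs if xi < yi)
    ties = sum(1 for xi, yi in pairs if xi == yi)
    return (wins + 0.5 * ties) / len(pairs)

control = [3, 5, 4, 6, 2]   # hypothetical ordinal outcome scores
treated = [5, 7, 6, 8, 4]
effect = relative_effect(control, treated)   # well above 0.5 here
```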
Leishmania infection: laboratory diagnosing in the absence of a "gold standard".
Rodríguez-Cortés, Alhelí; Ojeda, Ana; Francino, Olga; López-Fuertes, Laura; Timón, Marcos; Alberola, Jordi
2010-02-01
There is no gold standard for diagnosing leishmaniases. Our aim was to assess the operative validity of tests used in detecting Leishmania infection using samples from experimental infections, a reliable equivalent to the classic definition of gold standard. Without statistical differences, the highest sensitivity was achieved by protein A (ProtA), immunoglobulin (Ig)G2, indirect fluorescence antibody test (IFAT), lymphocyte proliferation assay, quantitative real-time polymerase chain reaction of bone marrow (qPCR-BM), qPCR-Blood, and IgG; and the highest specificity by IgG1, IgM, IgA, qPCR-Blood, IgG, IgG2, and qPCR-BM. Maximum positive predictive value was obtained simultaneously by IgG2, qPCR-Blood, and IgG; and maximum negative predictive value by qPCR-BM. The best positive and negative likelihood ratios were obtained by IgG2. The test having the greatest, statistically significant, area under the receiver operating characteristics curve was the IgG2 enzyme-linked immunosorbent assay (ELISA). Thus, according to the gold standard used, IFAT and qPCR are far from fulfilling the requirements to be considered gold standards, and the test showing the highest potential to detect Leishmania infection is Leishmania-specific ELISA IgG2.
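The operative-validity measures compared in the study (sensitivity, specificity, predictive values, likelihood ratios) all derive from a 2x2 table of test results against the reference standard. The counts below are hypothetical, not the study's data.

```python
def diagnostic_stats(tp, fp, fn, tn):
    """Accuracy measures from a 2x2 table vs. the reference standard."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
        "lr_pos": sens / (1 - spec),    # positive likelihood ratio
        "lr_neg": (1 - sens) / spec,    # negative likelihood ratio
    }

# hypothetical counts for one serological test vs. experimental infection status
stats = diagnostic_stats(tp=45, fp=3, fn=5, tn=47)
```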
Østergaard, Mia L; Nielsen, Kristina R; Albrecht-Beste, Elisabeth; Konge, Lars; Nielsen, Michael B
2018-01-01
This study aimed to develop a test with validity evidence for abdominal diagnostic ultrasound, with a pass/fail standard to facilitate mastery learning. The simulator had 150 real-life patient abdominal scans, of which 15 cases with 44 findings were selected, representing level 1 from The European Federation of Societies for Ultrasound in Medicine and Biology. Four groups of experience levels were constructed: novices (medical students), trainees (first-year radiology residents), intermediates (third- to fourth-year radiology residents) and advanced (physicians with an ultrasound fellowship). Participants were tested in a standardized setup and scored by two blinded reviewers prior to an item analysis. The item analysis excluded 14 diagnoses. Both internal consistency (Cronbach's alpha 0.96) and inter-rater reliability (0.99) were good, and there were statistically significant differences (p < 0.001) between all four groups, except the intermediate and advanced groups (p = 1.0). There was a statistically significant correlation between experience and test scores (Pearson's r = 0.82, p < 0.001). The pass/fail standard failed all novices (no false positives) and passed all advanced (no false negatives). All intermediate participants and six out of 14 trainees passed. We developed a test for diagnostic abdominal ultrasound with solid validity evidence and a pass/fail standard without any false-positive or false-negative scores. • Ultrasound training can benefit from competency-based education based on reliable tests. • This simulation-based test can differentiate between competency levels of ultrasound examiners. • This test is suitable for competency-based education, e.g. mastery learning. • We provide a pass/fail standard without false-negative or false-positive scores.
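The internal-consistency figure reported (Cronbach's alpha) is computed from the item variances and the variance of the total score. A sketch with invented per-case scores, not the study's data:

```python
from statistics import pvariance

def cronbach_alpha(item_scores):
    """Cronbach's alpha. item_scores: one list of per-participant scores
    per item; alpha = k/(k-1) * (1 - sum(item variances)/variance(totals))."""
    k = len(item_scores)
    totals = [sum(col) for col in zip(*item_scores)]
    item_var = sum(pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / pvariance(totals))

# hypothetical scores on three test items for five participants
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 5, 1],
]
alpha = cronbach_alpha(items)   # high: items rank participants consistently
```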
An Independent Filter for Gene Set Testing Based on Spectral Enrichment.
Frost, H Robert; Li, Zhigang; Asselbergs, Folkert W; Moore, Jason H
2015-01-01
Gene set testing has become an indispensable tool for the analysis of high-dimensional genomic data. An important motivation for testing gene sets, rather than individual genomic variables, is to improve statistical power by reducing the number of tested hypotheses. Given the dramatic growth in common gene set collections, however, testing is often performed with nearly as many gene sets as underlying genomic variables. To address the challenge to statistical power posed by large gene set collections, we have developed spectral gene set filtering (SGSF), a novel technique for independent filtering of gene set collections prior to gene set testing. The SGSF method uses as a filter statistic the p-value measuring the statistical significance of the association between each gene set and the sample principal components (PCs), taking into account the significance of the associated eigenvalues. Because this filter statistic is independent of standard gene set test statistics under the null hypothesis but dependent under the alternative, the proportion of enriched gene sets is increased without impacting the type I error rate. As shown using simulated and real gene expression data, the SGSF algorithm accurately filters gene sets unrelated to the experimental outcome resulting in significantly increased gene set testing power.
Xue, Xiaonan; Kim, Mimi Y; Castle, Philip E; Strickler, Howard D
2014-03-01
Studies to evaluate clinical screening tests often face the problem that the "gold standard" diagnostic approach is costly and/or invasive. It is therefore common to verify only a subset of negative screening tests using the gold standard method. However, undersampling the screen negatives can lead to substantial overestimation of the sensitivity and underestimation of the specificity of the diagnostic test. Our objective was to develop a simple and accurate statistical method to address this "verification bias." We developed a weighted generalized estimating equation approach to estimate, in a single model, the accuracy (e.g., sensitivity/specificity) of multiple assays and simultaneously compare results between assays while addressing verification bias. This approach can be implemented using standard statistical software. Simulations were conducted to assess the proposed method. An example is provided using a cervical cancer screening trial that compared the accuracy of human papillomavirus and Pap tests, with histologic data as the gold standard. The proposed approach performed well in estimating and comparing the accuracy of multiple assays in the presence of verification bias. The proposed approach is an easy to apply and accurate method for addressing verification bias in studies of multiple screening methods. Copyright © 2014 Elsevier Inc. All rights reserved.
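One standard way to see the weighting idea: upweight each verified subject by the inverse of their verification probability before forming sensitivity and specificity. This sketch is a generic inverse-probability-weighted correction, not the authors' full generalized estimating equation model, and all counts are hypothetical.

```python
def corrected_accuracy(verified_counts, verification_probs):
    """Inverse-probability-weighted sensitivity/specificity when only a
    fraction of screen negatives receive the gold standard.
    verified_counts: {(test_result, disease): n among the verified subset}
    verification_probs: {test_result: P(verified | test_result)}"""
    w = {(t, d): n / verification_probs[t]
         for (t, d), n in verified_counts.items()}
    sens = w[(1, 1)] / (w[(1, 1)] + w[(0, 1)])
    spec = w[(0, 0)] / (w[(0, 0)] + w[(1, 0)])
    return sens, spec

# hypothetical: all screen positives verified, only 10% of screen negatives
counts = {(1, 1): 40, (1, 0): 60, (0, 1): 2, (0, 0): 88}
sens, spec = corrected_accuracy(counts, {1: 1.0, 0: 0.10})

# the naive estimate ignores the undersampling and overestimates sensitivity
naive_sens = 40 / (40 + 2)
```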
Empirical likelihood-based tests for stochastic ordering
BARMI, HAMMOU EL; MCKEAGUE, IAN W.
2013-01-01
This paper develops an empirical likelihood approach to testing for the presence of stochastic ordering among univariate distributions based on independent random samples from each distribution. The proposed test statistic is formed by integrating a localized empirical likelihood statistic with respect to the empirical distribution of the pooled sample. The asymptotic null distribution of this test statistic is found to have a simple distribution-free representation in terms of standard Brownian bridge processes. The approach is used to compare the reign lengths of Roman emperors over various historical periods, including the “decline and fall” phase of the empire. In a simulation study, the power of the proposed test is found to improve substantially upon that of a competing test due to El Barmi and Mukerjee. PMID:23874142
2011-01-01
Background Clinical researchers have often preferred to use a fixed effects model for the primary interpretation of a meta-analysis. Heterogeneity is usually assessed via the well known Q and I2 statistics, along with the random effects estimate they imply. In recent years, alternative methods for quantifying heterogeneity have been proposed, that are based on a 'generalised' Q statistic. Methods We review 18 IPD meta-analyses of RCTs into treatments for cancer, in order to quantify the amount of heterogeneity present and also to discuss practical methods for explaining heterogeneity. Results Differing results were obtained when the standard Q and I2 statistics were used to test for the presence of heterogeneity. The two meta-analyses with the largest amount of heterogeneity were investigated further, and on inspection the straightforward application of a random effects model was not deemed appropriate. Compared to the standard Q statistic, the generalised Q statistic provided a more accurate platform for estimating the amount of heterogeneity in the 18 meta-analyses. Conclusions Explaining heterogeneity via the pre-specification of trial subgroups, graphical diagnostic tools and sensitivity analyses produced a more desirable outcome than an automatic application of the random effects model. Generalised Q statistic methods for quantifying and adjusting for heterogeneity should be incorporated as standard into statistical software. Software is provided to help achieve this aim. PMID:21473747
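The standard Q and I² statistics discussed here (as opposed to the 'generalised' Q variant) are computed directly from the study estimates and their standard errors. A minimal sketch of the textbook formulas, with the function name assumed:

```python
def q_and_i2(estimates, std_errors):
    """Cochran's Q and Higgins' I-squared for k study estimates.

    Q  : inverse-variance-weighted squared deviations from the
         fixed-effect pooled estimate; under homogeneity it is
         approximately chi-squared with k-1 degrees of freedom.
    I2 : percentage of total variation attributable to heterogeneity,
         computed as max(0, (Q - df) / Q) * 100.
    """
    w = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

For three estimates 0.1, 0.3, 0.5 with equal standard errors of 0.1, this gives Q = 8 on 2 degrees of freedom and I² = 75%.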
OSPAR standard method and software for statistical analysis of beach litter data.
Schulz, Marcus; van Loon, Willem; Fleet, David M; Baggelaar, Paul; van der Meulen, Eit
2017-09-15
The aim of this study is to develop standard statistical methods and software for the analysis of beach litter data. The optimal ensemble of statistical methods comprises the Mann-Kendall trend test, the Theil-Sen slope estimation, the Wilcoxon step trend test and basic descriptive statistics. The application of Litter Analyst, a tailor-made software for analysing the results of beach litter surveys, to OSPAR beach litter data from seven beaches bordering on the south-eastern North Sea, revealed 23 significant trends in the abundances of beach litter types for the period 2009-2014. Litter Analyst revealed a large variation in the abundance of litter types between beaches. To reduce the effects of spatial variation, trend analysis of beach litter data can most effectively be performed at the beach or national level. Spatial aggregation of beach litter data within a region is possible, but resulted in a considerable reduction in the number of significant trends. Copyright © 2017 Elsevier Ltd. All rights reserved.
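Two methods in the ensemble described above have compact textbook forms. The sketch below gives the Mann-Kendall S statistic and the Theil-Sen slope in their basic forms — not the Litter Analyst implementation, which may differ in its handling of ties and significance:

```python
import statistics

def mann_kendall_s(series):
    """Mann-Kendall S: number of increasing minus decreasing pairs.
    Positive S suggests an upward trend, negative a downward one."""
    n = len(series)
    return sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n) for j in range(i + 1, n)
    )

def theil_sen_slope(times, values):
    """Theil-Sen estimator: median of all pairwise slopes.
    Robust to outliers, unlike the least-squares slope."""
    slopes = [
        (values[j] - values[i]) / (times[j] - times[i])
        for i in range(len(times)) for j in range(i + 1, len(times))
        if times[j] != times[i]
    ]
    return statistics.median(slopes)
```

A strictly increasing series of length 4 gives S = 6 (all pairs increasing), and a perfectly linear series recovers its slope exactly.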
Duarte, Ida Alzira Gomes; Tanaka, Greta Merie; Suzuki, Nathalie Mie; Lazzarini, Rosana; Lopes, Andressa Sato de Aquino; Volpini, Beatrice Mussio Fornazier; Castro, Paulo Carrara de
2013-01-01
A retrospective study was carried out between 2006 and 2011. Six hundred and eighteen patients with suspected allergic contact dermatitis underwent the standard patch test series recommended by the Brazilian Contact Dermatitis Research Group. The aim of our study was to evaluate the year-by-year variation in positive patch-test results from the standard series. The most frequently positive allergens were nickel sulfate, thimerosal and potassium bichromate. The decrease in positive patch-test results over the years was statistically significant for lanolin (p=0.01), neomycin (p=0.01) and anthraquinone (p=0.04). A follow-up study would be useful in determining which allergens could be excluded from the standard series, as they may represent a low sensitization risk.
The effect of rare variants on inflation of the test statistics in case-control analyses.
Pirie, Ailith; Wood, Angela; Lush, Michael; Tyrer, Jonathan; Pharoah, Paul D P
2015-02-20
The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself particularly in the analysis of rare variants. We compared the properties of the three most commonly used association tests: the likelihood ratio test, the Wald test and the score test when testing rare variants for association using simulated data. We found evidence of inflation in the median test statistics of the likelihood ratio and score tests for tests of variants with less than 20 heterozygotes across the sample, regardless of the total sample size. The test statistics for the Wald test were under-inflated at the median for variants below the same minor allele frequency. In a genetic association study, if a substantial proportion of the genetic variants tested have rare minor allele frequencies, the properties of the association test may mask the presence or absence of bias due to population structure. The use of either the likelihood ratio test or the score test is likely to lead to inflation in the median test statistic in the absence of population structure. In contrast, the use of the Wald test is likely to result in under-inflation of the median test statistic which may mask the presence of population structure.
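The inflation measure described here — the ratio of the observed median association test statistic to its expected value — is the genomic inflation factor λ. A minimal sketch for 1-df chi-squared statistics; the expected-median constant is the standard median of the chi-squared(1) distribution (≈0.4549):

```python
import statistics

# Median of the chi-squared distribution with 1 degree of freedom.
CHI2_1_MEDIAN = 0.4549364

def inflation_factor(chi2_stats):
    """Genomic inflation factor lambda for 1-df association statistics.
    Values near 1 suggest no inflation; values above 1 may indicate
    population structure -- or, as this paper shows, properties of the
    association test itself when many variants are rare."""
    return statistics.median(chi2_stats) / CHI2_1_MEDIAN
```

If the observed statistics are centered exactly at the null median, λ = 1; the paper's point is that for rare variants the likelihood ratio and score tests push λ above 1, and the Wald test below 1, even with no structure present.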
Assessing cultural validity in standardized tests in STEM education
NASA Astrophysics Data System (ADS)
Gassant, Lunes
This quantitative ex post facto study examined how race and gender, as elements of culture, influence the development of common misconceptions among STEM students. Primary data came from a standardized test: the Digital Logic Concept Inventory (DLCI), developed by Drs. Geoffrey L. Herman, Michael C. Louis, and Craig Zilles from the University of Illinois at Urbana-Champaign. The sample consisted of a cohort of 82 STEM students recruited from three universities in Northern Louisiana. Microsoft Excel and the Statistical Package for the Social Sciences (SPSS) were used for data computation. Two key concepts, several sub-concepts, and 19 misconceptions were tested through 11 items in the DLCI. Statistical analyses based on both Classical Test Theory (Spearman, 1904) and Item Response Theory (Lord, 1952) yielded similar results: some misconceptions in the DLCI can reliably be predicted by the race or gender of the test taker. The research is significant because it showed that some misconceptions in a STEM discipline occurred at different rates among students of similar ethnic backgrounds, pointing to cultural bias in the standardized test. The study therefore encourages further research into the cultural validity of standardized tests. With culturally valid tests, it will be possible to increase the effectiveness of targeted teaching and learning strategies for STEM students from diverse ethnic backgrounds. To some extent, this dissertation has contributed to a better understanding of the gap between high enrollment rates and low graduation rates among African American students and other minority students in STEM disciplines.
NASA Astrophysics Data System (ADS)
Wang, Hao; Wang, Qunwei; He, Ming
2018-05-01
To investigate and improve the state of water-content detection in liquid chemical reagents in domestic laboratories, proficiency testing provider PT0031 (CNAS) organized a proficiency testing program for water content in toluene, in which 48 laboratories from 18 provinces and municipalities took part. This paper describes the implementation of the program, including sample preparation, homogeneity and stability testing, and the statistical analysis of results using an iterative robust statistical technique. It also summarizes and analyses the different test standards widely used in the participating laboratories and puts forward technical suggestions for improving the quality of water-content testing. Satisfactory results were obtained by 43 laboratories, amounting to 89.6% of the participants.
Yang, Hyeri; Na, Jihye; Jang, Won-Hee; Jung, Mi-Sook; Jeon, Jun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Lim, Kyung-Min; Bae, SeungJin
2015-05-05
The mouse local lymph node assay (LLNA, OECD TG429) is an alternative that replaces the conventional guinea pig tests (OECD TG406) for skin sensitization testing, but its use of a radioisotopic agent, (3)H-thymidine, deters its active dissemination. A new non-radioisotopic LLNA, LLNA:BrdU-FCM, employs a non-radioisotopic analog, 5-bromo-2'-deoxyuridine (BrdU), and flow cytometry. For an analogous method, the OECD TG429 performance standard (PS) advises that two reference compounds be tested repeatedly and that the ECt (threshold) values obtained fall within acceptable ranges to prove within- and between-laboratory reproducibility. However, these criteria are somewhat arbitrary, and the sample size for ECt is less than five, raising concerns about insufficient reliability. Here, we explored various statistical methods to evaluate the reproducibility of LLNA:BrdU-FCM with the stimulation index (SI), the raw data for ECt calculation, produced by 3 laboratories. Descriptive statistics along with graphical representation of SI are presented. For inferential statistics, parametric and non-parametric methods were applied to test the reproducibility of the SI of a concurrent positive control, and the robustness of the results was investigated. Descriptive statistics and graphical representation of SI alone could illustrate the within- and between-laboratory reproducibility. Inferential statistics employing parametric and nonparametric methods drew similar conclusions. While all labs passed the within- and between-laboratory reproducibility criteria given by the OECD TG429 PS based on ECt values, statistical evaluation based on SI values showed that only two labs succeeded in achieving within-laboratory reproducibility. For the two labs that satisfied within-laboratory reproducibility, between-laboratory reproducibility could also be attained based on inferential as well as descriptive statistics. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
JAN transistor and diode characterization test program, JANTX diode 1N5619
NASA Technical Reports Server (NTRS)
Takeda, H.
1977-01-01
A statistical summary of electrical characterization was performed on JANTX 1N5619 silicon diodes. Parameters are presented with test conditions, mean, standard deviation, lowest reading, 10% point, 90% point, and highest reading.
Labor Productivity Standards in Texas School Foodservice Operations
ERIC Educational Resources Information Center
Sherrin, A. Rachelle; Bednar, Carolyn; Kwon, Junehee
2009-01-01
Purpose: Purpose of this research was to investigate utilization of labor productivity standards and variables that affect productivity in Texas school foodservice operations. Methods: A questionnaire was developed, validated, and pilot tested, then mailed to 200 randomly selected Texas school foodservice directors. Descriptive statistics for…
Teacher Effects, Value-Added Models, and Accountability
ERIC Educational Resources Information Center
Konstantopoulos, Spyros
2014-01-01
Background: In the last decade, the effects of teachers on student performance (typically manifested as state-wide standardized tests) have been re-examined using statistical models that are known as value-added models. These statistical models aim to compute the unique contribution of the teachers in promoting student achievement gains from grade…
An ROC-type measure of diagnostic accuracy when the gold standard is continuous-scale.
Obuchowski, Nancy A
2006-02-15
ROC curves and summary measures of accuracy derived from them, such as the area under the ROC curve, have become the standard for describing and comparing the accuracy of diagnostic tests. Methods for estimating ROC curves rely on the existence of a gold standard which dichotomizes patients into disease present or absent. There are, however, many examples of diagnostic tests whose gold standards are not binary-scale, but rather continuous-scale. Unnatural dichotomization of these gold standards leads to bias and inconsistency in estimates of diagnostic accuracy. In this paper, we propose a non-parametric estimator of diagnostic test accuracy which does not require dichotomization of the gold standard. This estimator has an interpretation analogous to the area under the ROC curve. We propose a confidence interval for test accuracy and a statistical test for comparing accuracies of tests from paired designs. We compare the performance (i.e. CI coverage, type I error rate, power) of the proposed methods with several alternatives. An example is presented where the accuracies of two quick blood tests for measuring serum iron concentrations are estimated and compared.
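The AUC-analogous accuracy measure for a continuous-scale gold standard can be illustrated with a simple concordance probability: the fraction of patient pairs that the diagnostic test orders the same way as the gold standard. This sketch conveys the idea only; it is not Obuchowski's exact estimator, which also provides a variance and confidence interval:

```python
def concordance(test_vals, gold_vals):
    """P(test orders a random patient pair the same way as the gold
    standard). 1.0 = perfect agreement in ranking, 0.5 = chance."""
    n = len(test_vals)
    pairs = concordant = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            if gold_vals[i] == gold_vals[j]:
                continue  # skip pairs the gold standard cannot order
            pairs += 1
            prod = (test_vals[i] - test_vals[j]) * (gold_vals[i] - gold_vals[j])
            if prod > 0:
                concordant += 1      # same ordering
            elif prod == 0:
                concordant += 0.5    # tied test values count half
    return concordant / pairs
```

When the gold standard is binary, this reduces to the familiar trapezoidal area under the ROC curve.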
Leishmania Infection: Laboratory Diagnosing in the Absence of a “Gold Standard”
Rodríguez-Cortés, Alhelí; Ojeda, Ana; Francino, Olga; López-Fuertes, Laura; Timón, Marcos; Alberola, Jordi
2010-01-01
There is no gold standard for diagnosing leishmaniases. Our aim was to assess the operative validity of tests used in detecting Leishmania infection using samples from experimental infections, a reliable equivalent to the classic definition of gold standard. Without statistical differences, the highest sensitivity was achieved by protein A (ProtA), immunoglobulin (Ig)G2, indirect fluorescence antibody test (IFAT), lymphocyte proliferation assay, quantitative real-time polymerase chain reaction of bone marrow (qPCR-BM), qPCR-Blood, and IgG; and the highest specificity by IgG1, IgM, IgA, qPCR-Blood, IgG, IgG2, and qPCR-BM. Maximum positive predictive value was obtained simultaneously by IgG2, qPCR-Blood, and IgG; and maximum negative predictive value by qPCR-BM. Best positive and negative likelihood ratios were obtained by IgG2. The test having the greatest, statistically significant, area under the receiver operating characteristics curve was IgG2 enzyme-linked immunosorbent assay (ELISA). Thus, according to the gold standard used, IFAT and qPCR are far from fulfilling the requirements to be considered gold standards, and the test showing the highest potential to detect Leishmania infection is Leishmania-specific ELISA IgG2. PMID:20134001
A comparison of five serological tests for bovine brucellosis.
Dohoo, I R; Wright, P F; Ruckerbauer, G M; Samagh, B S; Robertson, F J; Forbes, L B
1986-01-01
Five serological assays: the buffered plate antigen test, the standard tube agglutination test, the complement fixation test, the hemolysis-in-gel test and the indirect enzyme immunoassay were diagnostically evaluated. Test data consisted of results from 1208 cattle in brucellosis-free herds, 1578 cattle in reactor herds of unknown infection status and 174 cattle from which Brucella abortus had been cultured. The complement fixation test had the highest specificity in both nonvaccinated and vaccinated cattle. The indirect enzyme immunoassay, if interpreted at a high threshold, also exhibited a high specificity in both groups of cattle. The hemolysis-in-gel test had a very high specificity when used in nonvaccinated cattle but quite a low specificity among vaccinates. With the exception of the complement fixation test, all tests had high sensitivities if interpreted at the minimum threshold. However, the sensitivities of the standard tube agglutination test and indirect enzyme immunoassay, when interpreted at high thresholds, were comparable to that of the complement fixation test. A kappa statistic was used to measure the agreement between the various tests. In general the kappa statistics were quite low, suggesting that the various tests may detect different antibody isotypes. There was, however, good agreement between the buffered plate antigen test and standard tube agglutination test (the two agglutination tests evaluated) and between the complement fixation test and the indirect enzyme immunoassay when interpreted at a high threshold. With the exception of the buffered plate antigen test, all tests were evaluated as confirmatory tests by estimating their specificity and sensitivity on screening-test positive samples.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3539295
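The kappa statistic used here for pairwise test agreement is chance-corrected agreement. A minimal sketch of Cohen's kappa from a square cross-classification of two tests; the counts in the example are hypothetical, not the study's data:

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table.
    table[i][j] = number of samples scored category i by test A
    and category j by test B.
    kappa = (observed agreement - chance agreement) / (1 - chance)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    p_obs = sum(table[i][i] for i in range(k)) / n
    row = [sum(table[i][j] for j in range(k)) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_chance = sum(row[i] * col[i] for i in range(k))
    return (p_obs - p_chance) / (1 - p_chance)
```

For a hypothetical 2x2 table [[20, 5], [10, 15]] the observed agreement is 0.70, chance agreement is 0.50, and kappa is 0.40 — "fair to moderate" agreement of the kind the abstract describes.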
Reducing animal experimentation in foot-and-mouth disease vaccine potency tests.
Reeve, Richard; Cox, Sarah; Smitsaart, Eliana; Beascoechea, Claudia Perez; Haas, Bernd; Maradei, Eduardo; Haydon, Daniel T; Barnett, Paul
2011-07-26
The World Organisation for Animal Health (OIE) Terrestrial Manual and the European Pharmacopoeia (EP) still prescribe live challenge experiments for foot-and-mouth disease virus (FMDV) immunogenicity and vaccine potency tests. However, the EP allows for other validated tests for the latter, and specifically in vitro tests if a "satisfactory pass level" has been determined; serological replacements are also currently in use in South America. Much research has therefore focused on validating both ex vivo and in vitro tests to replace live challenge. However, insufficient attention has been given to the sensitivity and specificity of the "gold standard" in vivo test being replaced, despite this information being critical to determining what should be required of its replacement. This paper aims to redress this imbalance by examining the current live challenge tests and their associated statistics and determining the confidence that we can have in them, thereby setting a standard for candidate replacements. It determines that the statistics associated with the current EP PD(50) test are inappropriate given our domain knowledge, but that the OIE test statistics are satisfactory. However, it has also identified a new set of live animal challenge test regimes that provide similar sensitivity and specificity to all of the currently used OIE tests using fewer animals (16 including controls), and can also provide further savings in live animal experiments in exchange for small reductions in sensitivity and specificity. Copyright © 2011 Elsevier Ltd. All rights reserved.
The Statistical Loop Analyzer (SLA)
NASA Technical Reports Server (NTRS)
Lindsey, W. C.
1985-01-01
The statistical loop analyzer (SLA) is designed to automatically measure the acquisition, tracking and frequency stability performance characteristics of symbol synchronizers, code synchronizers, carrier tracking loops, and coherent transponders. Automated phase lock and system level tests can also be made using the SLA. Standard baseband, carrier and spread spectrum modulation techniques can be accommodated. Through the SLA's phase error jitter and cycle slip measurements the acquisition and tracking thresholds of the unit under test are determined; any false phase and frequency lock events are statistically analyzed and reported in the SLA output in probabilistic terms. Automated signal drop out tests can be performed in order to troubleshoot algorithms and evaluate the reacquisition statistics of the unit under test. Cycle slip rates and cycle slip probabilities can be measured using the SLA. These measurements, combined with bit error probability measurements, are all that are needed to fully characterize the acquisition and tracking performance of a digital communication system.
A Comparison of Student Understanding of Seasons Using Inquiry and Didactic Teaching Methods
NASA Astrophysics Data System (ADS)
Ashcraft, Paul G.
2006-02-01
Student performance on open-ended questions concerning seasons in a university physical science content course was examined to note differences between classes that experienced inquiry using a 5-E lesson planning model and those that experienced the same content with a traditional, didactic lesson. The class examined is a required content course for elementary education majors, and understanding the seasons is included in the state's elementary science standards. The two self-selected groups of students showed no statistically significant differences in pre-test scores, while there were statistically significant differences between the groups' post-test scores, with those who participated in inquiry-based activities scoring higher. There were no statistically significant differences between the pre-test and the post-test for the students who experienced didactic teaching, while there were statistically significant improvements for the students who experienced the 5-E lesson.
NASA Astrophysics Data System (ADS)
Lehmann, Thomas M.
2002-05-01
Reliable evaluation of medical image processing is of major importance for routine applications. Nonetheless, evaluation is often omitted or methodically defective when novel approaches or algorithms are introduced. Adopted from medical diagnosis, we define the following criteria to classify reference standards: 1. Reliance, if the generation or capturing of test images for evaluation follows an exactly determined and reproducible protocol. 2. Equivalence, if the image material or relationships considered within an algorithmic reference standard equal real-life data with respect to structure, noise, or other parameters of importance. 3. Independence, if any reference standard relies on a different procedure than that to be evaluated, or on other images or image modalities than those used routinely. This criterion bans the simultaneous use of one image for both the training and the test phase. 4. Relevance, if the algorithm to be evaluated is self-reproducible. If random parameters or optimization strategies are applied, reliability of the algorithm must be shown before the reference standard is applied for evaluation. 5. Significance, if the number of reference standard images used for evaluation is sufficiently large to enable statistically founded analysis. We demand that a true gold standard satisfy Criteria 1 to 3. Any standard satisfying only two criteria, i.e., Criteria 1 and 2 or Criteria 1 and 3, is referred to as a silver standard. Other standards are termed plastic. Before exhaustive evaluation based on gold or silver standards is performed, its relevance must be shown (Criterion 4) and sufficient tests must be carried out to support statistical analysis (Criterion 5). In this paper, examples are given for each class of reference standards.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Singh, Gyanender P.; Gonczy, Steve T.; Deck, Christian P.
An interlaboratory round robin study was conducted on the tensile strength of SiC–SiC ceramic matrix composite (CMC) tubular test specimens at room temperature with the objective of expanding the database of mechanical properties of nuclear grade SiC–SiC and establishing the precision and bias statement for standard test method ASTM C1773. The mechanical properties statistics from the round robin study and the precision statistics and precision statement are presented herein. The data show reasonable consistency across the laboratories, indicating that the current C1773–13 ASTM standard is adequate for testing ceramic fiber reinforced ceramic matrix composite tubular test specimens. Furthermore, it was found that the distribution of ultimate tensile strength data was best described with a two-parameter Weibull distribution, while a lognormal distribution provided a good description of the distribution of proportional limit stress data.
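A two-parameter Weibull distribution of the kind fitted to the ultimate tensile strength data can be estimated from a strength sample by median-rank regression on the linearized CDF, ln(-ln(1-F)) = m·ln(x) - m·ln(η). This is a common convention for strength data, not necessarily the estimation method used in the study; the plotting-position formula (Benard's approximation) is likewise an assumption:

```python
import math

def weibull_fit(strengths):
    """Estimate Weibull shape m and scale eta by least squares on
    ln(-ln(1 - F)) = m*ln(x) - m*ln(eta), using Benard's median
    ranks F_i = (i - 0.3) / (n + 0.4) as plotting positions."""
    xs = sorted(strengths)
    n = len(xs)
    lx = [math.log(x) for x in xs]
    ly = [math.log(-math.log(1 - (i + 0.7) / (n + 0.4)))
          for i in range(n)]  # i is 0-based, so rank = i + 1
    mx, my = sum(lx) / n, sum(ly) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(lx, ly))
             / sum((a - mx) ** 2 for a in lx))
    intercept = my - slope * mx
    shape = slope
    scale = math.exp(-intercept / slope)
    return shape, scale
```

Feeding the fit exact Weibull quantiles recovers the generating parameters, which is a quick self-check before applying it to real strength data.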
High-Throughput Nanoindentation for Statistical and Spatial Property Determination
NASA Astrophysics Data System (ADS)
Hintsala, Eric D.; Hangen, Ude; Stauffer, Douglas D.
2018-04-01
Standard nanoindentation tests are "high throughput" compared to nearly all other mechanical tests, such as tension or compression. However, the typical rates of tens of tests per hour can be significantly improved. These higher testing rates enable otherwise impractical studies requiring several thousands of indents, such as high-resolution property mapping and detailed statistical studies. However, care must be taken to avoid systematic errors in the measurement, including the choice of indentation depth and spacing to avoid overlap of plastic zones, pileup, and influence of neighboring microstructural features in the material being tested. Furthermore, since fast loading rates are required, the strain rate sensitivity must also be considered. A review of these effects is given, with the emphasis placed on making complementary standard nanoindentation measurements to address these issues. Experimental applications of the technique, including mapping of welds, microstructures, and composites with varying length scales, along with studying the effect of surface roughness on nominally homogeneous specimens, will be presented.
Confidence Limits for the Indirect Effect: Distribution of the Product and Resampling Methods
ERIC Educational Resources Information Center
MacKinnon, David P.; Lockwood, Chondra M.; Williams, Jason
2004-01-01
The most commonly used method to test an indirect effect is to divide the estimate of the indirect effect by its standard error and compare the resulting z statistic with a critical value from the standard normal distribution. Confidence limits for the indirect effect are also typically based on critical values from the standard normal…
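The test described — dividing the estimated indirect effect ab by its standard error and comparing the z statistic to the standard normal — is the first-order delta-method (Sobel) test. A minimal sketch with illustrative, hypothetical numbers:

```python
import math

def sobel_z(a, se_a, b, se_b):
    """z statistic for the indirect effect a*b, using the first-order
    delta-method standard error sqrt(a^2*se_b^2 + b^2*se_a^2).
    Compared against +-1.96 for a two-sided 5% test."""
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return (a * b) / se_ab
```

For a = 0.5 (SE 0.1) and b = 0.4 (SE 0.1), z ≈ 3.12, which would exceed the 1.96 critical value. The product a·b is not normally distributed in small samples, which is precisely why the article proposes distribution-of-the-product and resampling alternatives to this normal-theory interval.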
Black Male Labor Force Participation.
ERIC Educational Resources Information Center
Baer, Roger K.
This study attempts to test (via multiple regression analysis) hypothesized relationships between designated independent variables and age specific incidences of labor force participation for black male subpopulations in 54 Standard Metropolitan Statistical Areas. Leading independent variables tested include net migration, earnings, unemployment,…
Statistical testing of baseline differences in sports medicine RCTs: a systematic evaluation.
Peterson, Ross L; Tran, Matthew; Koffel, Jonathan; Stovitz, Steven D
2017-01-01
The CONSORT (Consolidated Standards of Reporting Trials) statement discourages reporting statistical tests of baseline differences between groups in randomised controlled trials (RCTs). However, this practice is still common in many medical fields. Our aim was to determine the prevalence of this practice in leading sports medicine journals. We conducted a comprehensive search in Medline through PubMed to identify RCTs published in the years 2005 and 2015 from 10 high-impact sports medicine journals. Two reviewers independently confirmed the trial design and reached consensus on which articles contained statistical tests of baseline differences. Our search strategy identified a total of 324 RCTs, with 85 from the year 2005 and 239 from the year 2015. Overall, 64.8% of studies (95% CI (59.6, 70.0)) reported statistical tests of baseline differences; broken down by year, this percentage was 67.1% in 2005 (95% CI (57.1, 77.1)) and 64.0% in 2015 (95% CI (57.9, 70.1)). Although discouraged by the CONSORT statement, statistical testing of baseline differences remains highly prevalent in sports medicine RCTs. Statistical testing of baseline differences can mislead authors; for example, by failing to identify meaningful baseline differences in small studies. Journals that ask authors to follow the CONSORT statement guidelines should recognise that many manuscripts are ignoring the recommendation against statistical testing of baseline differences.
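The percentages and confidence intervals reported above are consistent with a simple Wald interval for a proportion. As a sketch (the function name is assumed; z = 1.96 gives the usual 95% level):

```python
import math

def wald_ci(successes, n, z=1.96):
    """Wald confidence interval for a proportion, returned in percent:
    100 * (p - z*sqrt(p(1-p)/n)), 100 * (p + z*sqrt(p(1-p)/n))."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return 100 * (p - half), 100 * (p + half)
```

With 210 of 324 trials (64.8%), this interval is (59.6, 70.0) to one decimal place, consistent with the abstract's overall estimate.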
Verification of learner’s differences by team-based learning in biochemistry classes
2017-01-01
Purpose We tested the effect of team-based learning (TBL) on medical education through the second-year premedical students’ TBL scores in biochemistry classes over 5 years. Methods We analyzed the results based on test scores before and after the students’ debate. The groups of students for statistical analysis were divided as follows: group 1 comprised the top-ranked students, group 3 comprised the low-ranked students, and group 2 comprised the medium-ranked students. Therefore, group T comprised 382 students (the total number of students in groups 1, 2, and 3). To calibrate the difficulty of the test, original scores were converted into standardized scores. We determined the differences between the tests using Student's t-test, and the relationship between scores before and after the TBL using linear regression tests. Results Although there was a decrease in the lowest score, groups T and 3 showed a significant increase in both original and standardized scores; there was also an increase in the standardized score of group 3. There was a positive correlation between the pre- and post-debate scores in groups T and 2. The beta values of the pre-debate scores and “the changes between the pre- and post-debate scores” were statistically significant in both original and standardized scores. Conclusion TBL is one of the educational methods for helping students improve their grades, particularly those of low-ranked students. PMID:29207457
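Converting original scores into standardized scores to calibrate test difficulty is a z-transformation: each cohort's scores are rescaled to mean 0 and standard deviation 1 so that scores from tests of different difficulty become comparable. A minimal sketch of that step (the abstract does not specify whether a further rescaling, such as to T-scores, was applied):

```python
import statistics

def standardize(scores):
    """Convert raw scores to z-scores: mean 0, standard deviation 1.
    Puts tests of different difficulty on a common scale."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population standard deviation
    return [(s - mean) / sd for s in scores]
```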
Su, Cheng; Zhou, Lei; Hu, Zheng; Weng, Winnie; Subramani, Jayanthi; Tadkod, Vineet; Hamilton, Kortney; Bautista, Ami; Wu, Yu; Chirmule, Narendra; Zhong, Zhandong Don
2015-10-01
Biotherapeutics can elicit immune responses, which can alter the exposure, safety, and efficacy of the therapeutics. A well-designed and robust bioanalytical method is critical for the detection and characterization of relevant anti-drug antibody (ADA) and the success of an immunogenicity study. As a fundamental criterion in immunogenicity testing, assay cut points need to be statistically established with a risk-based approach to reduce subjectivity. This manuscript describes the development of a validated, web-based, multi-tier customized assay statistical tool (CAST) for assessing cut points of ADA assays. The tool provides an intuitive web interface that allows users to import experimental data generated from a standardized experimental design, select the assay factors, run the standardized analysis algorithms, and generate tables, figures, and listings (TFL). It allows bioanalytical scientists to perform complex statistical analysis at a click of the button to produce reliable assay parameters in support of immunogenicity studies. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Currit, P. A.
1983-01-01
The Cleanroom software development methodology is designed to take the gamble out of product releases for both suppliers and receivers of the software. The ingredients of this procedure are a life cycle of executable product increments, representative statistical testing, and a standard estimate of the MTTF (Mean Time To Failure) of the product at the time of its release. A statistical approach to software product testing using randomly selected samples of test cases is considered. A statistical model is defined for the certification process which uses the timing data recorded during test. A reasonableness argument for this model is provided that uses previously published data on software product execution. Also included is a derivation of the certification model estimators and a comparison of the proposed least squares technique with the more commonly used maximum likelihood estimators.
Incorporating Nonparametric Statistics into Delphi Studies in Library and Information Science
ERIC Educational Resources Information Center
Ju, Boryung; Jin, Tao
2013-01-01
Introduction: The Delphi technique is widely used in library and information science research. However, many researchers in the field fail to employ standard statistical tests when using this technique. This makes the technique vulnerable to criticisms of its reliability and validity. The general goal of this article is to explore how…
Reserve Manpower Statistics, 1 January - 31 March 1986.
1986-03-31
This is the first issue of Reserve Manpower Statistics, a quarterly publication based upon data from the Reserve Components Common Personnel Data System. Department of Defense, Reserve Manpower Statistics, March 31, 1986.
40 CFR 610.10 - Program purpose.
Code of Federal Regulations, 2013 CFR
2013-07-01
... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...
40 CFR 610.10 - Program purpose.
Code of Federal Regulations, 2014 CFR
2014-07-01
... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...
40 CFR 610.10 - Program purpose.
Code of Federal Regulations, 2011 CFR
2011-07-01
... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...
40 CFR 610.10 - Program purpose.
Code of Federal Regulations, 2012 CFR
2012-07-01
... DEVICES Test Procedures and Evaluation Criteria General Provisions § 610.10 Program purpose. (a) The... standardized procedures, the performance of various retrofit devices applicable to automobiles for which fuel... statistical analysis of data from vehicle tests, the evaluation program will determine the effects on fuel...
Infants are superior in implicit crossmodal learning and use other learning mechanisms than adults
von Frieling, Marco; Röder, Brigitte
2017-01-01
During development, internal models of the sensory world must be acquired and then continuously adapted. We used event-related potentials (ERP) to test the hypothesis that infants extract crossmodal statistics implicitly while adults learn them only when task relevant. Participants were passively exposed to frequent standard audio-visual combinations (A1V1, A2V2, p=0.35 each), rare recombinations of these standard stimuli (A1V2, A2V1, p=0.10 each), and a rare audio-visual deviant with infrequent auditory and visual elements (A3V3, p=0.10). While both six-month-old infants and adults differentiated between rare deviants and standards at early neural processing stages, only infants were sensitive to crossmodal statistics, as indicated by a late ERP difference between standard and recombined stimuli. A second experiment revealed that adults differentiated recombined and standard combinations when crossmodal combinations were task relevant. These results demonstrate a heightened sensitivity to crossmodal statistics in infants and a change in learning mode from infancy to adulthood. PMID:28949291
Long, Brandon R.; Rinaldo, Steven G.; Gallagher, Kevin G.; ...
2016-11-09
Coin-cells are often the test format of choice for laboratories engaged in battery research and development as they provide a convenient platform for rapid testing of new materials on a small scale. However, reliable, reproducible data via the coin-cell format is inherently difficult, particularly in the full-cell configuration. In addition, statistical evaluation to prove the consistency and reliability of such data is often neglected. Herein we report on several studies aimed at formalizing physical process parameters and coin-cell construction related to full cells. Statistical analysis and performance benchmarking approaches are advocated as a means to more confidently track changes in cell performance. Finally, we show that trends in the electrochemical data obtained from coin-cells can be reliable and informative when standardized approaches are implemented in a consistent manner.
Using R to Simulate Permutation Distributions for Some Elementary Experimental Designs
ERIC Educational Resources Information Center
Eudey, T. Lynn; Kerr, Joshua D.; Trumbo, Bruce E.
2010-01-01
Null distributions of permutation tests for two-sample, paired, and block designs are simulated using the R statistical programming language. For each design and type of data, permutation tests are compared with standard normal-theory and nonparametric tests. These examples (often using real data) provide for classroom discussion use of metrics…
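A two-sample permutation test of the kind simulated in the article can be written just as easily in Python (the article itself uses R); the data here are illustrative:

```python
import random
from statistics import mean

def permutation_test(x, y, n_perm=2000, seed=1):
    """Two-sided two-sample permutation test on the difference in means.
    The pooled observations are repeatedly re-split at random to
    simulate the null distribution of the statistic."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled, n = x + y, len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # in-place random reassignment to groups
        if abs(mean(pooled[:n]) - mean(pooled[n:])) >= observed:
            hits += 1
    return hits / n_perm  # simulated permutation p-value

# two clearly separated illustrative samples
x = [12.1, 14.3, 11.8, 13.9, 12.5, 13.2]
y = [10.2, 11.1, 10.8, 9.9, 10.5, 11.3]
p = permutation_test(x, y)
```

Because the permutation null distribution is built from the data themselves, the test needs no normality assumption, which is the point of contrasting it with the standard normal-theory tests.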
Marketing of Personalized Cancer Care on the Web: An Analysis of Internet Websites
Cronin, Angel; Bair, Elizabeth; Lindeman, Neal; Viswanath, Vish; Janeway, Katherine A.
2015-01-01
Internet marketing may accelerate the use of care based on genomic or tumor-derived data. However, online marketing may be detrimental if it endorses products of unproven benefit. We conducted an analysis of Internet websites to identify personalized cancer medicine (PCM) products and claims. A Delphi Panel categorized PCM as standard or nonstandard based on evidence of clinical utility. Fifty-five websites, sponsored by commercial entities, academic institutions, physicians, research institutes, and organizations, that marketed PCM included somatic (58%) and germline (20%) analysis, interpretive services (15%), and physicians/institutions offering personalized care (44%). Of 32 sites offering somatic analysis, 56% included specific test information (range 1–152 tests). All statistical tests were two-sided, and comparisons of website content were conducted using McNemar’s test. More websites contained information about the benefits than limitations of PCM (85% vs 27%, P < .001). Websites specifying somatic analysis were statistically significantly more likely to market one or more nonstandard tests as compared with standard tests (88% vs 44%, P = .04). PMID:25745021
Prigge, R.; Micke, H.; Krüger, J.
1963-01-01
As part of a collaborative assay of the proposed Fifth International Standard for Gas-Gangrene Antitoxin (Perfringens), five ampoules of the proposed replacement material were assayed in the authors' laboratory against the then current Fourth International Standard. Both in vitro and in vivo methods were used. This paper presents the results and their statistical analysis. The two methods yielded different results which were not likely to have been due to chance, but exact statistical comparison is not possible. It is thought, however, that the differences may be due, at least in part, to differences in the relative proportions of zeta-antitoxin and alpha-antitoxin in the Fourth and Fifth International Standards and the consequent different reactions with the test toxin that was used for titration. PMID:14107746
Sainz de Baranda, Pilar; Rodríguez-Iniesta, María; Ayala, Francisco; Santonja, Fernando; Cejudo, Antonio
2014-07-01
To examine the criterion-related validity of the horizontal hip joint angle (H-HJA) test and vertical hip joint angle (V-HJA) test for estimating hamstring flexibility measured through the passive straight-leg raise (PSLR) test using contemporary statistical measures. Validity study. Controlled laboratory environment. One hundred thirty-eight professional trampoline gymnasts (61 women and 77 men). Hamstring flexibility. Each participant performed 2 trials of H-HJA, V-HJA, and PSLR tests in a randomized order. The criterion-related validity of H-HJA and V-HJA tests was measured through the estimation equation, typical error of the estimate (TEEST), validity correlation (β), and their respective confidence limits. The findings from this study suggest that although H-HJA and V-HJA tests showed moderate to high validity scores for estimating hamstring flexibility (standardized TEEST = 0.63; β = 0.80), the TEEST statistic reported for both tests was not narrow enough for clinical purposes (H-HJA = 10.3 degrees; V-HJA = 9.5 degrees). Subsequently, the predicted likely thresholds for the true values that were generated were too wide (H-HJA = predicted value ± 13.2 degrees; V-HJA = predicted value ± 12.2 degrees). The results suggest that although the HJA test showed moderate to high validity scores for estimating hamstring flexibility, the prediction intervals between the HJA and PSLR tests are not strong enough to suggest that clinicians and sport medicine practitioners should use the HJA and PSLR tests interchangeably as gold standard measurement tools to evaluate and detect short hamstring muscle flexibility.
Statistical methodology: II. Reliability and validity assessment in study design, Part B.
Karras, D J
1997-02-01
Validity measures the correspondence between a test and other purported measures of the same or similar qualities. When a reference standard exists, a criterion-based validity coefficient can be calculated. If no such standard is available, the concepts of content and construct validity may be used, but quantitative analysis may not be possible. The Pearson and Spearman tests of correlation are often used to assess the correspondence between tests, but do not account for measurement biases and may yield misleading results. Techniques that measure intertest differences may be more meaningful in validity assessment, and the kappa statistic is useful for analyzing categorical variables. Questionnaires often can be designed to allow quantitative assessment of reliability and validity, although this may be difficult. Inclusion of homogeneous questions is necessary to assess reliability. Analysis is enhanced by using Likert scales or similar techniques that yield ordinal data. Validity assessment of questionnaires requires careful definition of the scope of the test and comparison with previously validated tools.
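For categorical ratings, the kappa statistic mentioned above corrects raw agreement for the agreement expected by chance. A minimal sketch with invented ratings from two raters:

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters on
    the same categorical items, (p_obs - p_exp) / (1 - p_exp)."""
    n = len(rater_a)
    cats = set(rater_a) | set(rater_b)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # chance agreement from each rater's marginal category frequencies
    p_exp = sum((rater_a.count(c) / n) * (rater_b.count(c) / n)
                for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)

# hypothetical test classifications from two raters
a = ["pos", "pos", "neg", "neg", "pos", "neg", "pos", "neg"]
b = ["pos", "pos", "neg", "pos", "pos", "neg", "pos", "neg"]
kappa = cohens_kappa(a, b)
```

Here raw agreement is 7/8 = 0.875, but kappa is lower (0.75) because some of that agreement would occur by chance alone.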
ERIC Educational Resources Information Center
Nevitt, Jonathan; Hancock, Gregory R.
2001-01-01
Evaluated the bootstrap method under varying conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Results for the bootstrap suggest the resampling-based method may be conservative in its control over model rejections, thus having an impact on the statistical power associated…
1983-06-16
has been advocated by Gnanadesikan and Wilk (1969), and others in the literature. This suggests that, if we use the formal significance test type...American Statistical Asso., 62, 1159-1178. Gnanadesikan, R., and Wilk, M. B. (1969). Data Analytic Methods in Multivariate Statistical Analysis. In
The Adequacy of Different Robust Statistical Tests in Comparing Two Independent Groups
ERIC Educational Resources Information Center
Pero-Cebollero, Maribel; Guardia-Olmos, Joan
2013-01-01
In the current study, we evaluated various robust statistical methods for comparing two independent groups. Two scenarios for simulation were generated: one of equality and another of population mean differences. In each of the scenarios, 33 experimental conditions were used as a function of sample size, standard deviation and asymmetry. For each…
NASA Technical Reports Server (NTRS)
Colvin, E. L.; Emptage, M. R.
1992-01-01
The breaking load test provides quantitative stress corrosion cracking data by determining the residual strength of tension specimens that have been exposed to corrosive environments. Eight laboratories have participated in a cooperative test program under the auspices of ASTM Committee G-1 to evaluate the new test method. All eight laboratories were able to distinguish between three tempers of aluminum alloy 7075. The statistical analysis procedures that were used in the test program do not work well in all situations. An alternative procedure using Box-Cox transformations shows a great deal of promise. An ASTM standard method has been drafted which incorporates the Box-Cox procedure.
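A Box-Cox transformation of the kind incorporated into the draft standard can be sketched by maximizing the profile log-likelihood over a grid of λ values; the residual-strength data below are invented, and the grid search is a simplification of a full optimizer:

```python
import math
from statistics import mean

def box_cox(data, lam):
    """Box-Cox power transform: (x**lam - 1)/lam, or log(x) at lam = 0.
    Requires strictly positive data."""
    if lam == 0:
        return [math.log(x) for x in data]
    return [(x ** lam - 1) / lam for x in data]

def best_lambda(data, grid=None):
    """Choose lambda by maximizing the Box-Cox profile log-likelihood
    over a coarse grid (a simplification of a continuous optimizer)."""
    grid = grid or [l / 10 for l in range(-20, 21)]
    n = len(data)
    log_sum = sum(math.log(x) for x in data)
    def llf(lam):
        t = box_cox(data, lam)
        m = mean(t)
        var = sum((v - m) ** 2 for v in t) / n
        return -n / 2 * math.log(var) + (lam - 1) * log_sum
    return max(grid, key=llf)

# hypothetical right-skewed residual-strength measurements
strengths = [120.0, 340.0, 95.0, 410.0, 150.0, 800.0, 60.0, 230.0]
lam = best_lambda(strengths)
```

For right-skewed data like these, the selected λ falls well below 1, pulling the long upper tail in so that normal-theory comparisons between tempers behave better.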
Code of Federal Regulations, 2014 CFR
2014-01-01
... percent, one-sided confidence limit and a sample size of n1. (2) For an energy consumption standard (ECS..., where ECS is the energy consumption standard and t is a statistic based on a 97.5 percent, one-sided...
Code of Federal Regulations, 2013 CFR
2013-01-01
... percent, one-sided confidence limit and a sample size of n1. (2) For an energy consumption standard (ECS..., where ECS is the energy consumption standard and t is a statistic based on a 97.5 percent, one-sided...
Code of Federal Regulations, 2012 CFR
2012-01-01
... percent, one-sided confidence limit and a sample size of n1. (2) For an energy consumption standard (ECS..., where ECS is the energy consumption standard and t is a statistic based on a 97.5 percent, one-sided...
ERIC Educational Resources Information Center
Fulmer, Gavin W.; Polikoff, Morgan S.
2014-01-01
An essential component in school accountability efforts is for assessments to be well-aligned with the standards or curriculum they are intended to measure. However, relatively little prior research has explored methods to determine statistical significance of alignment or misalignment. This study explores analyses of alignment as a special case…
Quantitative Analysis of Standardized Dress Code and Minority Academic Achievement
ERIC Educational Resources Information Center
Proctor, J. R.
2013-01-01
This study was designed to investigate if a statistically significant variance exists in African American and Hispanic students' attendance and Texas Assessment of Knowledge and Skills test scores in mathematics before and after the implementation of a standardized dress code. For almost two decades supporters and opponents of public school…
Assessment of variations in thermal cycle life data of thermal barrier coated rods
NASA Astrophysics Data System (ADS)
Hendricks, R. C.; McDonald, G.
An analysis of thermal cycle life data for 22 thermal barrier coated (TBC) specimens was conducted. The Zr02-8Y203/NiCrAlY plasma spray coated Rene 41 rods were tested in a Mach 0.3 Jet A/air burner flame. All specimens were subjected to the same coating and subsequent test procedures in an effort to control three parametric groups; material properties, geometry and heat flux. Statistically, the data sample space had a mean of 1330 cycles with a standard deviation of 520 cycles. The data were described by normal or log-normal distributions, but other models could also apply; the sample size must be increased to clearly delineate a statistical failure model. The statistical methods were also applied to adhesive/cohesive strength data for 20 TBC discs of the same composition, with similar results. The sample space had a mean of 9 MPa with a standard deviation of 4.2 MPa.
Assessment of variations in thermal cycle life data of thermal barrier coated rods
NASA Technical Reports Server (NTRS)
Hendricks, R. C.; Mcdonald, G.
1981-01-01
An analysis of thermal cycle life data for 22 thermal barrier coated (TBC) specimens was conducted. The Zr02-8Y203/NiCrAlY plasma spray coated Rene 41 rods were tested in a Mach 0.3 Jet A/air burner flame. All specimens were subjected to the same coating and subsequent test procedures in an effort to control three parametric groups; material properties, geometry and heat flux. Statistically, the data sample space had a mean of 1330 cycles with a standard deviation of 520 cycles. The data were described by normal or log-normal distributions, but other models could also apply; the sample size must be increased to clearly delineate a statistical failure model. The statistical methods were also applied to adhesive/cohesive strength data for 20 TBC discs of the same composition, with similar results. The sample space had a mean of 9 MPa with a standard deviation of 4.2 MPa.
Single-Item Measurement of Suicidal Behaviors: Validity and Consequences of Misclassification
Millner, Alexander J.; Lee, Michael D.; Nock, Matthew K.
2015-01-01
Suicide is a leading cause of death worldwide. Although research has made strides in better defining suicidal behaviors, there has been less focus on accurate measurement. Currently, the widespread use of self-report, single-item questions to assess suicide ideation, plans and attempts may contribute to measurement problems and misclassification. We examined the validity of single-item measurement and the potential for statistical errors. Over 1,500 participants completed an online survey containing single-item questions regarding a history of suicidal behaviors, followed by questions with more precise language, multiple response options and narrative responses to examine the validity of single-item questions. We also conducted simulations to test whether common statistical tests are robust against the degree of misclassification produced by the use of single-items. We found that 11.3% of participants that endorsed a single-item suicide attempt measure engaged in behavior that would not meet the standard definition of a suicide attempt. Similarly, 8.8% of those who endorsed a single-item measure of suicide ideation endorsed thoughts that would not meet standard definitions of suicide ideation. Statistical simulations revealed that this level of misclassification substantially decreases statistical power and increases the likelihood of false conclusions from statistical tests. Providing a wider range of response options for each item reduced the misclassification rate by approximately half. Overall, the use of single-item, self-report questions to assess the presence of suicidal behaviors leads to misclassification, increasing the likelihood of statistical decision errors. Improving the measurement of suicidal behaviors is critical to increase understanding and prevention of suicide. PMID:26496707
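One way to see how misclassification erodes power is a back-of-the-envelope model in which non-differential false positives inflate both groups' endorsement rates; the rates and sample sizes below are illustrative, and this is not the paper's actual simulation design:

```python
import math

def two_prop_z(p1, p2, n1, n2):
    """Large-sample z statistic for comparing two proportions."""
    p = (p1 * n1 + p2 * n2) / (n1 + n2)  # pooled proportion
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def misclassify(p_true, fp_rate):
    """Observed endorsement rate when a fraction fp_rate of non-cases
    falsely endorse the single-item measure (non-differential noise)."""
    return p_true + fp_rate * (1 - p_true)

# hypothetical true attempt rates of 10% vs 5%, 750 per group
n = 750
z_clean = two_prop_z(0.10, 0.05, n, n)
z_noisy = two_prop_z(misclassify(0.10, 0.11),
                     misclassify(0.05, 0.11), n, n)
```

The noisy z statistic is markedly smaller than the clean one, illustrating the abstract's point that this level of misclassification decreases statistical power and invites false conclusions.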
NASA Astrophysics Data System (ADS)
Jokhio, Gul A.; Syed Mohsin, Sharifah M.; Gul, Yasmeen
2018-04-01
It has been established that Adobe provides, in addition to being sustainable and economical, better indoor air quality without the extensive energy consumption of modern synthetic materials. The material, however, suffers from weak structural behaviour when subjected to adverse loading conditions. A wide range of mechanical properties has been reported in the literature, owing to a lack of research and standardization. The present paper presents a statistical analysis of results obtained through compressive and flexural tests on Adobe specimens with and without wire mesh reinforcement. It was found that the compressive strength of Adobe increases by about 43% after adding a single layer of wire mesh reinforcement, an increase that is statistically significant. The flexural response of Adobe also improved with the addition of wire mesh reinforcement; however, the statistical significance of this improvement could not be established.
NASA Astrophysics Data System (ADS)
Maries, Alexandru; Singh, Chandralekha
2015-12-01
It has been found that activation of a stereotype, for example by indicating one's gender before a test, typically alters performance in a way consistent with the stereotype, an effect called "stereotype threat." On a standardized conceptual physics assessment, we found that asking test takers to indicate their gender right before taking the test did not deteriorate performance compared to an equivalent group who did not provide gender information. Although a statistically significant gender gap was present on the standardized test whether or not students indicated their gender, no gender gap was observed on the multiple-choice final exam students took, which included both quantitative and conceptual questions on similar topics.
Integrated Data Collection Analysis (IDCA) Program - Statistical Analysis of RDX Standard Data Sets
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sandstrom, Mary M.; Brown, Geoffrey W.; Preston, Daniel N.
2015-10-30
The Integrated Data Collection Analysis (IDCA) program is conducting a Proficiency Test for Small-Scale Safety and Thermal (SSST) testing of homemade explosives (HMEs). Described here are statistical analyses of the results for impact, friction, electrostatic discharge, and differential scanning calorimetry analysis of the RDX Type II Class 5 standard. The material was tested as a well-characterized standard several times during the proficiency study to assess differences among participants and the range of results that may arise for well-behaved explosive materials. The analyses show that there are detectable differences among the results from IDCA participants. While these differences are statistically significant, most of them can be disregarded for comparison purposes to assess potential variability when laboratories attempt to measure identical samples using methods assumed to be nominally the same. The results presented in this report include the average sensitivity results for the IDCA participants and the ranges of values obtained. The ranges represent variation about the mean values of the tests of between 26% and 42%. The magnitude of this variation is attributed to differences in operator, method, and environment as well as the use of different instruments that are also of varying age. The results appear to be a good representation of the broader safety testing community based on the range of methods, instruments, and environments included in the IDCA Proficiency Test.
Code of Federal Regulations, 2010 CFR
2010-07-01
... and CO emissions: Ci = Max [0 or Ci-1 + Xi−(STD + 0.25 × σ)] Where: Ci = The current CumSum statistic...). Xi = The current emission test result for an individual engine. STD = Emission standard (or family...
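The CumSum recursion quoted from the regulation can be sketched directly; the emission results, standard, and sigma below are invented for illustration:

```python
def cusum(emissions, std, sigma):
    """CumSum statistic per the excerpt:
    Ci = max(0, C(i-1) + Xi - (STD + 0.25 * sigma)),
    where Xi is each engine's emission test result, STD the emission
    standard, and sigma the test variability."""
    c = 0.0
    trail = []
    for x in emissions:
        c = max(0.0, c + x - (std + 0.25 * sigma))
        trail.append(c)
    return trail

# hypothetical per-engine test results against a standard of 2.2
results = [1.9, 2.1, 2.4, 2.0, 2.6]
trace = cusum(results, std=2.2, sigma=0.4)
```

The statistic resets to zero while engines test comfortably under the standard and accumulates only sustained exceedances, which is what makes it useful for detecting a drifting engine family.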
NASA Astrophysics Data System (ADS)
Kang, Jee Sun Emily
This study explored how inquiry-based teaching and learning processes occurred in two teachers' diverse 8th grade Physical Science classrooms in a Program Improvement junior high school within the context of high-stakes standardized testing. Instructors for the courses examined included not only the two 8th grade science teachers, but also graduate fellows from a nearby university. Research was drawn from inquiry-based instruction in science education, the achievement gap, and the high stakes testing movement, as well as situated learning theory to understand how opportunities for inquiry were negotiated within the diverse classroom context. Transcripts of taped class sessions; student work samples; interviews of teachers and students; and scores from the California Standards Test in science were collected and analyzed. Findings indicated that the teachers provided structured inquiry in order to support their students in learning about forces and to prepare them for the standardized test. Teachers also supported students in generating evidence-based explanations, connecting inquiry-based investigations with content on forces, proficiently using science vocabulary, and connecting concepts about forces to their daily lives. Findings from classroom data revealed constraints to student learning: students' limited language proficiency, peer counter culture, and limited time. Supports were evidenced as well: graduate fellows' support during investigations, teachers' guided questioning, standardized test preparation, literacy support, and home-school connections. There was no statistical difference in achievement on the Forces Unit test or science standardized test between classes with graduate fellows and without fellows. There was also no statistical difference in student performance between the two teachers' classrooms, even though their teaching styles were very different. 
However, there was a strong correlation between students' achievement on the chapter test and their achievement on the Forces portion of the CST. Students' English language proficiency and socioeconomic status were also strongly correlated with their achievement on the standardized test. Notwithstanding the constraints of standardized testing, the teachers had students practice the heart of inquiry -- to connect evidence with explanations and process with content. Engaging in inquiry-based instruction provided a context for students, even English language learners, to demonstrate their knowledge of forces. Students had stronger and more detailed ideas about concepts when they engaged in activities that were tightly connected to the concepts, as well as to their lives and experiences.
A simple test of association for contingency tables with multiple column responses.
Decady, Y J; Thomas, D R
2000-09-01
Loughin and Scherer (1998, Biometrics 54, 630-637) investigated tests of association in two-way tables when one of the categorical variables allows for multiple-category responses from individual respondents. Standard chi-squared tests are invalid in this case, and they developed a bootstrap test procedure that provides good control of test levels under the null hypothesis. This procedure and some others that have been proposed are computationally involved and are based on techniques that are relatively unfamiliar to many practitioners. In this paper, the methods introduced by Rao and Scott (1981, Journal of the American Statistical Association 76, 221-230) for analyzing complex survey data are used to develop a simple test based on a corrected chi-squared statistic.
A Statistical Analysis of Data Used in Critical Decision Making by Secondary School Personnel.
ERIC Educational Resources Information Center
Dunn, Charleta J.; Kowitz, Gerald T.
Guidance decisions depend on the validity of standardized tests and teacher judgment records as measures of student achievement. To test this validity, a sample of 400 high school juniors, randomly selected from two large Gulf Coas t area schools, were administered the Iowa Tests of Educational Development. The nine subtest scores and each…
Edjabou, Maklawe Essonanawe; Martín-Fernández, Josep Antoni; Scheutz, Charlotte; Astrup, Thomas Fruergaard
2017-11-01
Data for fractional solid waste composition provide relative magnitudes of individual waste fractions, the percentages of which always sum to 100, thereby connecting them intrinsically. Due to this sum constraint, waste composition data represent closed data, and their interpretation and analysis require statistical methods other than classical statistics, which are suitable only for non-constrained data such as absolute values. However, the closed characteristics of waste composition data are often ignored when analysed. The results of this study showed, for example, that unavoidable animal-derived food waste amounted to 2.21±3.12% with a confidence interval of (-4.03; 8.45), which highlights the problem of the biased negative proportions. A Pearson's correlation test, applied to waste fraction generation (kg mass), indicated a positive correlation between avoidable vegetable food waste and plastic packaging. However, correlation tests applied to waste fraction compositions (percentage values) showed a negative association in this regard, thus demonstrating that statistical analyses applied to compositional waste fraction data, without addressing the closed characteristics of these data, have the potential to generate spurious or misleading results. Therefore, compositional data should be transformed adequately prior to any statistical analysis, such as computing mean, standard deviation and correlation coefficients. Copyright © 2017 Elsevier Ltd. All rights reserved.
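A standard remedy for closed compositional data is a log-ratio transform before computing means, standard deviations, or correlations. A minimal sketch of the centred log-ratio (clr) transform, with invented waste fractions:

```python
import math

def clr(composition):
    """Centred log-ratio transform: log of each part over the geometric
    mean of all parts.  This opens the simplex so that standard
    statistics behave sensibly for percentage data."""
    logs = [math.log(x) for x in composition]
    g = sum(logs) / len(logs)  # log of the geometric mean
    return [v - g for v in logs]

# hypothetical waste fractions in percent (strictly positive, sum 100)
fractions = [42.0, 23.0, 18.0, 12.0, 5.0]
z = clr(fractions)
```

The transformed coordinates sum to zero by construction and are free of the unit-sum constraint, so correlations computed on them no longer carry the spurious negative bias the abstract warns about.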
Distribution of the two-sample t-test statistic following blinded sample size re-estimation.
Lu, Kaifeng
2016-05-01
We consider the blinded sample size re-estimation based on the simple one-sample variance estimator at an interim analysis. We characterize the exact distribution of the standard two-sample t-test statistic at the final analysis. We describe a simulation algorithm for the evaluation of the probability of rejecting the null hypothesis at given treatment effect. We compare the blinded sample size re-estimation method with two unblinded methods with respect to the empirical type I error, the empirical power, and the empirical distribution of the standard deviation estimator and final sample size. We characterize the type I error inflation across the range of standardized non-inferiority margin for non-inferiority trials, and derive the adjusted significance level to ensure type I error control for given sample size of the internal pilot study. We show that the adjusted significance level increases as the sample size of the internal pilot study increases. Copyright © 2016 John Wiley & Sons, Ltd.
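The blinded re-estimation step can be sketched as follows: the pooled (one-sample) variance from the internal pilot replaces the planning variance in the standard two-sample formula. The interim data and target difference below are invented, and the exact-distribution results and type I error adjustments derived in the paper are not reproduced here:

```python
import math
from statistics import variance

def reestimated_n_per_arm(pooled_scores, delta, z_a=1.959964, z_b=0.841621):
    """Blinded sample size re-estimation: the lumped one-sample variance
    of the pooled interim data (treatment labels stay blinded) replaces
    the planning variance in n = 2 * (z_a + z_b)**2 * s^2 / delta^2.
    Defaults correspond to two-sided alpha = 0.05 and 80% power."""
    s2 = variance(pooled_scores)  # simple one-sample variance estimator
    return math.ceil(2 * (z_a + z_b) ** 2 * s2 / delta ** 2)

# hypothetical pooled interim observations and target difference
interim = [4.1, 5.6, 3.8, 6.0, 4.9, 5.2, 4.4, 5.8, 3.9, 5.1]
n_new = reestimated_n_per_arm(interim, delta=1.0)
```

Because the variance is computed without unblinding, the interim look leaves the treatment comparison untouched; the paper's contribution is quantifying how the final t-test distribution, and hence the type I error, is nonetheless affected.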
Reporting Practices and Use of Quantitative Methods in Canadian Journal Articles in Psychology.
Counsell, Alyssa; Harlow, Lisa L
2017-05-01
With recent focus on the state of research in psychology, it is essential to assess the nature of the statistical methods and analyses used and reported by psychological researchers. To that end, we investigated the prevalence of different statistical procedures and the nature of statistical reporting practices in recent articles from the four major Canadian psychology journals. The majority of authors evaluated their research hypotheses through the use of analysis of variance (ANOVA), t -tests, and multiple regression. Multivariate approaches were less common. Null hypothesis significance testing remains a popular strategy, but the majority of authors reported a standardized or unstandardized effect size measure alongside their significance test results. Confidence intervals on effect sizes were infrequently employed. Many authors provided minimal details about their statistical analyses and less than a third of the articles presented on data complications such as missing data and violations of statistical assumptions. Strengths of and areas needing improvement for reporting quantitative results are highlighted. The paper concludes with recommendations for how researchers and reviewers can improve comprehension and transparency in statistical reporting.
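Reporting a standardized effect size with a confidence interval, as the article recommends, can be sketched with Cohen's d and its large-sample standard error (Hedges & Olkin); the data are invented:

```python
import math
from statistics import mean, variance

def cohens_d_with_ci(x, y, z=1.959964):
    """Cohen's d (pooled-SD standardized mean difference) with an
    approximate 95% CI from the large-sample SE of d."""
    nx, ny = len(x), len(y)
    # pooled variance across the two groups
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    d = (mean(x) - mean(y)) / math.sqrt(sp2)
    se = math.sqrt((nx + ny) / (nx * ny) + d ** 2 / (2 * (nx + ny)))
    return d, (d - z * se, d + z * se)

# hypothetical scores for two independent groups
x = [23, 25, 21, 27, 24, 26, 22, 28]
y = [20, 22, 19, 23, 21, 18, 22, 20]
d, (lo, hi) = cohens_d_with_ci(x, y)
```

Pairing the point estimate with an interval conveys both the magnitude and the precision of the effect, which a bare p-value cannot.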
Interview with Antony John Kunnan on Language Assessment
ERIC Educational Resources Information Center
Nimehchisalem, Vahid
2015-01-01
Antony John Kunnan is a language assessment specialist. His research interests are fairness of tests and testing practice, assessment literacy, research methods and statistics, ethics and standards, and language assessment policy. His most recent publications include a four-volume edited collection of 140 chapters titled "The Companion to…
A New Look at Bias in Aptitude Tests.
ERIC Educational Resources Information Center
Scheuneman, Janice Dowd
1981-01-01
Statistical bias in measurement and ethnic-group bias in testing are discussed, reviewing predictive and construct validity studies. Item bias is reconceptualized to include distance of item content from respondent's experience. Differing values of mean and standard deviation for bias parameter are analyzed in a simulation. References are…
Test Standards for Contingency Base Waste-to-Energy Technologies
2015-08-01
test runs are preferred to allow a more comprehensive statistical evaluation of the results. ... Minimize the complexity, difficulty, and ... with water or, in the case of cyanide- or sulfide-bearing wastes, when exposed to mild acidic or basic conditions; 4) explode when subjected to a
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
ERIC Educational Resources Information Center
Beasley, T. Mark
2014-01-01
Increasing the correlation between the independent variable and the mediator (the "a" coefficient) increases the effect size ("ab") for mediation analysis; however, increasing "a" by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation caused by…
Grade Equivalents: We Report Them, You Should Too.
ERIC Educational Resources Information Center
Ligon, Glynn; Battaile, Richard
In certain situations, grade equivalent scores are the most appropriate statistic available for reporting achievement test data. It is noted that testing practitioners have found that raw scores, normal curve equivalents, stanines, and standard scores are very useful. However, it is best to convert to either grade equivalents or percentiles before…
Observed-Score Equating with a Heterogeneous Target Population
ERIC Educational Resources Information Center
Duong, Minh Q.; von Davier, Alina A.
2012-01-01
Test equating is a statistical procedure for adjusting for test form differences in difficulty in a standardized assessment. Equating results are supposed to hold for a specified target population (Kolen & Brennan, 2004; von Davier, Holland, & Thayer, 2004) and to be (relatively) independent of the subpopulations from the target population (see…
Development and standardization of Arabic words in noise test in Egyptian children.
Abdel Rahman, Tayseer Taha
2018-05-01
To develop and establish norms for an Arabic Words in Noise test in Egyptian children. The total number of participants was 152 children with normal hearing, ranging in age from 5 to 12 years. They were subdivided into two main groups: a standardization group comprising 120 children with normal scholastic achievement, and an application group comprising 32 children with different types of central auditory processing disorders. Arabic versions of both the Speech Perception in Noise (SPIN) and Words in Noise (WIN) tests were presented in each ear at zero signal-to-noise ratio (SNR) using ipsilateral cafeteria noise fixed at 50 dB sensation level (dB SL). Performance on the WIN test was lowest between 5 and 7 years and highest from 9 to 12 years; however, no statistically significant difference was found among the three standardization age groups. Moreover, no statistically significant difference was found between right- and left-ear scores or among the three lists. When the WIN test was compared with the SPIN test in children with and without abnormal SPIN scores, it showed highly consistent results except in children suffering from memory deficits, indicating that the WIN test is more accurate than the SPIN in this group of children. The Arabic WIN test can be used in children as young as 5 years. It can also serve as a cross-check against the SPIN test, or be used to follow up hearing-impaired children after a rehabilitation program, or children with selective auditory attention deficit after central auditory remediation. Copyright © 2017. Published by Elsevier B.V.
Further statistics in dentistry, Part 5: Diagnostic tests for oral conditions.
Petrie, A; Bulman, J S; Osborn, J F
2002-12-07
A diagnostic test is a simple test, sometimes based on a clinical measurement, which is used when the gold-standard test providing a definitive diagnosis of a given condition is too expensive, invasive or time-consuming to perform. The diagnostic test can be used to diagnose a dental condition in an individual patient or as a screening device in a population of apparently healthy individuals.
NASA Technical Reports Server (NTRS)
Tripp, John S.; Tcheng, Ping
1999-01-01
Statistical tools, previously developed for nonlinear least-squares estimation of multivariate sensor calibration parameters and the associated calibration uncertainty analysis, have been applied to single- and multiple-axis inertial model attitude sensors used in wind tunnel testing to measure angle of attack and roll angle. The analysis provides confidence and prediction intervals of calibrated sensor measurement uncertainty as functions of applied input pitch and roll angles. A comparative performance study of various experimental designs for inertial sensor calibration is presented along with corroborating experimental data. The importance of replicated calibrations over extended time periods has been emphasized; replication provides independent estimates of calibration precision and bias uncertainties, statistical tests for calibration or modeling bias uncertainty, and statistical tests for sensor parameter drift over time. A set of recommendations for a new standardized model attitude sensor calibration method and usage procedures is included. The statistical information provided by these procedures is necessary for the uncertainty analysis of aerospace test results now required by users of industrial wind tunnel test facilities.
ERIC Educational Resources Information Center
Gadway, Charles J.; Wilson, H.A.
This document provides statistical data on the 1974 and 1975 Mini-Assessment of Functional Literacy, which was designed to determine the extent of functional literacy among seventeen year olds in America. Also presented are data from comparable test items from the 1971 assessment. Three standards are presented, to allow different methods of…
75 FR 4323 - Additional Quantitative Fit-testing Protocols for the Respiratory Protection Standard
Federal Register 2010, 2011, 2012, 2013, 2014
2010-01-27
... respirators (500 and 1000 for protocols 1 and 2, respectively). However, OSHA could not evaluate the results... the values of these descriptive statistics for revised PortaCount[supreg] QNFT protocols 1 (at RFFs of 100 and 500) and 2 (at RFFs of 200 and 1000). Table 2--Descriptive Statistics for RFFs of 100 and 200...
[Do we always correctly interpret the results of statistical nonparametric tests].
Moczko, Jerzy A
2014-01-01
The Mann-Whitney, Wilcoxon, Kruskal-Wallis, and Friedman tests form a group of tests commonly used to analyze the results of clinical and laboratory data. These tests are considered extremely flexible, and their asymptotic relative efficiency exceeds 95 percent. Compared with the corresponding parametric tests, they do not require checking conditions such as normality of the data distribution, homogeneity of variance, or the absence of correlation between means and standard deviations, and they can be used with both interval and ordinal scales. The article presents an example based on the Mann-Whitney test demonstrating that treating these four nonparametric tests as a kind of gold standard does not in every case lead to correct inference.
ERIC Educational Resources Information Center
Gidey, Mu'uz
2015-01-01
This action research was carried out in a practical classroom setting to devise an innovative way of administering tutorial classes to improve students' learning competence, with particular reference to gendered test scores. Before-after test score analyses of means and standard deviations, along with t-statistical tests of hypotheses, of second…
Nour-Eldein, Hebatallah
2016-01-01
Given the limited statistical knowledge of most physicians, it is not uncommon to find statistical errors in research articles. To determine the statistical methods used, and to assess the statistical errors, in family medicine (FM) research articles published between 2010 and 2014. This was a cross-sectional study. All 66 FM research articles published over 5 years by FM authors affiliated with Suez Canal University were screened by the researcher between May and August 2015. Types and frequencies of statistical methods were reviewed in all 66 FM articles. All 60 articles with identified inferential statistics were examined for statistical errors and deficiencies. A comprehensive 58-item checklist based on statistical guidelines was used to evaluate the statistical quality of the FM articles. Inferential methods were recorded in 62/66 (93.9%) of FM articles, and advanced analyses were used in 29/66 (43.9%). Contingency tables 38/66 (57.6%), regression (logistic, linear) 26/66 (39.4%), and t-tests 17/66 (25.8%) were the most commonly used inferential tests. Within the 60 FM articles with identified inferential statistics, deficiencies included no prior sample size calculation 19/60 (31.7%), application of the wrong statistical tests 17/60 (28.3%), incomplete documentation of statistics 59/60 (98.3%), reporting P values without test statistics 32/60 (53.3%), not reporting confidence intervals with effect size measures 12/60 (20.0%), and use of the mean (standard deviation) to describe ordinal/non-normal data 8/60 (13.3%); errors of interpretation were mainly conclusions unsupported by the study data 5/60 (8.3%). Inferential statistics were used in the majority of FM articles. Data analysis and the reporting of statistics are areas for improvement in FM research articles.
Federal Register 2010, 2011, 2012, 2013, 2014
2010-10-08
... conventional rounding rules, emission totals listed in Tables 1 and 2 may not reflect the absolute mathematical... absolute mathematical totals. As shown in Table 2 above, the Nashville Area is projected to steadily...-based standard, the air quality design value is simply the standard-related test statistic. Thus, for...
Long-range memory and non-Markov statistical effects in human sensorimotor coordination
NASA Astrophysics Data System (ADS)
M. Yulmetyev, Renat; Emelyanova, Natalya; Hänggi, Peter; Gafarov, Fail; Prokhorov, Alexander
2002-12-01
In this paper, non-Markov statistical processes and long-range memory effects in human sensorimotor coordination are investigated. The theoretical basis of this study is the statistical theory of non-stationary discrete non-Markov processes in complex systems (Phys. Rev. E 62, 6178 (2000)). Human sensorimotor coordination was studied experimentally by means of a standard dynamical tapping test in a group of 32 young people, with tap numbers up to 400. The test was carried out separately for the right and left hands according to the degree of domination of each brain hemisphere. Numerical analysis of the experimental results was carried out with the help of power spectra of the initial time correlation function, the memory functions of low orders, and the first three points of the statistical spectrum of the non-Markovity parameter. Our observations demonstrate that, with regard to the results of the standard dynamic tapping test, it is possible to divide all examinees into five distinct dynamic types. We have introduced a conflict coefficient to quantitatively estimate the order-disorder effects underlying living systems; it reflects the existence of an imbalance between nervous and motor human coordination. The suggested classification of neurophysiological activity represents a dynamic generalization of the well-known neuropsychological types and provides a new approach in modern neuropsychology.
Lee, Juneyoung; Kim, Kyung Won; Choi, Sang Hyun; Huh, Jimi
2015-01-01
Meta-analysis of diagnostic test accuracy studies differs from the usual meta-analysis of therapeutic/interventional studies in that it requires the simultaneous analysis of a pair of outcome measures, such as sensitivity and specificity, instead of a single outcome. Since sensitivity and specificity are generally inversely correlated and can be affected by a threshold effect, more sophisticated statistical methods are required for the meta-analysis of diagnostic test accuracy. Hierarchical models, including the bivariate model and the hierarchical summary receiver operating characteristic model, are increasingly accepted as standard methods for meta-analysis of diagnostic test accuracy studies. We provide a conceptual review of statistical methods currently used and recommended for meta-analysis of diagnostic test accuracy studies. This article could serve as a methodological reference for those who perform systematic review and meta-analysis of diagnostic test accuracy studies. PMID:26576107
An operational definition of a statistically meaningful trend.
Bryhn, Andreas C; Dimberg, Peter H
2011-04-28
Linear trend analysis of time series is a standard procedure in many scientific disciplines. If the number of data points is large, a trend may be statistically significant even if the data are scattered far from the trend line. This study introduces and tests a quality criterion for time trends, referred to as statistical meaningfulness, which is a stricter quality criterion than high statistical significance. The time series is divided into intervals and interval mean values are calculated. Thereafter, r² and p values are calculated from regressions of the interval mean values on time. If r² ≥ 0.65 at p ≤ 0.05 in any of these regressions, then the trend is regarded as statistically meaningful. Out of ten investigated time series from different scientific disciplines, five displayed statistically meaningful trends. A Microsoft Excel application (add-in) was developed which can perform statistical meaningfulness tests and which may increase the operationality of the test. The presented method for distinguishing statistically meaningful trends should be reasonably uncomplicated for researchers with basic statistics skills and may thus be useful for determining which trends are worth analysing further, for instance with respect to causal factors. The method can also be used for determining which segments of a time trend may be particularly worthwhile to focus on.
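The interval-mean procedure described above is straightforward to sketch in code. The following is an illustrative implementation, not the authors' Excel add-in; the default of five equal-size intervals (via `np.array_split`) is an assumption, while the r² ≥ 0.65 and p ≤ 0.05 thresholds come from the abstract.

```python
# "Statistically meaningful trend" check (sketch): split the series into
# intervals, regress the interval means of y on the interval means of t,
# and apply the r^2 >= 0.65 and p <= 0.05 criterion.
import numpy as np
from scipy import stats

def is_statistically_meaningful(t, y, n_intervals=5):
    t_means = [c.mean() for c in np.array_split(np.asarray(t, float), n_intervals)]
    y_means = [c.mean() for c in np.array_split(np.asarray(y, float), n_intervals)]
    res = stats.linregress(t_means, y_means)
    return bool(res.rvalue**2 >= 0.65 and res.pvalue <= 0.05)
```

A clean linear series passes the criterion, while data scattered far from any trend line fail it even when an ordinary regression on the raw points would be significant.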
Palmisano, Aldo N.; Elder, N.E.
2001-01-01
We examined, under standardized conditions, seawater survival of chinook salmon Oncorhynchus tshawytscha at the smolt stage to evaluate the experimental hatchery practices applied to their rearing. The experimental rearing practices included rearing fish at different densities; attempting to control bacterial kidney disease with broodstock segregation, erythromycin injection, and an experimental diet; rearing fish on different water sources; and freeze branding the fish. After application of experimental rearing practices in hatcheries, smolts were transported to a rearing facility for about 2-3 months of seawater rearing. Of 16 experiments, 4 yielded statistically significant differences in seawater survival. In general we found that high variability among replicates, plus the low numbers of replicates available, resulted in low statistical power. We recommend including four or five replicates and using α = 0.10 in 1-tailed tests of hatchery experiments to try to increase the statistical power to 0.80.
Katki, Hormuzd A; Schiffman, Mark
2018-05-01
Our work involves assessing whether new biomarkers might be useful for cervical-cancer screening across populations with different disease prevalences and biomarker distributions. When comparing across populations, we show that standard diagnostic accuracy statistics (predictive values, risk-differences, Youden's index and Area Under the Curve (AUC)) can easily be misinterpreted. We introduce an intuitively simple statistic for a 2 × 2 table, Mean Risk Stratification (MRS): the average change in risk (pre-test vs. post-test) revealed for tested individuals. High MRS implies better risk separation achieved by testing. MRS has 3 key advantages for comparing test performance across populations with different disease prevalences and biomarker distributions. First, MRS demonstrates that conventional predictive values and the risk-difference do not measure risk-stratification because they do not account for test-positivity rates. Second, Youden's index and AUC measure only multiplicative relative gains in risk-stratification: AUC = 0.6 achieves only 20% of maximum risk-stratification (AUC = 0.9 achieves 80%). Third, large relative gains in risk-stratification might not imply large absolute gains if disease is rare, demonstrating a "high-bar" to justify population-based screening for rare diseases such as cancer. We illustrate MRS by our experience comparing the performance of cervical-cancer screening tests in China vs. the USA. The test with the worst AUC = 0.72 in China (visual inspection with acetic acid) provides twice the risk-stratification (i.e. MRS) of the test with best AUC = 0.83 in the USA (human papillomavirus and Pap cotesting) because China has three times more cervical precancer/cancer. MRS could be routinely calculated to better understand the clinical/public-health implications of standard diagnostic accuracy statistics. Published by Elsevier Inc.
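As a rough illustration of the idea, one natural formalization of "the average change in risk (pre-test vs. post-test) revealed for tested individuals" is the test-positivity-weighted mean absolute difference between post-test and pre-test risk. The sketch below uses that reading; the paper's exact MRS formula may differ in detail.

```python
# Mean Risk Stratification (sketch) for a 2x2 table:
#   a = test+/disease+, b = test+/disease-,
#   c = test-/disease+, d = test-/disease-
def mean_risk_stratification(a, b, c, d):
    n = a + b + c + d
    prev = (a + c) / n              # pre-test risk (prevalence)
    p_pos, p_neg = (a + b) / n, (c + d) / n
    risk_pos = a / (a + b)          # post-test risk given a positive test (PPV)
    risk_neg = c / (c + d)          # post-test risk given a negative test (1 - NPV)
    return p_pos * abs(risk_pos - prev) + p_neg * abs(risk_neg - prev)
```

Under this reading an uninformative test, whose post-test risks both equal the prevalence, yields MRS = 0, and MRS grows with prevalence for a fixed AUC, which is the behavior the abstract describes for the China vs. USA comparison.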
Mental Disorder Hospitalizations among Submarine Personnel in the U.S. Navy.
1988-03-10
hospitalization rates (Lilienfeld, 1980). T-tests were used to assess statistical significance of differences in descriptive variables (McNemar, 1969). ... Lilienfeld, D. E. Foundations of epidemiology. 2nd ed. New York: Oxford University Press, 1980. McNemar, Q. Psychological statistics. 4th ed. New York: Wiley.
Waites, Anthony B; Mannfolk, Peter; Shaw, Marnie E; Olsrud, Johan; Jackson, Graeme D
2007-02-01
Clinical functional magnetic resonance imaging (fMRI) occasionally fails to detect significant activation, often due to variability in task performance. The present study seeks to test whether a more flexible statistical analysis can better detect activation, by accounting for variance associated with variable compliance to the task over time. Experimental results and simulated data both confirm that even at 80% compliance to the task, such a flexible model outperforms standard statistical analysis when assessed using the extent of activation (experimental data), goodness of fit (experimental data), and area under the operator characteristic curve (simulated data). Furthermore, retrospective examination of 14 clinical fMRI examinations reveals that in patients where the standard statistical approach yields activation, there is a measurable gain in model performance in adopting the flexible statistical model, with little or no penalty in lost sensitivity. This indicates that a flexible model should be considered, particularly for clinical patients who may have difficulty complying fully with the study task.
Clauson, Kevin A; Polen, Hyla H; Peak, Amy S; Marsh, Wallace A; DiScala, Sandra L
2008-11-01
Clinical decision support tools (CDSTs) on personal digital assistants (PDAs) and online databases assist healthcare practitioners who make decisions about dietary supplements. To assess and compare the content of PDA dietary supplement databases and their online counterparts used as CDSTs. A total of 102 question-and-answer pairs were developed within 10 weighted categories of the most clinically relevant aspects of dietary supplement therapy. PDA versions of AltMedDex, Lexi-Natural, Natural Medicines Comprehensive Database, and Natural Standard and their online counterparts were assessed by scope (percent of correct answers present), completeness (3-point scale), ease of use, and a composite score integrating all 3 criteria. Descriptive statistics and inferential statistics, including a chi-square test, Scheffé's multiple comparison test, McNemar's test, and the Wilcoxon signed rank test, were used to analyze data. The scope scores for PDA databases were: Natural Medicines Comprehensive Database 84.3%, Natural Standard 58.8%, Lexi-Natural 50.0%, and AltMedDex 36.3%, with Natural Medicines Comprehensive Database statistically superior (p < 0.01). Completeness scores were: Natural Medicines Comprehensive Database 78.4%, Natural Standard 51.0%, Lexi-Natural 43.5%, and AltMedDex 29.7%. Lexi-Natural was superior in ease of use (p < 0.01). Composite scores for PDA databases were: Natural Medicines Comprehensive Database 79.3, Natural Standard 53.0, Lexi-Natural 48.0, and AltMedDex 32.5, with Natural Medicines Comprehensive Database superior (p < 0.01). There was no difference between the scope for PDA and online database pairs with Lexi-Natural (50.0% and 53.9%, respectively) or Natural Medicines Comprehensive Database (84.3% and 84.3%, respectively) (p > 0.05), whereas differences existed for AltMedDex (36.3% vs 74.5%, respectively) and Natural Standard (58.8% vs 80.4%, respectively) (p < 0.01). 
For composite scores, AltMedDex and Natural Standard online were better than their PDA counterparts (p < 0.01). Natural Medicines Comprehensive Database achieved significantly higher scope, completeness, and composite scores compared with other dietary supplement PDA CDSTs in this study. There was no difference between the PDA and online databases for Lexi-Natural and Natural Medicines Comprehensive Database, whereas online versions of AltMedDex and Natural Standard were significantly better than their PDA counterparts.
Silbernagel, Karen M; Jechorek, Robert P; Kaufer, Amanda L; Johnson, Ronald L; Aleo, V; Brown, B; Buen, M; Buresh, J; Carson, M; Franklin, J; Ham, P; Humes, L; Husby, G; Hutchins, J; Jechorek, R; Jenkins, J; Kaufer, A; Kexel, N; Kora, L; Lam, L; Lau, D; Leighton, S; Loftis, M; Luc, S; Martin, J; Nacar, I; Nogle, J; Park, J; Schultz, A; Seymore, D; Smith, C; Smith, J; Thou, P; Ulmer, M; Voss, R; Weaver, V
2005-01-01
A multilaboratory study was conducted to compare the VIDAS LIS immunoassay with the standard cultural methods for the detection of Listeria in foods using an enrichment modification of AOAC Official Method 999.06. The modified enrichment protocol was implemented to harmonize the VIDAS LIS assay with the VIDAS LMO2 assay. Five food types--brie cheese, vanilla ice cream, frozen green beans, frozen raw tilapia fish, and cooked roast beef--at 3 inoculation levels, were analyzed by each method. A total of 15 laboratories representing government and industry participated. In this study, 1206 test portions were tested, of which 1170 were used in the statistical analysis. There were 433 positive by the VIDAS LIS assay and 396 positive by the standard culture methods. A chi-square analysis of each of the 5 food types, at the 3 inoculation levels tested, was performed. The resulting average chi-square value, 0.42, indicated that, overall, there were no statistical differences between the VIDAS LIS assay and the standard methods at the 5% level of significance.
Mendell, M J; Eliseeva, E A; Davies, M M; Lobscheid, A
2016-08-01
Limited evidence has associated lower ventilation rates (VRs) in schools with reduced student learning or achievement. We analyzed longitudinal data collected over two school years from 150 classrooms in 28 schools within three California school districts. We estimated daily classroom VRs from real-time indoor carbon dioxide measured by web-connected sensors. School districts provided individual-level scores on standard tests in Math and English, and classroom-level demographic data. Analyses assessing learning effects used two VR metrics: average VRs for 30 days prior to tests, and proportion of prior daily VRs above specified thresholds during the year. We estimated relationships between scores and VR metrics in multivariate models with generalized estimating equations. All school districts had median school-year VRs below the California VR standard. Most models showed some positive associations of VRs with test scores; however, estimates varied in magnitude and few 95% confidence intervals excluded the null. Combined-district models estimated statistically significant increases of 0.6 points (P = 0.01) on English tests for each 10% increase in prior 30-day VRs. Estimated increases in Math were of similar magnitude but not statistically significant. Findings suggest potential small positive associations between classroom VRs and learning. Published 2015. This article is a U.S. Government work and is in the public domain in the USA.
ERIC Educational Resources Information Center
Airola, Denise Tobin
2011-01-01
Changes to state tests impact the ability of State Education Agencies (SEAs) to monitor change in performance over time. The purpose of this study was to evaluate the Standardized Performance Growth Index (PGIz), a proposed statistical model for measuring change in student and school performance, across transitions in tests. The PGIz is a…
ERIC Educational Resources Information Center
Stoneberg, Bert D.
2016-01-01
Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests (ISAT). ISAT results have been reported almost exclusively as "percent proficient" statistics (i.e., the percentage of Idaho students who performed at the "A" level…
ERIC Educational Resources Information Center
Stoneberg, Bert D.
2018-01-01
Idaho uses the English Language Arts and Mathematics tests from the Smarter Balanced Assessment Consortium (SBAC) for the Idaho Standard Achievement Tests. ISAT results have been reported almost exclusively as "percent proficient or above" statistics (i.e., the percentage of Idaho students who performed at the "A" level). This…
Review of research designs and statistical methods employed in dental postgraduate dissertations.
Shirahatti, Ravi V; Hegde-Shetiya, Sahana
2015-01-01
There is a need to evaluate the quality of postgraduate dissertations of dentistry submitted to the university in the light of international standards of reporting. We conducted the review with the objective of documenting the use of sampling methods, measurement standardization, blinding, methods to eliminate bias, appropriate use of statistical tests, and appropriate use of data presentation in postgraduate dental research, and of suggesting and recommending modifications. The public access database of the dissertations from Rajiv Gandhi University of Health Sciences was reviewed. Three hundred and thirty-three eligible dissertations underwent preliminary evaluation, followed by detailed evaluation of 10% of randomly selected dissertations. The dissertations were assessed based on international reporting guidelines such as strengthening the reporting of observational studies in epidemiology (STROBE), consolidated standards of reporting trials (CONSORT), and other scholarly resources. The data were compiled using MS Excel and SPSS 10.0. Numbers and percentages were used for describing the data. The "in vitro" studies were the most common type of research (39%), followed by observational (32%) and experimental studies (29%). The disciplines of conservative dentistry (92%) and prosthodontics (75%) reported high proportions of in vitro research, while oral surgery (80%) and periodontics (67%) conducted experimental studies as a major share of their research. Lacunae in the studies included observational studies not following random sampling (70%), experimental studies not following random allocation (75%), not mentioning blinding, confounding variables, and calibration of measurements, misrepresenting the data by inappropriate data presentation, errors in reporting probability values, and not reporting confidence intervals. A few studies showed grossly inappropriate choices of statistical tests, and many studies needed additional tests. 
Overall observations indicated the need to comply with standard guidelines of reporting research.
Marketing of personalized cancer care on the web: an analysis of Internet websites.
Gray, Stacy W; Cronin, Angel; Bair, Elizabeth; Lindeman, Neal; Viswanath, Vish; Janeway, Katherine A
2015-05-01
Internet marketing may accelerate the use of care based on genomic or tumor-derived data. However, online marketing may be detrimental if it endorses products of unproven benefit. We conducted an analysis of Internet websites to identify personalized cancer medicine (PCM) products and claims. A Delphi Panel categorized PCM as standard or nonstandard based on evidence of clinical utility. Fifty-five websites, sponsored by commercial entities, academic institutions, physicians, research institutes, and organizations, that marketed PCM included somatic (58%) and germline (20%) analysis, interpretive services (15%), and physicians/institutions offering personalized care (44%). Of 32 sites offering somatic analysis, 56% included specific test information (range 1-152 tests). All statistical tests were two-sided, and comparisons of website content were conducted using McNemar's test. More websites contained information about the benefits than limitations of PCM (85% vs 27%, P < .001). Websites specifying somatic analysis were statistically significantly more likely to market one or more nonstandard tests as compared with standard tests (88% vs 44%, P = .04). © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
A new exact and more powerful unconditional test of no treatment effect from binary matched pairs.
Lloyd, Chris J
2008-09-01
We consider the problem of testing for a difference in the probability of success from matched binary pairs. Starting with three standard inexact tests, the nuisance parameter is first estimated and then the residual dependence is eliminated by maximization, producing what I call an E+M P-value. The E+M P-value based on McNemar's statistic is shown numerically to dominate previous suggestions, including partially maximized P-values as described in Berger and Sidik (2003, Statistical Methods in Medical Research 12, 91-108). The latter method, however, may have computational advantages for large samples.
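As background for the comparison above, the standard exact conditional McNemar test that the E+M approach refines conditions on the number of discordant pairs; a minimal Python sketch of that baseline test follows (this is not the E+M procedure itself, which additionally estimates and then maximizes over the nuisance parameter):

```python
from math import comb

def mcnemar_exact_p(b, c):
    """Exact conditional (binomial) two-sided McNemar P-value.

    b, c: counts of the two kinds of discordant pairs (0/1 and 1/0).
    Conditional on n = b + c, b ~ Binomial(n, 0.5) under the null
    hypothesis of no treatment effect.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # One-sided tail probability, then doubled (capped at 1)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For example, with 1 and 5 discordant pairs the two-sided exact P-value is 0.21875; the conditional test discards the concordant pairs entirely, which is one source of the conservatism the E+M P-value addresses.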
Quality evaluation of no-reference MR images using multidirectional filters and image statistics.
Jang, Jinseong; Bang, Kihun; Jang, Hanbyol; Hwang, Dosik
2018-09-01
This study aimed to develop a fully automatic, no-reference image-quality assessment (IQA) method for MR images. New quality-aware features were obtained by applying multidirectional filters to MR images and examining the feature statistics. A histogram of these features was then fitted to a generalized Gaussian distribution function for which the shape parameters yielded different values depending on the type of distortion in the MR image. Standard feature statistics were established through a training process based on high-quality MR images without distortion. Subsequently, the feature statistics of a test MR image were calculated and compared with the standards. The quality score was calculated as the difference between the shape parameters of the test image and the undistorted standard images. The proposed IQA method showed a >0.99 correlation with the conventional full-reference assessment methods; accordingly, this proposed method yielded the best performance among no-reference IQA methods for images containing six types of synthetic, MR-specific distortions. In addition, for authentically distorted images, the proposed method yielded the highest correlation with subjective assessments by human observers, thus demonstrating its superior performance over other no-reference IQAs. Our proposed IQA was designed to consider MR-specific features and outperformed other no-reference IQAs designed mainly for photographic images. Magn Reson Med 80:914-924, 2018. © 2018 International Society for Magnetic Resonance in Medicine.
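The shape-parameter fitting step described above can be illustrated with standard moment matching for a zero-mean generalized Gaussian, a common device in no-reference IQA; the function and grid below are an illustrative sketch, not the authors' implementation:

```python
import numpy as np
from math import gamma

def ggd_shape(x):
    """Moment-matching estimate of the generalized Gaussian shape
    parameter for zero-mean samples x.

    Matches the sample ratio rho = E[x^2] / (E|x|)^2 against the
    theoretical ratio r(beta) = Gamma(1/beta)*Gamma(3/beta)/Gamma(2/beta)^2
    over a grid of candidate shapes and returns the closest beta.
    """
    x = np.asarray(x, dtype=float)
    rho = np.mean(x ** 2) / np.mean(np.abs(x)) ** 2
    betas = np.arange(0.2, 10.0, 0.001)
    r = np.array([gamma(1 / b) * gamma(3 / b) / gamma(2 / b) ** 2
                  for b in betas])
    return betas[np.argmin(np.abs(r - rho))]
```

A Gaussian input (shape parameter 2) gives rho ≈ π/2, so the estimator should return a value close to 2; heavier-tailed or flatter feature histograms yield smaller or larger shapes, which is what makes the parameter distortion-sensitive.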
Murphy, Thomas; Schwedock, Julie; Nguyen, Kham; Mills, Anna; Jones, David
2015-01-01
New recommendations for the validation of rapid microbiological methods have been included in the revised Technical Report No. 33 from the PDA. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This case study applies those statistical methods to accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological method system being evaluated for water bioburden testing. The results presented demonstrate that the statistical methods described in the PDA Technical Report No. 33 chapter can all be successfully applied to the rapid microbiological method data sets and gave the same interpretation for equivalence to the standard method. The rapid microbiological method was generally able to pass the requirements of PDA Technical Report No. 33, though the study shows that there can be occasional outlying results and that caution should be used when applying statistical methods to low average colony-forming unit values. Prior to use in a quality-controlled environment, any new method or technology has to be shown to work as designed by the manufacturer for the purpose required. For new rapid microbiological methods that detect and enumerate contaminating microorganisms, additional recommendations have been provided in the revised PDA Technical Report No. 33. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This paper applies those statistical methods to analyze accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological method system being validated for water bioburden testing. The case study demonstrates that the statistical methods described in the PDA Technical Report No. 33 chapter can be successfully applied to rapid microbiological method data sets and give the same comparability results for similarity or difference as the standard method. © PDA, Inc. 2015.
Bodenburg, Sebastian; Dopslaff, Nina
2008-01-01
The Dysexecutive Questionnaire (DEX; Behavioural Assessment of the Dysexecutive Syndrome, 1996) is a standardized instrument for measuring possible behavioral changes resulting from the dysexecutive syndrome. Although initially intended only as a qualitative instrument, the DEX has increasingly been used to address quantitative problems as well. Until now there have been no more fundamental statistical analyses of the questionnaire's testing quality. The present study is based on an unselected sample of 191 patients with acquired brain injury and reports data on item quality, reliability, and the factorial structure of the DEX. Item 3 displayed too great an item difficulty, whereas item 11 was not sufficiently discriminating. The DEX's reliability in self-rating is r = 0.85. In addition to presenting the statistical values of the tests, a clinical severity classification of the overall scores of the four factors found and of the questionnaire as a whole is carried out on the basis of quartile standards.
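Item difficulty and discrimination of the kind reported for items 3 and 11 are conventionally computed as the item mean score and the corrected item-total correlation; a minimal sketch follows (illustrative classical item analysis, not the authors' exact procedure, and the toy data are hypothetical):

```python
import numpy as np

def item_stats(X):
    """Classical item analysis for a persons-by-items score matrix.

    Returns (difficulty, discrimination): the per-item mean score and
    the corrected item-total correlation, i.e. each item correlated
    with the total of the remaining items so the item does not
    correlate with itself.
    """
    X = np.asarray(X, dtype=float)
    difficulty = X.mean(axis=0)
    total = X.sum(axis=1)
    discrimination = np.array([
        np.corrcoef(X[:, j], total - X[:, j])[0, 1]
        for j in range(X.shape[1])
    ])
    return difficulty, discrimination
```

An item with difficulty near 0 or 1 (almost everyone scores the same) carries little information, and a low or negative corrected item-total correlation marks an item that fails to discriminate, which is the pattern reported for item 11.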
Meyer, B; Morin, V N; Rödger, H-J; Holah, J; Bird, C
2010-04-01
The results from European standard disinfectant tests are used as one basis to approve the use of disinfectants in Europe. The design of these laboratory-based tests should thus simulate as closely as possible the practical conditions and challenges that the disinfectants would encounter in use. No evidence is available that the organic and microbial loading in these tests simulates actual levels in the food service sector. Total organic carbon (TOC) and total viable count (TVC) were determined on 17 visibly clean and 45 visibly dirty surfaces in two restaurants and the food preparation surfaces of a large retail store. These values were compared to reference values recovered from surfaces soiled with the organic and microbial loading, following the standard conditions of the European Surface Test for bactericidal efficacy, EN 13697. The TOC reference values for clean and dirty conditions were higher than the data from practice, but cannot be regarded as statistical outliers. This was considered a conservative assessment, however, as an additional nine TOC samples from visibly dirty surfaces were discarded from the analysis because their loading made them impossible to process. Similarly, the recovery of test organisms from surfaces contaminated according to EN 13697 was higher than the TVC from visibly dirty surfaces in practice, though these values could not be regarded as statistical outliers of the whole data set. No correlation was found between TVC and TOC in the sampled data, which re-emphasizes the potential presence of micro-organisms on visibly clean surfaces and thus the need for the same degree of disinfection as for visibly dirty surfaces. The organic soil and microbial burden used in EN disinfectant standards represent a realistic worst-case scenario for disinfectants used in food service and food-processing areas.
Dimova, Violeta; Oertel, Bruno G; Lötsch, Jörn
2017-01-01
Skin sensitivity to sensory stimuli varies among different body areas. A standardized clinical quantitative sensory testing (QST) battery, established for the diagnosis of neuropathic pain, was used to assess whether the magnitude of differences between test sites reaches clinical significance. Ten different sensory QST measures derived from thermal and mechanical stimuli were obtained from 21 healthy volunteers (10 men) and used to create somatosensory profiles bilaterally at the dorsum of the hands (the standard area for the assessment of normative values for the upper extremities, as proposed by the German Research Network on Neuropathic Pain) and bilaterally at the volar forearms as a neighboring nonstandard area. The parameters obtained were statistically compared between test sites. Three of the 10 QST parameters differed significantly with respect to body area: warmth detection, thermal sensory limen, and mechanical pain thresholds. After z-transformation and interpretation according to the QST battery's standard instructions, 22 abnormal values were obtained at the hand. Applying the same procedure to parameters assessed at the nonstandard forearm site, that is, z-transforming them against the reference values for the hand, 24 values emerged as abnormal, which was not significantly different from the hand (P = 0.4185). Sensory differences between neighboring body areas are statistically significant, reproducing prior knowledge. This has to be considered in scientific assessments where a small variation of the tested body areas may not be an option. However, the magnitude of these differences was below the difference in sensory parameters judged as abnormal, indicating robustness of the QST instrument against protocol deviations with respect to the test area when using the method of comparison with a 95% confidence interval of a reference dataset.
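The z-transformation used above to flag abnormal QST values can be sketched as follows; the reference mean and standard deviation in the example are hypothetical placeholders, not the German Research Network's normative data:

```python
import numpy as np

def z_transform(values, ref_mean, ref_sd):
    """Z-transform raw sensory parameters against reference data
    (mean and SD from a normative sample for the same test site)."""
    return (np.asarray(values, dtype=float) - ref_mean) / ref_sd

def is_abnormal(z, cutoff=1.96):
    """Flag values outside the 95% reference interval (|z| > 1.96)."""
    return np.abs(z) > cutoff
```

Z-transforming forearm measurements against hand reference values, as the study does, shifts some z-scores, but a value only counts as abnormal once it leaves the 95% reference interval, which is why the between-site differences did not change the abnormality counts significantly.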
ERIC Educational Resources Information Center
Sklar, Jeffrey C.; Zwick, Rebecca
2009-01-01
Proper interpretation of standardized test scores is a crucial skill for K-12 teachers and school personnel; however, many do not have sufficient knowledge of measurement concepts to appropriately interpret and communicate test results. In a recent four-year project funded by the National Science Foundation, three web-based instructional…
The Influence of Ability Grouping on Math Achievement in a Rural Middle School
ERIC Educational Resources Information Center
Pritchard, Robert R.
2012-01-01
The researcher examined the academic performance of low-tracked students (n = 156) using standardized math test scores to determine whether there is a statistically significant difference in achievement depending on academic environment, tracked or nontracked. An analysis of variance (ANOVA) was calculated, using a paired samples t-test for a…
Observation of the rare Bs0 → μ+μ- decay from the combined analysis of CMS and LHCb data
NASA Astrophysics Data System (ADS)
Cms Collaboration; Khachatryan, V.; Sirunyan, A. M.; Tumasyan, A.; Adam, W.; Bergauer, T.; Dragicevic, M.; Erö, J.; Friedl, M.; Frühwirth, R.; Ghete, V. M.; Hartl, C.; Hörmann, N.; Hrubec, J.; Jeitler, M.; Kiesenhofer, W.; Knünz, V.; Krammer, M.; Krätschmer, I.; Liko, D.; Mikulec, I.; Rabady, D.; Rahbaran, B.; Rohringer, H.; Schöfbeck, R.; Strauss, J.; Treberer-Treberspurg, W.; Waltenberger, W.; Wulz, C.-E.; Mossolov, V.; Shumeiko, N.; Suarez Gonzalez, J.; Alderweireldt, S.; Bansal, S.; Cornelis, T.; de Wolf, E. A.; Janssen, X.; Knutsson, A.; Lauwers, J.; Luyckx, S.; Ochesanu, S.; Rougny, R.; van de Klundert, M.; van Haevermaet, H.; van Mechelen, P.; van Remortel, N.; van Spilbeeck, A.; Blekman, F.; Blyweert, S.; D'Hondt, J.; Daci, N.; Heracleous, N.; Keaveney, J.; Lowette, S.; Maes, M.; Olbrechts, A.; Python, Q.; Strom, D.; Tavernier, S.; van Doninck, W.; van Mulders, P.; van Onsem, G. P.; Villella, I.; Caillol, C.; Clerbaux, B.; de Lentdecker, G.; Dobur, D.; Favart, L.; Gay, A. P. R.; Grebenyuk, A.; Léonard, A.; Mohammadi, A.; Perniè, L.; Randle-Conde, A.; Reis, T.; Seva, T.; Thomas, L.; Vander Velde, C.; Vanlaer, P.; Wang, J.; Zenoni, F.; Adler, V.; Beernaert, K.; Benucci, L.; Cimmino, A.; Costantini, S.; Crucy, S.; Dildick, S.; Fagot, A.; Garcia, G.; McCartin, J.; Ocampo Rios, A. A.; Ryckbosch, D.; Salva Diblen, S.; Sigamani, M.; Strobbe, N.; Thyssen, F.; Tytgat, M.; Yazgan, E.; Zaganidis, N.; Basegmez, S.; Beluffi, C.; Bruno, G.; Castello, R.; Caudron, A.; Ceard, L.; da Silveira, G. G.; Delaere, C.; Du Pree, T.; Favart, D.; Forthomme, L.; Giammanco, A.; Hollar, J.; Jafari, A.; Jez, P.; Komm, M.; Lemaitre, V.; Nuttens, C.; Pagano, D.; Perrini, L.; Pin, A.; Piotrzkowski, K.; Popov, A.; Quertenmont, L.; Selvaggi, M.; Vidal Marono, M.; Vizan Garcia, J. M.; Beliy, N.; Caebergs, T.; Daubie, E.; Hammad, G. H.; Aldá Júnior, W. L.; Alves, G. A.; Brito, L.; Correa Martins Junior, M.; Dos Reis Martins, T.; Mora Herrera, C.; Pol, M. 
E.; Rebello Teles, P.; Carvalho, W.; Chinellato, J.; Custódio, A.; da Costa, E. M.; de Jesus Damiao, D.; de Oliveira Martins, C.; Fonseca de Souza, S.; Malbouisson, H.; Matos Figueiredo, D.; Mundim, L.; Nogima, H.; Prado da Silva, W. L.; Santaolalla, J.; Santoro, A.; Sznajder, A.; Tonelli Manganote, E. J.; Vilela Pereira, A.; Bernardes, C. A.; Dogra, S.; Fernandez Perez Tomei, T. R.; Gregores, E. M.; Mercadante, P. G.; Novaes, S. F.; Padula, Sandra S.; Aleksandrov, A.; Genchev, V.; Hadjiiska, R.; Iaydjiev, P.; Marinov, A.; Piperov, S.; Rodozov, M.; Sultanov, G.; Vutova, M.; Dimitrov, A.; Glushkov, I.; Litov, L.; Pavlov, B.; Petkov, P.; Bian, J. G.; Chen, G. M.; Chen, H. S.; Chen, M.; Cheng, T.; Du, R.; Jiang, C. H.; Plestina, R.; Romeo, F.; Tao, J.; Wang, Z.; Asawatangtrakuldee, C.; Ban, Y.; Li, Q.; Liu, S.; Mao, Y.; Qian, S. J.; Wang, D.; Xu, Z.; Zou, W.; Avila, C.; Cabrera, A.; Chaparro Sierra, L. F.; Florez, C.; Gomez, J. P.; Gomez Moreno, B.; Sanabria, J. C.; Godinovic, N.; Lelas, D.; Polic, D.; Puljak, I.; Antunovic, Z.; Kovac, M.; Brigljevic, V.; Kadija, K.; Luetic, J.; Mekterovic, D.; Sudic, L.; Attikis, A.; Mavromanolakis, G.; Mousa, J.; Nicolaou, C.; Ptochos, F.; Razis, P. A.; Bodlak, M.; Finger, M.; Finger, M., Jr.; Assran, Y.; Ellithi Kamel, A.; Mahmoud, M. A.; Radi, A.; Kadastik, M.; Murumaa, M.; Raidal, M.; Tiko, A.; Eerola, P.; Fedi, G.; Voutilainen, M.; Härkönen, J.; Karimäki, V.; Kinnunen, R.; Kortelainen, M. J.; Lampén, T.; Lassila-Perini, K.; Lehti, S.; Lindén, T.; Luukka, P.; Mäenpää, T.; Peltola, T.; Tuominen, E.; Tuominiemi, J.; Tuovinen, E.; Wendland, L.; Talvitie, J.; Tuuva, T.; Besancon, M.; Couderc, F.; Dejardin, M.; Denegri, D.; Fabbro, B.; Faure, J. 
L.; Favaro, C.; Ferri, F.; Ganjour, S.; Givernaud, A.; Gras, P.; Hamel de Monchenault, G.; Jarry, P.; Locci, E.; Malcles, J.; Rander, J.; Rosowsky, A.; Titov, M.; Baffioni, S.; Beaudette, F.; Busson, P.; Charlot, C.; Dahms, T.; Dalchenko, M.; Dobrzynski, L.; Filipovic, N.; Florent, A.; Granier de Cassagnac, R.; Mastrolorenzo, L.; Miné, P.; Mironov, C.; Naranjo, I. N.; Nguyen, M.; Ochando, C.; Ortona, G.; Paganini, P.; Regnard, S.; Salerno, R.; Sauvan, J. B.; Sirois, Y.; Veelken, C.; Yilmaz, Y.; Zabi, A.; Agram, J.-L.; Andrea, J.; Aubin, A.; Bloch, D.; Brom, J.-M.; Chabert, E. C.; Collard, C.; Conte, E.; Fontaine, J.-C.; Gelé, D.; Goerlach, U.; Goetzmann, C.; Le Bihan, A.-C.; Skovpen, K.; van Hove, P.; Gadrat, S.; Beauceron, S.; Beaupere, N.; Boudoul, G.; Bouvier, E.; Brochet, S.; Carrillo Montoya, C. A.; Chasserat, J.; Chierici, R.; Contardo, D.; Depasse, P.; El Mamouni, H.; Fan, J.; Fay, J.; Gascon, S.; Gouzevitch, M.; Ille, B.; Kurca, T.; Lethuillier, M.; Mirabito, L.; Perries, S.; Ruiz Alvarez, J. D.; Sabes, D.; Sgandurra, L.; Sordini, V.; Vander Donckt, M.; Verdier, P.; Viret, S.; Xiao, H.; Tsamalaidze, Z.; Autermann, C.; Beranek, S.; Bontenackels, M.; Edelhoff, M.; Feld, L.; Heister, A.; Hindrichs, O.; Klein, K.; Ostapchuk, A.; Raupach, F.; Sammet, J.; Schael, S.; Schulte, J. F.; Weber, H.; Wittmer, B.; Zhukov, V.; Ata, M.; Brodski, M.; Dietz-Laursonn, E.; Duchardt, D.; Erdmann, M.; Fischer, R.; Güth, A.; Hebbeker, T.; Heidemann, C.; Hoepfner, K.; Klingebiel, D.; Knutzen, S.; Kreuzer, P.; Merschmeyer, M.; Meyer, A.; Millet, P.; Olschewski, M.; Padeken, K.; Papacz, P.; Reithler, H.; Schmitz, S. A.; Sonnenschein, L.; Teyssier, D.; Thüer, S.; Weber, M.; Cherepanov, V.; Erdogan, Y.; Flügge, G.; Geenen, H.; Geisler, M.; Haj Ahmad, W.; Hoehle, F.; Kargoll, B.; Kress, T.; Kuessel, Y.; Künsken, A.; Lingemann, J.; Nowack, A.; Nugent, I. M.; Pooth, O.; Stahl, A.; Aldaya Martin, M.; Asin, I.; Bartosik, N.; Behr, J.; Behrens, U.; Bell, A. 
J.; Bethani, A.; Borras, K.; Burgmeier, A.; Cakir, A.; Calligaris, L.; Campbell, A.; Choudhury, S.; Costanza, F.; Diez Pardos, C.; Dolinska, G.; Dooling, S.; Dorland, T.; Eckerlin, G.; Eckstein, D.; Eichhorn, T.; Flucke, G.; Garay Garcia, J.; Geiser, A.; Gunnellini, P.; Hauk, J.; Hempel, M.; Jung, H.; Kalogeropoulos, A.; Kasemann, M.; Katsas, P.; Kieseler, J.; Kleinwort, C.; Korol, I.; Krücker, D.; Lange, W.; Leonard, J.; Lipka, K.; Lobanov, A.; Lohmann, W.; Lutz, B.; Mankel, R.; Marfin, I.; Melzer-Pellmann, I.-A.; Meyer, A. B.; Mittag, G.; Mnich, J.; Mussgiller, A.; Naumann-Emme, S.; Nayak, A.; Ntomari, E.; Perrey, H.; Pitzl, D.; Placakyte, R.; Raspereza, A.; Ribeiro Cipriano, P. M.; Roland, B.; Ron, E.; Sahin, M. Ö.; Salfeld-Nebgen, J.; Saxena, P.; Schoerner-Sadenius, T.; Schröder, M.; Seitz, C.; Spannagel, S.; Vargas Trevino, A. D. R.; Walsh, R.; Wissing, C.; Blobel, V.; Centis Vignali, M.; Draeger, A. R.; Erfle, J.; Garutti, E.; Goebel, K.; Görner, M.; Haller, J.; Hoffmann, M.; Höing, R. S.; Junkes, A.; Kirschenmann, H.; Klanner, R.; Kogler, R.; Lange, J.; Lapsien, T.; Lenz, T.; Marchesini, I.; Ott, J.; Peiffer, T.; Perieanu, A.; Pietsch, N.; Poehlsen, J.; Poehlsen, T.; Rathjens, D.; Sander, C.; Schettler, H.; Schleper, P.; Schlieckau, E.; Schmidt, A.; Seidel, M.; Sola, V.; Stadie, H.; Steinbrück, G.; Troendle, D.; Usai, E.; Vanelderen, L.; Vanhoefer, A.; Barth, C.; Baus, C.; Berger, J.; Böser, C.; Butz, E.; Chwalek, T.; de Boer, W.; Descroix, A.; Dierlamm, A.; Feindt, M.; Frensch, F.; Giffels, M.; Gilbert, A.; Hartmann, F.; Hauth, T.; Husemann, U.; Katkov, I.; Kornmayer, A.; Kuznetsova, E.; Lobelle Pardo, P.; Mozer, M. U.; Müller, T.; Müller, Th.; Nürnberg, A.; Quast, G.; Rabbertz, K.; Röcker, S.; Simonis, H. J.; Stober, F. M.; Ulrich, R.; Wagner-Kuhr, J.; Wayand, S.; Weiler, T.; Wolf, R.; Anagnostou, G.; Daskalakis, G.; Geralis, T.; Giakoumopoulou, V. 
A.; Kyriakis, A.; Loukas, D.; Markou, A.; Markou, C.; Psallidas, A.; Topsis-Giotis, I.; Agapitos, A.; Kesisoglou, S.; Panagiotou, A.; Saoulidou, N.; Stiliaris, E.; Aslanoglou, X.; Evangelou, I.; Flouris, G.; Foudas, C.; Kokkas, P.; Manthos, N.; Papadopoulos, I.; Paradas, E.; Strologas, J.; Bencze, G.; Hajdu, C.; Hidas, P.; Horvath, D.; Sikler, F.; Veszpremi, V.; Vesztergombi, G.; Zsigmond, A. J.; Beni, N.; Czellar, S.; Karancsi, J.; Molnar, J.; Palinkas, J.; Szillasi, Z.; Makovec, A.; Raics, P.; Trocsanyi, Z. L.; Ujvari, B.; Sahoo, N.; Swain, S. K.; Beri, S. B.; Bhatnagar, V.; Gupta, R.; Bhawandeep, U.; Kalsi, A. K.; Kaur, M.; Kumar, R.; Mittal, M.; Nishu, N.; Singh, J. B.; Ashok Kumar; Arun Kumar; Ahuja, S.; Bhardwaj, A.; Choudhary, B. C.; Kumar, A.; Malhotra, S.; Naimuddin, M.; Ranjan, K.; Sharma, V.; Banerjee, S.; Bhattacharya, S.; Chatterjee, K.; Dutta, S.; Gomber, B.; Jain, Sa.; Jain, Sh.; Khurana, R.; Modak, A.; Mukherjee, S.; Roy, D.; Sarkar, S.; Sharan, M.; Abdulsalam, A.; Dutta, D.; Kailas, S.; Kumar, V.; Mohanty, A. K.; Pant, L. M.; Shukla, P.; Topkar, A.; Aziz, T.; Banerjee, S.; Bhowmik, S.; Chatterjee, R. M.; Dewanjee, R. K.; Dugad, S.; Ganguly, S.; Ghosh, S.; Guchait, M.; Gurtu, A.; Kole, G.; Kumar, S.; Maity, M.; Majumder, G.; Mazumdar, K.; Mohanty, G. B.; Parida, B.; Sudhakar, K.; Wickramage, N.; Bakhshiansohi, H.; Behnamian, H.; Etesami, S. M.; Fahim, A.; Goldouzian, R.; Khakzad, M.; Mohammadi Najafabadi, M.; Naseri, M.; Paktinat Mehdiabadi, S.; Rezaei Hosseinabadi, F.; Safarzadeh, B.; Zeinali, M.; Felcini, M.; Grunewald, M.; Abbrescia, M.; Calabria, C.; Chhibra, S. S.; Colaleo, A.; Creanza, D.; de Filippis, N.; de Palma, M.; Fiore, L.; Iaselli, G.; Maggi, G.; Maggi, M.; My, S.; Nuzzo, S.; Pompili, A.; Pugliese, G.; Radogna, R.; Selvaggi, G.; Sharma, A.; Silvestris, L.; Venditti, R.; Verwilligen, P.; Abbiendi, G.; Benvenuti, A. C.; Bonacorsi, D.; Braibant-Giacomelli, S.; Brigliadori, L.; Campanini, R.; Capiluppi, P.; Castro, A.; Cavallo, F. 
R.; Codispoti, G.; Cuffiani, M.; Dallavalle, G. M.; Fabbri, F.; Fanfani, A.; Fasanella, D.; Giacomelli, P.; Grandi, C.; Guiducci, L.; Marcellini, S.; Masetti, G.; Montanari, A.; Navarria, F. L.; Perrotta, A.; Primavera, F.; Rossi, A. M.; Rovelli, T.; Siroli, G. P.; Tosi, N.; Travaglini, R.; Albergo, S.; Cappello, G.; Chiorboli, M.; Costa, S.; Giordano, F.; Potenza, R.; Tricomi, A.; Tuve, C.; Barbagli, G.; Ciulli, V.; Civinini, C.; D'Alessandro, R.; Focardi, E.; Gallo, E.; Gonzi, S.; Gori, V.; Lenzi, P.; Meschini, M.; Paoletti, S.; Sguazzoni, G.; Tropiano, A.; Benussi, L.; Bianco, S.; Fabbri, F.; Piccolo, D.; Ferretti, R.; Ferro, F.; Lo Vetere, M.; Robutti, E.; Tosi, S.; Dinardo, M. E.; Fiorendi, S.; Gennai, S.; Gerosa, R.; Ghezzi, A.; Govoni, P.; Lucchini, M. T.; Malvezzi, S.; Manzoni, R. A.; Martelli, A.; Marzocchi, B.; Menasce, D.; Moroni, L.; Paganoni, M.; Pedrini, D.; Ragazzi, S.; Redaelli, N.; Tabarelli de Fatis, T.; Buontempo, S.; Cavallo, N.; di Guida, S.; Fabozzi, F.; Iorio, A. O. M.; Lista, L.; Meola, S.; Merola, M.; Paolucci, P.; Azzi, P.; Bacchetta, N.; Bisello, D.; Branca, A.; Carlin, R.; Checchia, P.; Dall'Osso, M.; Dorigo, T.; Dosselli, U.; Galanti, M.; Gasparini, F.; Gasparini, U.; Giubilato, P.; Gozzelino, A.; Kanishchev, K.; Lacaprara, S.; Margoni, M.; Meneguzzo, A. T.; Pazzini, J.; Pozzobon, N.; Ronchese, P.; Simonetto, F.; Torassa, E.; Tosi, M.; Zotto, P.; Zucchetta, A.; Zumerle, G.; Gabusi, M.; Ratti, S. P.; Re, V.; Riccardi, C.; Salvini, P.; Vitulo, P.; Biasini, M.; Bilei, G. M.; Ciangottini, D.; Fanò, L.; Lariccia, P.; Mantovani, G.; Menichelli, M.; Saha, A.; Santocchia, A.; Spiezia, A.; Androsov, K.; Azzurri, P.; Bagliesi, G.; Bernardini, J.; Boccali, T.; Broccolo, G.; Castaldi, R.; Ciocci, M. A.; Dell'Orso, R.; Donato, S.; Fiori, F.; Foà, L.; Giassi, A.; Grippo, M. T.; Ligabue, F.; Lomtadze, T.; Martini, L.; Messineo, A.; Moon, C. S.; Palla, F.; Rizzi, A.; Savoy-Navarro, A.; Serban, A. 
T.; Spagnolo, P.; Squillacioti, P.; Tenchini, R.; Tonelli, G.; Venturi, A.; Verdini, P. G.; Vernieri, C.; Barone, L.; Cavallari, F.; D'Imperio, G.; Del Re, D.; Diemoz, M.; Jorda, C.; Longo, E.; Margaroli, F.; Meridiani, P.; Micheli, F.; Nourbakhsh, S.; Organtini, G.; Paramatti, R.; Rahatlou, S.; Rovelli, C.; Santanastasio, F.; Soffi, L.; Traczyk, P.; Amapane, N.; Arcidiacono, R.; Argiro, S.; Arneodo, M.; Bellan, R.; Biino, C.; Cartiglia, N.; Casasso, S.; Costa, M.; Degano, A.; Demaria, N.; Finco, L.; Mariotti, C.; Maselli, S.; Migliore, E.; Monaco, V.; Musich, M.; Obertino, M. M.; Pacher, L.; Pastrone, N.; Pelliccioni, M.; Pinna Angioni, G. L.; Potenza, A.; Romero, A.; Ruspa, M.; Sacchi, R.; Solano, A.; Staiano, A.; Tamponi, U.; Belforte, S.; Candelise, V.; Casarsa, M.; Cossutti, F.; Della Ricca, G.; Gobbo, B.; La Licata, C.; Marone, M.; Schizzi, A.; Umer, T.; Zanetti, A.; Chang, S.; Kropivnitskaya, A.; Nam, S. K.; Kim, D. H.; Kim, G. N.; Kim, M. S.; Kong, D. J.; Lee, S.; Oh, Y. D.; Park, H.; Sakharov, A.; Son, D. C.; Kim, T. J.; Kim, J. Y.; Song, S.; Choi, S.; Gyun, D.; Hong, B.; Jo, M.; Kim, H.; Kim, Y.; Lee, B.; Lee, K. S.; Park, S. K.; Roh, Y.; Yoo, H. D.; Choi, M.; Kim, J. H.; Park, I. C.; Ryu, G.; Ryu, M. S.; Choi, Y.; Choi, Y. K.; Goh, J.; Kim, D.; Kwon, E.; Lee, J.; Yu, I.; Juodagalvis, A.; Komaragiri, J. R.; Md Ali, M. A. B.; Casimiro Linares, E.; Castilla-Valdez, H.; de La Cruz-Burelo, E.; Heredia-de La Cruz, I.; Hernandez-Almada, A.; Lopez-Fernandez, R.; Sanchez-Hernandez, A.; Carrillo Moreno, S.; Vazquez Valencia, F.; Pedraza, I.; Salazar Ibarguen, H. A.; Morelos Pineda, A.; Krofcheck, D.; Butler, P. H.; Reucroft, S.; Ahmad, A.; Ahmad, M.; Hassan, Q.; Hoorani, H. R.; Khan, W. 
A.; Khurshid, T.; Shoaib, M.; Bialkowska, H.; Bluj, M.; Boimska, B.; Frueboes, T.; Górski, M.; Kazana, M.; Nawrocki, K.; Romanowska-Rybinska, K.; Szleper, M.; Zalewski, P.; Brona, G.; Bunkowski, K.; Cwiok, M.; Dominik, W.; Doroba, K.; Kalinowski, A.; Konecki, M.; Krolikowski, J.; Misiura, M.; Olszewski, M.; Wolszczak, W.; Bargassa, P.; Beirão da Cruz E Silva, C.; Faccioli, P.; Ferreira Parracho, P. G.; Gallinaro, M.; Lloret Iglesias, L.; Nguyen, F.; Rodrigues Antunes, J.; Seixas, J.; Varela, J.; Vischia, P.; Afanasiev, S.; Bunin, P.; Gavrilenko, M.; Golutvin, I.; Gorbunov, I.; Kamenev, A.; Karjavin, V.; Konoplyanikov, V.; Lanev, A.; Malakhov, A.; Matveev, V.; Moisenz, P.; Palichik, V.; Perelygin, V.; Shmatov, S.; Skatchkov, N.; Smirnov, V.; Zarubin, A.; Golovtsov, V.; Ivanov, Y.; Kim, V.; Levchenko, P.; Murzin, V.; Oreshkin, V.; Smirnov, I.; Sulimov, V.; Uvarov, L.; Vavilov, S.; Vorobyev, A.; Vorobyev, An.; Andreev, Yu.; Dermenev, A.; Gninenko, S.; Golubev, N.; Kirsanov, M.; Krasnikov, N.; Pashenkov, A.; Tlisov, D.; Toropin, A.; Epshteyn, V.; Gavrilov, V.; Lychkovskaya, N.; Popov, V.; Pozdnyakov, I.; Safronov, G.; Semenov, S.; Spiridonov, A.; Stolin, V.; Vlasov, E.; Zhokin, A.; Andreev, V.; Azarkin, M.; Dremin, I.; Kirakosyan, M.; Leonidov, A.; Mesyats, G.; Rusakov, S. V.; Vinogradov, A.; Belyaev, A.; Boos, E.; Dubinin, M.; Dudko, L.; Ershov, A.; Gribushin, A.; Klyukhin, V.; Kodolova, O.; Lokhtin, I.; Obraztsov, S.; Petrushanko, S.; Savrin, V.; Snigirev, A.; Azhgirey, I.; Bayshev, I.; Bitioukov, S.; Kachanov, V.; Kalinin, A.; Konstantinov, D.; Krychkine, V.; Petrov, V.; Ryutin, R.; Sobol, A.; Tourtchanovitch, L.; Troshin, S.; Tyurin, N.; Uzunian, A.; Volkov, A.; Adzic, P.; Ekmedzic, M.; Milosevic, J.; Rekovic, V.; Alcaraz Maestre, J.; Battilana, C.; Calvo, E.; Cerrada, M.; Chamizo Llatas, M.; Colino, N.; de La Cruz, B.; Delgado Peris, A.; Domínguez Vázquez, D.; Escalante Del Valle, A.; Fernandez Bedoya, C.; Fernández Ramos, J. P.; Flix, J.; Fouz, M. 
C.; Garcia-Abia, P.; Gonzalez Lopez, O.; Goy Lopez, S.; Hernandez, J. M.; Josa, M. I.; Navarro de Martino, E.; Pérez-Calero Yzquierdo, A.; Puerta Pelayo, J.; Quintario Olmeda, A.; Redondo, I.; Romero, L.; Soares, M. S.; Albajar, C.; de Trocóniz, J. F.; Missiroli, M.; Moran, D.; Brun, H.; Cuevas, J.; Fernandez Menendez, J.; Folgueras, S.; Gonzalez Caballero, I.; Brochero Cifuentes, J. A.; Cabrillo, I. J.; Calderon, A.; Duarte Campderros, J.; Fernandez, M.; Gomez, G.; Graziano, A.; Lopez Virto, A.; Marco, J.; Marco, R.; Martinez Rivero, C.; Matorras, F.; Munoz Sanchez, F. J.; Piedra Gomez, J.; Rodrigo, T.; Rodríguez-Marrero, A. Y.; Ruiz-Jimeno, A.; Scodellaro, L.; Vila, I.; Vilar Cortabitarte, R.; Abbaneo, D.; Auffray, E.; Auzinger, G.; Bachtis, M.; Baillon, P.; Ball, A. H.; Barney, D.; Benaglia, A.; Bendavid, J.; Benhabib, L.; Benitez, J. F.; Bernet, C.; Bloch, P.; Bocci, A.; Bonato, A.; Bondu, O.; Botta, C.; Breuker, H.; Camporesi, T.; Cerminara, G.; Colafranceschi, S.; D'Alfonso, M.; D'Enterria, D.; Dabrowski, A.; David, A.; de Guio, F.; de Roeck, A.; de Visscher, S.; di Marco, E.; Dobson, M.; Dordevic, M.; Dupont-Sagorin, N.; Elliott-Peisert, A.; Franzoni, G.; Funk, W.; Gigi, D.; Gill, K.; Giordano, D.; Girone, M.; Glege, F.; Guida, R.; Gundacker, S.; Guthoff, M.; Hammer, J.; Hansen, M.; Harris, P.; Hegeman, J.; Innocente, V.; Janot, P.; Kousouris, K.; Krajczar, K.; Lecoq, P.; Lourenço, C.; Magini, N.; Malgeri, L.; Mannelli, M.; Marrouche, J.; Masetti, L.; Meijers, F.; Mersi, S.; Meschi, E.; Moortgat, F.; Morovic, S.; Mulders, M.; Orsini, L.; Pape, L.; Perez, E.; Perrozzi, L.; Petrilli, A.; Petrucciani, G.; Pfeiffer, A.; Pimiä, M.; Piparo, D.; Plagge, M.; Racz, A.; Rolandi, G.; Rovere, M.; Sakulin, H.; Schäfer, C.; Schwick, C.; Sharma, A.; Siegrist, P.; Silva, P.; Simon, M.; Sphicas, P.; Spiga, D.; Steggemann, J.; Stieger, B.; Stoye, M.; Takahashi, Y.; Treille, D.; Tsirou, A.; Veres, G. I.; Wardle, N.; Wöhri, H. K.; Wollny, H.; Zeuner, W. 
D.; Bertl, W.; Deiters, K.; Erdmann, W.; Horisberger, R.; Ingram, Q.; Kaestli, H. C.; Kotlinski, D.; Renker, D.; Rohe, T.; Bachmair, F.; Bäni, L.; Bianchini, L.; Buchmann, M. A.; Casal, B.; Chanon, N.; Dissertori, G.; Dittmar, M.; Donegà, M.; Dünser, M.; Eller, P.; Grab, C.; Hits, D.; Hoss, J.; Lustermann, W.; Mangano, B.; Marini, A. C.; Marionneau, M.; Martinez Ruiz Del Arbol, P.; Masciovecchio, M.; Meister, D.; Mohr, N.; Musella, P.; Nägeli, C.; Nessi-Tedaldi, F.; Pandolfi, F.; Pauss, F.; Peruzzi, M.; Quittnat, M.; Rebane, L.; Rossini, M.; Starodumov, A.; Takahashi, M.; Theofilatos, K.; Wallny, R.; Weber, H. A.; Amsler, C.; Canelli, M. F.; Chiochia, V.; de Cosa, A.; Hinzmann, A.; Hreus, T.; Kilminster, B.; Lange, C.; Millan Mejias, B.; Ngadiuba, J.; Pinna, D.; Robmann, P.; Ronga, F. J.; Taroni, S.; Verzetti, M.; Yang, Y.; Cardaci, M.; Chen, K. H.; Ferro, C.; Kuo, C. M.; Lin, W.; Lu, Y. J.; Volpe, R.; Yu, S. S.; Chang, P.; Chang, Y. H.; Chang, Y. W.; Chao, Y.; Chen, K. F.; Chen, P. H.; Dietz, C.; Grundler, U.; Hou, W.-S.; Kao, K. Y.; Liu, Y. F.; Lu, R.-S.; Majumder, D.; Petrakou, E.; Tzeng, Y. M.; Wilken, R.; Asavapibhop, B.; Singh, G.; Srimanobhas, N.; Suwonjandee, N.; Adiguzel, A.; Bakirci, M. N.; Cerci, S.; Dozen, C.; Dumanoglu, I.; Eskut, E.; Girgis, S.; Gokbulut, G.; Gurpinar, E.; Hos, I.; Kangal, E. E.; Kayis Topaksu, A.; Onengut, G.; Ozdemir, K.; Ozturk, S.; Polatoz, A.; Sunar Cerci, D.; Tali, B.; Topakli, H.; Vergili, M.; Akin, I. V.; Bilin, B.; Bilmis, S.; Gamsizkan, H.; Isildak, B.; Karapinar, G.; Ocalan, K.; Sekmen, S.; Surat, U. E.; Yalvac, M.; Zeyrek, M.; Albayrak, E. A.; Gülmez, E.; Kaya, M.; Kaya, O.; Yetkin, T.; Cankocak, K.; Vardarlı, F. I.; Levchuk, L.; Sorokin, P.; Brooke, J. J.; Clement, E.; Cussans, D.; Flacher, H.; Goldstein, J.; Grimes, M.; Heath, G. P.; Heath, H. F.; Jacob, J.; Kreczko, L.; Lucas, C.; Meng, Z.; Newbold, D. M.; Paramesvaran, S.; Poll, A.; Sakuma, T.; Senkin, S.; Smith, V. J.; Bell, K. W.; Belyaev, A.; Brew, C.; Brown, R. 
M.; Cockerill, D. J. A.; Coughlan, J. A.; Harder, K.; Harper, S.; Olaiya, E.; Petyt, D.; Shepherd-Themistocleous, C. H.; Thea, A.; Tomalin, I. R.; Williams, T.; Womersley, W. J.; Worm, S. D.; Baber, M.; Bainbridge, R.; Buchmuller, O.; Burton, D.; Colling, D.; Cripps, N.; Dauncey, P.; Davies, G.; Della Negra, M.; Dunne, P.; Ferguson, W.; Fulcher, J.; Futyan, D.; Hall, G.; Iles, G.; Jarvis, M.; Karapostoli, G.; Kenzie, M.; Lane, R.; Lucas, R.; Lyons, L.; Magnan, A.-M.; Malik, S.; Mathias, B.; Nash, J.; Nikitenko, A.; Pela, J.; Pesaresi, M.; Petridis, K.; Raymond, D. M.; Rogerson, S.; Rose, A.; Seez, C.; Sharp, P.; Tapper, A.; Vazquez Acosta, M.; Virdee, T.; Zenz, S. C.; Cole, J. E.; Hobson, P. R.; Khan, A.; Kyberd, P.; Leggat, D.; Leslie, D.; Reid, I. D.; Symonds, P.; Teodorescu, L.; Turner, M.; Dittmann, J.; Hatakeyama, K.; Kasmi, A.; Liu, H.; Scarborough, T.; Charaf, O.; Cooper, S. I.; Henderson, C.; Rumerio, P.; Avetisyan, A.; Bose, T.; Fantasia, C.; Lawson, P.; Richardson, C.; Rohlf, J.; St. John, J.; Sulak, L.; Alimena, J.; Berry, E.; Bhattacharya, S.; Christopher, G.; Cutts, D.; Demiragli, Z.; Dhingra, N.; Ferapontov, A.; Garabedian, A.; Heintz, U.; Kukartsev, G.; Laird, E.; Landsberg, G.; Luk, M.; Narain, M.; Segala, M.; Sinthuprasith, T.; Speer, T.; Swanson, J.; Breedon, R.; Breto, G.; Calderon de La Barca Sanchez, M.; Chauhan, S.; Chertok, M.; Conway, J.; Conway, R.; Cox, P. T.; Erbacher, R.; Gardner, M.; Ko, W.; Lander, R.; Mulhearn, M.; Pellett, D.; Pilot, J.; Ricci-Tam, F.; Shalhout, S.; Smith, J.; Squires, M.; Stolp, D.; Tripathi, M.; Wilbur, S.; Yohay, R.; Cousins, R.; Everaerts, P.; Farrell, C.; Hauser, J.; Ignatenko, M.; Rakness, G.; Takasugi, E.; Valuev, V.; Weber, M.; Burt, K.; Clare, R.; Ellison, J.; Gary, J. W.; Hanson, G.; Heilman, J.; Ivova Rikova, M.; Jandir, P.; Kennedy, E.; Lacroix, F.; Long, O. R.; Luthra, A.; Malberti, M.; Olmedo Negrete, M.; Shrinivas, A.; Sumowidagdo, S.; Wimpenny, S.; Branson, J. G.; Cerati, G. 
B.; Cittolin, S.; D'Agnolo, R. T.; Holzner, A.; Kelley, R.; Klein, D.; Kovalskyi, D.; Letts, J.; MacNeill, I.; Olivito, D.; Padhi, S.; Palmer, C.; Pieri, M.; Sani, M.; Sharma, V.; Simon, S.; Tu, Y.; Vartak, A.; Welke, C.; Würthwein, F.; Yagil, A.; Barge, D.; Bradmiller-Feld, J.; Campagnari, C.; Danielson, T.; Dishaw, A.; Dutta, V.; Flowers, K.; Franco Sevilla, M.; Geffert, P.; George, C.; Golf, F.; Gouskos, L.; Incandela, J.; Justus, C.; McColl, N.; Richman, J.; Stuart, D.; To, W.; West, C.; Yoo, J.; Apresyan, A.; Bornheim, A.; Bunn, J.; Chen, Y.; Duarte, J.; Mott, A.; Newman, H. B.; Pena, C.; Pierini, M.; Spiropulu, M.; Vlimant, J. R.; Wilkinson, R.; Xie, S.; Zhu, R. Y.; Azzolini, V.; Calamba, A.; Carlson, B.; Ferguson, T.; Iiyama, Y.; Paulini, M.; Russ, J.; Vogel, H.; Vorobiev, I.; Cumalat, J. P.; Ford, W. T.; Gaz, A.; Krohn, M.; Luiggi Lopez, E.; Nauenberg, U.; Smith, J. G.; Stenson, K.; Wagner, S. R.; Alexander, J.; Chatterjee, A.; Chaves, J.; Chu, J.; Dittmer, S.; Eggert, N.; Mirman, N.; Nicolas Kaufman, G.; Patterson, J. R.; Ryd, A.; Salvati, E.; Skinnari, L.; Sun, W.; Teo, W. D.; Thom, J.; Thompson, J.; Tucker, J.; Weng, Y.; Winstrom, L.; Wittich, P.; Winn, D.; Abdullin, S.; Albrow, M.; Anderson, J.; Apollinari, G.; Bauerdick, L. A. T.; Beretvas, A.; Berryhill, J.; Bhat, P. C.; Bolla, G.; Burkett, K.; Butler, J. N.; Cheung, H. W. K.; Chlebana, F.; Cihangir, S.; Elvira, V. D.; Fisk, I.; Freeman, J.; Gao, Y.; Gottschalk, E.; Gray, L.; Green, D.; Grünendahl, S.; Gutsche, O.; Hanlon, J.; Hare, D.; Harris, R. M.; Hirschauer, J.; Hooberman, B.; Jindariani, S.; Johnson, M.; Joshi, U.; Kaadze, K.; Klima, B.; Kreis, B.; Kwan, S.; Linacre, J.; Lincoln, D.; Lipton, R.; Liu, T.; Lykken, J.; Maeshima, K.; Marraffino, J. M.; Martinez Outschoorn, V. I.; Maruyama, S.; Mason, D.; McBride, P.; Merkel, P.; Mishra, K.; Mrenna, S.; Nahn, S.; Newman-Holmes, C.; O'Dell, V.; Prokofyev, O.; Sexton-Kennedy, E.; Sharma, S.; Soha, A.; Spalding, W. 
J.; Spiegel, L.; Taylor, L.; Tkaczyk, S.; Tran, N. V.; Uplegger, L.; Vaandering, E. W.; Vidal, R.; Whitbeck, A.; Whitmore, J.; Yang, F.; Acosta, D.; Avery, P.; Bortignon, P.; Bourilkov, D.; Carver, M.; Curry, D.; Das, S.; de Gruttola, M.; di Giovanni, G. P.; Field, R. D.; Fisher, M.; Furic, I. K.; Hugon, J.; Konigsberg, J.; Korytov, A.; Kypreos, T.; Low, J. F.; Matchev, K.; Mei, H.; Milenovic, P.; Mitselmakher, G.; Muniz, L.; Rinkevicius, A.; Shchutska, L.; Snowball, M.; Sperka, D.; Yelton, J.; Zakaria, M.; Hewamanage, S.; Linn, S.; Markowitz, P.; Martinez, G.; Rodriguez, J. L.; Adams, T.; Askew, A.; Bochenek, J.; Diamond, B.; Haas, J.; Hagopian, S.; Hagopian, V.; Johnson, K. F.; Prosper, H.; Veeraraghavan, V.; Weinberg, M.; Baarmand, M. M.; Hohlmann, M.; Kalakhety, H.; Yumiceva, F.; Adams, M. R.; Apanasevich, L.; Berry, D.; Betts, R. R.; Bucinskaite, I.; Cavanaugh, R.; Evdokimov, O.; Gauthier, L.; Gerber, C. E.; Hofman, D. J.; Kurt, P.; Moon, D. H.; O'Brien, C.; Sandoval Gonzalez, I. D.; Silkworth, C.; Turner, P.; Varelas, N.; Bilki, B.; Clarida, W.; Dilsiz, K.; Haytmyradov, M.; Merlo, J.-P.; Mermerkaya, H.; Mestvirishvili, A.; Moeller, A.; Nachtman, J.; Ogul, H.; Onel, Y.; Ozok, F.; Penzo, A.; Rahmat, R.; Sen, S.; Tan, P.; Tiras, E.; Wetzel, J.; Yi, K.; Barnett, B. A.; Blumenfeld, B.; Bolognesi, S.; Fehling, D.; Gritsan, A. V.; Maksimovic, P.; Martin, C.; Swartz, M.; Baringer, P.; Bean, A.; Benelli, G.; Bruner, C.; Kenny, R. P., III; Malek, M.; Murray, M.; Noonan, D.; Sanders, S.; Sekaric, J.; Stringer, R.; Wang, Q.; Wood, J. S.; Chakaberia, I.; Ivanov, A.; Khalil, S.; Makouski, M.; Maravin, Y.; Saini, L. K.; Skhirtladze, N.; Svintradze, I.; Gronberg, J.; Lange, D.; Rebassoo, F.; Wright, D.; Baden, A.; Belloni, A.; Calvert, B.; Eno, S. C.; Gomez, J. A.; Hadley, N. J.; Kellogg, R. G.; Kolberg, T.; Lu, Y.; Mignerey, A. C.; Pedro, K.; Skuja, A.; Tonjes, M. B.; Tonwar, S. C.; Apyan, A.; Barbieri, R.; Bauer, G.; Busza, W.; Cali, I. 
A.; Chan, M.; Di Matteo, L.; Gomez Ceballos, G.; Goncharov, M.; Gulhan, D.; Klute, M.; Lai, Y. S.; Lee, Y.-J.; Levin, A.; Luckey, P. D.; Ma, T.; Paus, C.; Ralph, D.; Roland, C.; Roland, G.; Stephans, G. S. F.; Sumorok, K.; Velicanu, D.; Veverka, J.; Wyslouch, B.; Yang, M.; Zanetti, M.; Zhukova, V.; Dahmes, B.; Gude, A.; Kao, S. C.; Klapoetke, K.; Kubota, Y.; Mans, J.; Pastika, N.; Rusack, R.; Singovsky, A.; Tambe, N.; Turkewitz, J.; Acosta, J. G.; Oliveros, S.; Avdeeva, E.; Bloom, K.; Bose, S.; Claes, D. R.; Dominguez, A.; Gonzalez Suarez, R.; Keller, J.; Knowlton, D.; Kravchenko, I.; Lazo-Flores, J.; Meier, F.; Ratnikov, F.; Snow, G. R.; Zvada, M.; Dolen, J.; Godshalk, A.; Iashvili, I.; Kharchilava, A.; Kumar, A.; Rappoccio, S.; Alverson, G.; Barberis, E.; Baumgartel, D.; Chasco, M.; Massironi, A.; Morse, D. M.; Nash, D.; Orimoto, T.; Trocino, D.; Wang, R.-J.; Wood, D.; Zhang, J.; Hahn, K. A.; Kubik, A.; Mucia, N.; Odell, N.; Pollack, B.; Pozdnyakov, A.; Schmitt, M.; Stoynev, S.; Sung, K.; Velasco, M.; Won, S.; Brinkerhoff, A.; Chan, K. M.; Drozdetskiy, A.; Hildreth, M.; Jessop, C.; Karmgard, D. J.; Kellams, N.; Lannon, K.; Lynch, S.; Marinelli, N.; Musienko, Y.; Pearson, T.; Planer, M.; Ruchti, R.; Smith, G.; Valls, N.; Wayne, M.; Wolf, M.; Woodard, A.; Antonelli, L.; Brinson, J.; Bylsma, B.; Durkin, L. S.; Flowers, S.; Hart, A.; Hill, C.; Hughes, R.; Kotov, K.; Ling, T. Y.; Luo, W.; Puigh, D.; Rodenburg, M.; Winer, B. L.; Wolfe, H.; Wulsin, H. W.; Driga, O.; Elmer, P.; Hardenbrook, J.; Hebda, P.; Hunt, A.; Koay, S. A.; Lujan, P.; Marlow, D.; Medvedeva, T.; Mooney, M.; Olsen, J.; Piroué, P.; Quan, X.; Saka, H.; Stickland, D.; Tully, C.; Werner, J. S.; Zuranski, A.; Brownson, E.; Malik, S.; Mendez, H.; Ramirez Vargas, J. E.; Barnes, V. E.; Benedetti, D.; Bortoletto, D.; de Mattia, M.; Gutay, L.; Hu, Z.; Jha, M. K.; Jones, M.; Jung, K.; Kress, M.; Leonardo, N.; Miller, D. H.; Neumeister, N.; Radburn-Smith, B. 
C.; Shi, X.; Shipsey, I.; Silvers, D.; Svyatkovskiy, A.; Wang, F.; Xie, W.; Xu, L.; Zablocki, J.; Parashar, N.; Stupak, J.; Adair, A.; Akgun, B.; Ecklund, K. M.; Geurts, F. J. M.; Li, W.; Michlin, B.; Padley, B. P.; Redjimi, R.; Roberts, J.; Zabel, J.; Betchart, B.; Bodek, A.; Covarelli, R.; de Barbaro, P.; Demina, R.; Eshaq, Y.; Ferbel, T.; Garcia-Bellido, A.; Goldenzweig, P.; Han, J.; Harel, A.; Khukhunaishvili, A.; Korjenevski, S.; Petrillo, G.; Vishnevskiy, D.; Ciesielski, R.; Demortier, L.; Goulianos, K.; Mesropian, C.; Arora, S.; Barker, A.; Chou, J. P.; Contreras-Campana, C.; Contreras-Campana, E.; Duggan, D.; Ferencek, D.; Gershtein, Y.; Gray, R.; Halkiadakis, E.; Hidas, D.; Kaplan, S.; Lath, A.; Panwalkar, S.; Park, M.; Patel, R.; Salur, S.; Schnetzer, S.; Somalwar, S.; Stone, R.; Thomas, S.; Thomassen, P.; Walker, M.; Rose, K.; Spanier, S.; York, A.; Bouhali, O.; Castaneda Hernandez, A.; Eusebi, R.; Flanagan, W.; Gilmore, J.; Kamon, T.; Khotilovich, V.; Krutelyov, V.; Montalvo, R.; Osipenkov, I.; Pakhotin, Y.; Perloff, A.; Roe, J.; Rose, A.; Safonov, A.; Suarez, I.; Tatarinov, A.; Ulmer, K. A.; Akchurin, N.; Cowden, C.; Damgov, J.; Dragoiu, C.; Dudero, P. R.; Faulkner, J.; Kovitanggoon, K.; Kunori, S.; Lee, S. W.; Libeiro, T.; Volobouev, I.; Appelt, E.; Delannoy, A. G.; Greene, S.; Gurrola, A.; Johns, W.; Maguire, C.; Mao, Y.; Melo, A.; Sharma, M.; Sheldon, P.; Snook, B.; Tuo, S.; Velkovska, J.; Arenton, M. W.; Boutle, S.; Cox, B.; Francis, B.; Goodell, J.; Hirosky, R.; Ledovskoy, A.; Li, H.; Lin, C.; Neu, C.; Wood, J.; Clarke, C.; Harr, R.; Karchin, P. E.; Kottachchi Kankanamge Don, C.; Lamichhane, P.; Sturdy, J.; Belknap, D. A.; Carlsmith, D.; Cepeda, M.; Dasu, S.; Dodd, L.; Duric, S.; Friis, E.; Hall-Wilton, R.; Herndon, M.; Hervé, A.; Klabbers, P.; Lanaro, A.; Lazaridis, C.; Levine, A.; Loveless, R.; Mohapatra, A.; Ojalvo, I.; Perry, T.; Pierro, G. A.; Polese, G.; Ross, I.; Sarangi, T.; Savin, A.; Smith, W. 
H.; Taylor, D.; Vuosalo, C.; Bediaga, I.; de Miranda, J. M.; Ferreira Rodrigues, F.; Gomes, A.; Massafferri, A.; Dos Reis, A. C.; Rodrigues, A. B.; Amato, S.; Carvalho Akiba, K.; de Paula, L.; Francisco, O.; Gandelman, M.; Hicheur, A.; Lopes, J. H.; Martins Tostes, D.; Nasteva, I.; Otalora Goicochea, J. M.; Polycarpo, E.; Potterat, C.; Rangel, M. S.; Salustino Guimaraes, V.; Souza de Paula, B.; Vieira, D.; An, L.; Gao, Y.; Jing, F.; Li, Y.; Yang, Z.; Yuan, X.; Zhang, Y.; Zhong, L.; Beaucourt, L.; Chefdeville, M.; Decamp, D.; Déléage, N.; Ghez, Ph.; Lees, J.-P.; Marchand, J. F.; Minard, M.-N.; Pietrzyk, B.; Qian, W.; T'jampens, S.; Tisserand, V.; Tournefier, E.; Ajaltouni, Z.; Baalouch, M.; Cogneras, E.; Deschamps, O.; El Rifai, I.; Grabalosa Gándara, M.; Henrard, P.; Hoballah, M.; Lefèvre, R.; Maratas, J.; Monteil, S.; Niess, V.; Perret, P.; Adrover, C.; Akar, S.; Aslanides, E.; Cogan, J.; Kanso, W.; Le Gac, R.; Leroy, O.; Mancinelli, G.; Mordà, A.; Perrin-Terrin, M.; Serrano, J.; Tsaregorodtsev, A.; Amhis, Y.; Barsuk, S.; Borsato, M.; Kochebina, O.; Lefrançois, J.; Machefert, F.; Martín Sánchez, A.; Nicol, M.; Robbe, P.; Schune, M.-H.; Teklishyn, M.; Vallier, A.; Viaud, B.; Wormser, G.; Ben-Haim, E.; Charles, M.; Coquereau, S.; David, P.; Del Buono, L.; Henry, L.; Polci, F.; Albrecht, J.; Brambach, T.; Cauet, Ch.; Deckenhoff, M.; Eitschberger, U.; Ekelhof, R.; Gavardi, L.; Kruse, F.; Meier, F.; Niet, R.; Parkinson, C. 
J.; Schlupp, M.; Shires, A.; Spaan, B.; Swientek, S.; Wishahi, J.; Aquines Gutierrez, O.; Blouw, J.; Britsch, M.; Fontana, M.; Popov, D.; Schmelling, M.; Volyanskyy, D.; Zavertyaev, M.; Bachmann, S.; Bien, A.; Comerma-Montells, A.; de Cian, M.; Dordei, F.; Esen, S.; Färber, C.; Gersabeck, E.; Grillo, L.; Han, X.; Hansmann-Menzemer, S.; Jaeger, A.; Kolpin, M.; Kreplin, K.; Krocker, G.; Leverington, B.; Marks, J.; Meissner, M.; Neuner, M.; Nikodem, T.; Seyfert, P.; Stahl, M.; Stahl, S.; Uwer, U.; Vesterinen, M.; Wandernoth, S.; Wiedner, D.; Zhelezov, A.; McNulty, R.; Wallace, R.; Zhang, W. C.; Palano, A.; Carbone, A.; Falabella, A.; Galli, D.; Marconi, U.; Moggi, N.; Mussini, M.; Perazzini, S.; Vagnoni, V.; Valenti, G.; Zangoli, M.; Bonivento, W.; Cadeddu, S.; Cardini, A.; Cogoni, V.; Contu, A.; Lai, A.; Liu, B.; Manca, G.; Oldeman, R.; Saitta, B.; Vacca, C.; Andreotti, M.; Baldini, W.; Bozzi, C.; Calabrese, R.; Corvo, M.; Fiore, M.; Fiorini, M.; Luppi, E.; Pappalardo, L. L.; Shapoval, I.; Tellarini, G.; Tomassetti, L.; Vecchi, S.; Anderlini, L.; Bizzeti, A.; Frosini, M.; Graziani, G.; Passaleva, G.; Veltri, M.; Bencivenni, G.; Campana, P.; de Simone, P.; Lanfranchi, G.; Palutan, M.; Rama, M.; Sarti, A.; Sciascia, B.; Vazquez Gomez, R.; Cardinale, R.; Fontanelli, F.; Gambetta, S.; Patrignani, C.; Petrolini, A.; Pistone, A.; Calvi, M.; Cassina, L.; Gotti, C.; Khanji, B.; Kucharczyk, M.; Matteuzzi, C.; Fu, J.; Geraci, A.; Neri, N.; Palombo, F.; Amerio, S.; Collazuol, G.; Gallorini, S.; Gianelle, A.; Lucchesi, D.; Lupato, A.; Morandin, M.; Rotondo, M.; Sestini, L.; Simi, G.; Stroili, R.; Bedeschi, F.; Cenci, R.; Leo, S.; Marino, P.; Morello, M. J.; Punzi, G.; Stracka, S.; Walsh, J.; Carboni, G.; Furfaro, E.; Santovetti, E.; Satta, A.; Alves, A. 
A., Jr.; Auriemma, G.; Bocci, V.; Martellotti, G.; Penso, G.; Pinci, D.; Santacesaria, R.; Satriano, C.; Sciubba, A.; Dziurda, A.; Kucewicz, W.; Lesiak, T.; Rachwal, B.; Witek, M.; Firlej, M.; Fiutowski, T.; Idzik, M.; Morawski, P.; Moron, J.; Oblakowska-Mucha, A.; Swientek, K.; Szumlak, T.; Batozskaya, V.; Klimaszewski, K.; Kurek, K.; Szczekowski, M.; Ukleja, A.; Wislicki, W.; Cojocariu, L.; Giubega, L.; Grecu, A.; Maciuc, F.; Orlandea, M.; Popovici, B.; Stoica, S.; Straticiuc, M.; Alkhazov, G.; Bondar, N.; Dzyuba, A.; Maev, O.; Sagidova, N.; Shcheglov, Y.; Vorobyev, A.; Belogurov, S.; Belyaev, I.; Egorychev, V.; Golubkov, D.; Kvaratskheliya, T.; Machikhiliyan, I. V.; Polyakov, I.; Savrina, D.; Semennikov, A.; Zhokhov, A.; Berezhnoy, A.; Korolev, M.; Leflat, A.; Nikitin, N.; Filippov, S.; Gushchin, E.; Kravchuk, L.; Bondar, A.; Eidelman, S.; Krokovny, P.; Kudryavtsev, V.; Shekhtman, L.; Vorobyev, V.; Artamonov, A.; Belous, K.; Dzhelyadin, R.; Guz, Yu.; Novoselov, A.; Obraztsov, V.; Popov, A.; Romanovsky, V.; Shapkin, M.; Stenyakin, O.; Yushchenko, O.; Badalov, A.; Calvo Gomez, M.; Garrido, L.; Gascon, D.; Graciani Diaz, R.; Graugés, E.; Marin Benito, C.; Picatoste Olloqui, E.; Rives Molina, V.; Ruiz, H.; Vilasis-Cardona, X.; Adeva, B.; Alvarez Cartelle, P.; Dosil Suárez, A.; Fernandez Albor, V.; Gallas Torreira, A.; García Pardiñas, J.; Hernando Morata, J. A.; Plo Casasus, M.; Romero Vidal, A.; Saborido Silva, J. J.; Sanmartin Sedes, B.; Santamarina Rios, C.; Vazquez Regueiro, P.; Vázquez Sierra, C.; Vieites Diaz, M.; Alessio, F.; Archilli, F.; Barschel, C.; Benson, S.; Buytaert, J.; Campora Perez, D.; Castillo Garcia, L.; Cattaneo, M.; Charpentier, Ph.; Cid Vidal, X.; Clemencic, M.; Closier, J.; Coco, V.; Collins, P.; Corti, G.; Couturier, B.; D'Ambrosio, C.; Dettori, F.; di Canto, A.; Dijkstra, H.; Durante, P.; Ferro-Luzzi, M.; Forty, R.; Frank, M.; Frei, C.; Gaspar, C.; Gligorov, V. V.; Granado Cardoso, L. 
A.; Gys, T.; Haen, C.; He, J.; Head, T.; van Herwijnen, E.; Jacobsson, R.; Johnson, D.; Joram, C.; Jost, B.; Karacson, M.; Karbach, T. M.; Lacarrere, D.; Langhans, B.; Lindner, R.; Linn, C.; Lohn, S.; Mapelli, A.; Matev, R.; Mathe, Z.; Neubert, S.; Neufeld, N.; Otto, A.; Panman, J.; Pepe Altarelli, M.; Rauschmayr, N.; Rihl, M.; Roiser, S.; Ruf, T.; Schindler, H.; Schmidt, B.; Schopper, A.; Schwemmer, R.; Sridharan, S.; Stagni, F.; Subbiah, V. K.; Teubert, F.; Thomas, E.; Tonelli, D.; Trisovic, A.; Ubeda Garcia, M.; Wicht, J.; Wyllie, K.; Battista, V.; Bay, A.; Blanc, F.; Dorigo, M.; Dupertuis, F.; Fitzpatrick, C.; Gianì, S.; Haefeli, G.; Jaton, P.; Khurewathanakul, C.; Komarov, I.; La Thi, V. N.; Lopez-March, N.; Märki, R.; Martinelli, M.; Muster, B.; Nakada, T.; Nguyen, A. D.; Nguyen, T. D.; Nguyen-Mau, C.; Prisciandaro, J.; Puig Navarro, A.; Rakotomiaramanana, B.; Rouvinet, J.; Schneider, O.; Soomro, F.; Szczypka, P.; Tobin, M.; Tourneur, S.; Tran, M. T.; Veneziano, G.; Xu, Z.; Anderson, J.; Bernet, R.; Bowen, E.; Bursche, A.; Chiapolini, N.; Chrzaszcz, M.; Elsasser, Ch.; Graverini, E.; Lionetto, F.; Lowdon, P.; Müller, K.; Serra, N.; Steinkamp, O.; Storaci, B.; Straumann, U.; Tresch, M.; Vollhardt, A.; Aaij, R.; Ali, S.; van Beuzekom, M.; David, P. N. Y.; de Bruyn, K.; Farinelli, C.; Heijne, V.; Hulsbergen, W.; Jans, E.; Koppenburg, P.; Kozlinskiy, A.; van Leerdam, J.; Merk, M.; Oggero, S.; Pellegrino, A.; Snoek, H.; van Tilburg, J.; Tsopelas, P.; Tuning, N.; de Vries, J. A.; Ketel, T.; Koopman, R. F.; Lambert, R. W.; Martinez Santos, D.; Raven, G.; Schiller, M.; Syropoulos, V.; Tolk, S.; Dovbnya, A.; Kandybei, S.; Raniuk, I.; Okhrimenko, O.; Pugatch, V.; Bifani, S.; Farley, N.; Griffith, P.; Kenyon, I. R.; Lazzeroni, C.; Mazurov, A.; McCarthy, J.; Pescatore, L.; Watson, N. K.; Williams, M. P.; Adinolfi, M.; Benton, J.; Brook, N. H.; Cook, A.; Coombes, M.; Dalseno, J.; Hampson, T.; Harnew, S. T.; Naik, P.; Price, E.; Prouve, C.; Rademacker, J. 
H.; Richards, S.; Saunders, D. M.; Skidmore, N.; Souza, D.; Velthuis, J. J.; Voong, D.; Barter, W.; Bettler, M.-O.; Cliff, H. V.; Evans, H.-M.; Garra Tico, J.; Gibson, V.; Gregson, S.; Haines, S. C.; Jones, C. R.; Sirendi, M.; Smith, J.; Ward, D. R.; Wotton, S. A.; Wright, S.; Back, J. J.; Blake, T.; Craik, D. C.; Crocombe, A. C.; Dossett, D.; Gershon, T.; Kreps, M.; Langenbruch, C.; Latham, T.; O'Hanlon, D. P.; Pilař, T.; Poluektov, A.; Reid, M. M.; Silva Coutinho, R.; Wallace, C.; Whitehead, M.; Easo, S.; Nandakumar, R.; Papanestis, A.; Ricciardi, S.; Wilson, F. F.; Carson, L.; Clarke, P. E. L.; Cowan, G. A.; Eisenhardt, S.; Ferguson, D.; Lambert, D.; Luo, H.; Morris, A.-B.; Muheim, F.; Needham, M.; Playfer, S.; Alexander, M.; Beddow, J.; Dean, C.-T.; Eklund, L.; Hynds, D.; Karodia, S.; Longstaff, I.; Ogilvy, S.; Pappagallo, M.; Sail, P.; Skillicorn, I.; Soler, F. J. P.; Spradlin, P.; Affolder, A.; Bowcock, T. J. V.; Brown, H.; Casse, G.; Donleavy, S.; Dreimanis, K.; Farry, S.; Fay, R.; Hennessy, K.; Hutchcroft, D.; Liles, M.; McSkelly, B.; Patel, G. D.; Price, J. D.; Pritchard, A.; Rinnert, K.; Shears, T.; Smith, N. A.; Ciezarek, G.; Cunliffe, S.; Currie, R.; Egede, U.; Fol, P.; Golutvin, A.; Hall, S.; McCann, M.; Owen, P.; Patel, M.; Petridis, K.; Redi, F.; Sepp, I.; Smith, E.; Sutcliffe, W.; Websdale, D.; Appleby, R. B.; Barlow, R. J.; Bird, T.; Bjørnstad, P. M.; Borghi, S.; Brett, D.; Brodzicka, J.; Capriotti, L.; Chen, S.; de Capua, S.; Dujany, G.; Gersabeck, M.; Harrison, J.; Hombach, C.; Klaver, S.; Lafferty, G.; McNab, A.; Parkes, C.; Pearce, A.; Reichert, S.; Rodrigues, E.; Rodriguez Perez, P.; Smith, M.; Cheung, S.-F.; Derkach, D.; Evans, T.; Gauld, R.; Greening, E.; Harnew, N.; Hill, D.; Hunt, P.; Hussain, N.; Jalocha, J.; John, M.; Lupton, O.; Malde, S.; Smith, E.; Stevenson, S.; Thomas, C.; Topp-Joergensen, S.; Torr, N.; Wilkinson, G.; Counts, I.; Ilten, P.; Williams, M.; Andreassen, R.; Davis, A.; de Silva, W.; Meadows, B.; Sokoloff, M. 
D.; Sun, L.; Todd, J.; Andrews, J. E.; Hamilton, B.; Jawahery, A.; Wimberley, J.; Artuso, M.; Blusk, S.; Borgia, A.; Britton, T.; Ely, S.; Gandini, P.; Garofoli, J.; Gui, B.; Hadjivasiliou, C.; Jurik, N.; Kelsey, M.; Mountain, R.; Pal, B. K.; Skwarnicki, T.; Stone, S.; Wang, J.; Xing, Z.; Zhang, L.; Baesso, C.; Cruz Torres, M.; Göbel, C.; Molina Rodriguez, J.; Xie, Y.; Milanes, D. A.; Grünberg, O.; Heß, M.; Voß, C.; Waldi, R.; Likhomanenko, T.; Malinin, A.; Shevchenko, V.; Ustyuzhanin, A.; Martinez Vidal, F.; Oyanguren, A.; Ruiz Valls, P.; Sanchez Mayordomo, C.; Onderwater, C. J. G.; Wilschut, H. W.; Pesen, E.
2015-06-01
The standard model of particle physics describes the fundamental particles and their interactions via the strong, electromagnetic and weak forces. It provides precise predictions for measurable quantities that can be tested experimentally. The probabilities, or branching fractions, of the strange B meson (Bs0) and the B0 meson decaying into two oppositely charged muons (μ+ and μ-) are especially interesting because of their sensitivity to theories that extend the standard model. The standard model predicts that the Bs0 → μ+μ- and B0 → μ+μ- decays are very rare, with about four of the former occurring for every billion Bs0 mesons produced, and one of the latter occurring for every ten billion B0 mesons. A difference in the observed branching fractions with respect to the predictions of the standard model would provide a direction in which the standard model should be extended. Before the Large Hadron Collider (LHC) at CERN started operating, no evidence for either decay mode had been found. Upper limits on the branching fractions were an order of magnitude above the standard model predictions. The CMS (Compact Muon Solenoid) and LHCb (Large Hadron Collider beauty) collaborations have performed a joint analysis of the data from proton-proton collisions that they collected in 2011 at a centre-of-mass energy of seven teraelectronvolts and in 2012 at eight teraelectronvolts. Here we report the first observation of the Bs0 → μ+μ- decay, with a statistical significance exceeding six standard deviations, and the best measurement so far of its branching fraction. Furthermore, we obtained evidence for the B0 → μ+μ- decay with a statistical significance of three standard deviations. Both measurements are statistically compatible with standard model predictions and allow stringent constraints to be placed on theories beyond the standard model.
The LHC experiments will resume taking data in 2015, recording proton-proton collisions at a centre-of-mass energy of 13 teraelectronvolts, which will approximately double the production rates of Bs0 and B0 mesons and lead to further improvements in the precision of these crucial tests of the standard model.
20 CFR 634.4 - Statistical standards.
Code of Federal Regulations, 2011 CFR
2011-04-01
... 20 Employees' Benefits 3 2011-04-01 2011-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs. ...
20 CFR 634.4 - Statistical standards.
Code of Federal Regulations, 2010 CFR
2010-04-01
... 20 Employees' Benefits 3 2010-04-01 2010-04-01 false Statistical standards. 634.4 Section 634.4... System § 634.4 Statistical standards. Recipients shall agree to provide required data following the statistical standards prescribed by the Bureau of Labor Statistics for cooperative statistical programs. ...
ERIC Educational Resources Information Center
Schiel, Jeff L.; King, Jason E.
Analyses of data from operational course placement systems are subject to the effects of truncation; students with low placement test scores may enroll in a remedial course, rather than a standard-level course, and therefore will not have outcome data from the standard course. In "soft" truncation, some (but not all) students who score…
ERIC Educational Resources Information Center
Stoneberg, Bert D.
2015-01-01
The National Center for Education Statistics conducted a mapping study that equated the percentage proficient or above on each state's NCLB reading and mathematics tests in grades 4 and 8 to the NAEP scale. Each "NAEP equivalent score" was labeled according to NAEP's achievement levels and used to compare state proficiency standards and…
Quantitative Imaging Biomarkers: A Review of Statistical Methods for Computer Algorithm Comparisons
2014-01-01
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. PMID:24919829
Code of Federal Regulations, 2011 CFR
2011-01-01
.... agricultural and rural economy. (2) Administering a methodological research program to improve agricultural... design and data collection methodologies to the agricultural statistics program. Major functions include...) Designing, testing, and establishing survey techniques and standards, including sample design, sample...
Code of Federal Regulations, 2010 CFR
2010-01-01
.... agricultural and rural economy. (2) Administering a methodological research program to improve agricultural... design and data collection methodologies to the agricultural statistics program. Major functions include...) Designing, testing, and establishing survey techniques and standards, including sample design, sample...
Code of Federal Regulations, 2012 CFR
2012-01-01
.... agricultural and rural economy. (2) Administering a methodological research program to improve agricultural... design and data collection methodologies to the agricultural statistics program. Major functions include...) Designing, testing, and establishing survey techniques and standards, including sample design, sample...
Code of Federal Regulations, 2013 CFR
2013-01-01
.... agricultural and rural economy. (2) Administering a methodological research program to improve agricultural... design and data collection methodologies to the agricultural statistics program. Major functions include...) Designing, testing, and establishing survey techniques and standards, including sample design, sample...
Code of Federal Regulations, 2014 CFR
2014-01-01
.... agricultural and rural economy. (2) Administering a methodological research program to improve agricultural... design and data collection methodologies to the agricultural statistics program. Major functions include...) Designing, testing, and establishing survey techniques and standards, including sample design, sample...
Score tests for independence in semiparametric competing risks models.
Saïd, Mériem; Ghazzali, Nadia; Rivest, Louis-Paul
2009-12-01
A popular model for competing risks postulates the existence of a latent unobserved failure time for each risk. Assuming that these underlying failure times are independent is attractive since it allows standard statistical tools for right-censored lifetime data to be used in the analysis. This paper proposes simple independence score tests for the validity of this assumption when the individual risks are modeled using semiparametric proportional hazards regressions. It assumes that covariates are available, making the model identifiable. The score tests are derived for alternatives that specify that copulas are responsible for a possible dependency between the competing risks. The test statistics are constructed by adding to the partial likelihoods for the individual risks an explanatory variable for the dependency between the risks. A variance estimator is derived by writing the score function and the Fisher information matrix for the marginal models as stochastic integrals. Pitman efficiencies are used to compare test statistics. A simulation study and a numerical example illustrate the methodology proposed in this paper.
[Overview and prospect of syndrome differentiation of hypertension in traditional Chinese medicine].
Yang, Xiao-Chen; Xiong, Xing-Jiang; Wang, Jie
2014-01-01
This article reviews the literature on traditional Chinese medicine syndrome differentiation for hypertension. Following the theory of disease in combination with syndrome, we summarized syndrome types of hypertension from four sources: national standards, industry standards, teaching standards and personal experience. Additionally, to provide new methods and approaches for standardization research, we integrated modern testing methods and statistical methods to analyze syndrome differentiation for the treatment of hypertension.
Shilts, Mical Kay; Lamp, Cathi; Horowitz, Marcel; Townsend, Marilyn S
2009-01-01
Investigate the impact of a nutrition education program on student academic performance as measured by achievement of education standards. Quasi-experimental crossover-controlled study. California Central Valley suburban elementary school (58% qualified for free or reduced-priced lunch). All sixth-grade students (n = 84) in the elementary school clustered in 3 classrooms. 9-lesson intervention with an emphasis on guided goal setting and driven by the Social Cognitive Theory. Multiple-choice survey assessing 5 education standards for sixth-grade mathematics and English at 3 time points: baseline (T1), 5 weeks (T2), and 10 weeks (T3). Repeated measures, paired t test, and analysis of covariance. Changes in total scores were statistically different (P < .05), with treatment scores (T3 - T2) generating more gains. The change scores for 1 English (P < .01) and 2 mathematics standards (P < .05; P < .001) were statistically greater for the treatment period (T3 - T2) compared to the control period (T2 - T1). Using standardized tests, results of this pilot study suggest that EatFit can improve academic performance measured by achievement of specific mathematics and English education standards. Nutrition educators can show school administrators and wellness committee members that this program can positively impact academic performance, concomitant to its primary objective of promoting healthful eating and physical activity.
Negeri, Zelalem F; Shaikh, Mateen; Beyene, Joseph
2018-05-11
Diagnostic or screening tests are widely used in medical fields to classify patients according to their disease status. Several statistical models for meta-analysis of diagnostic test accuracy studies have been developed to synthesize test sensitivity and specificity of a diagnostic test of interest. Because of the correlation between test sensitivity and specificity, modeling the two measures using a bivariate model is recommended. In this paper, we extend the current standard bivariate linear mixed model (LMM) by proposing two variance-stabilizing transformations: the arcsine square root and the Freeman-Tukey double arcsine transformation. We compared the performance of the proposed methods with the standard method through simulations using several performance measures. The simulation results showed that our proposed methods performed better than the standard LMM in terms of bias, root mean square error, and coverage probability in most of the scenarios, even when data were generated assuming the standard LMM. We also illustrated the methods using two real data sets. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
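The two variance-stabilizing transformations named above have standard closed forms. A minimal Python sketch (function names are mine, not from the paper), applied to a hypothetical study with x correct classifications out of n subjects:

```python
import math

def arcsine_sqrt(x, n):
    """Arcsine square root transform of the proportion x/n."""
    return math.asin(math.sqrt(x / n))

def freeman_tukey(x, n):
    """Freeman-Tukey double arcsine transform of x successes out of n.
    Remains well defined at the boundary proportions 0 and 1."""
    return 0.5 * (math.asin(math.sqrt(x / (n + 1)))
                  + math.asin(math.sqrt((x + 1) / (n + 1))))

# Hypothetical study: 45 true positives among 50 diseased subjects
# (observed sensitivity 0.9)
print(round(arcsine_sqrt(45, 50), 4))
print(round(freeman_tukey(45, 50), 4))
```

Unlike the raw arcsine square root, the Freeman-Tukey variant avoids the degenerate variance at observed proportions of exactly 0 or 1, which is why it is often preferred for sensitivities and specificities near the boundary.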
Statistical testing of association between menstruation and migraine.
Barra, Mathias; Dahl, Fredrik A; Vetvik, Kjersti G
2015-02-01
To repair and refine a previously proposed method for statistical analysis of association between migraine and menstruation. Menstrually related migraine (MRM) affects about 20% of female migraineurs in the general population. The exact pathophysiological link from menstruation to migraine is hypothesized to be through fluctuations in female reproductive hormones, but the exact mechanisms remain unknown. Therefore, the main diagnostic criterion today is concurrency of migraine attacks with menstruation. Methods aiming to exclude spurious associations are wanted, so that further research into these mechanisms can be performed on a population with a true association. The statistical method is based on a simple two-parameter null model of MRM (which allows for simulation modeling), and Fisher's exact test (with mid-p correction) applied to standard 2 × 2 contingency tables derived from the patients' headache diaries. Our method is a corrected version of a previously published flawed framework. To our best knowledge, no other published methods for establishing a menstruation-migraine association by statistical means exist today. The probabilistic methodology shows good performance when subjected to receiver operator characteristic curve analysis. Quick reference cutoff values for the clinical setting were tabulated for assessing association given a patient's headache history. In this paper, we correct a proposed method for establishing association between menstruation and migraine by statistical methods. We conclude that the proposed standard of 3-cycle observations prior to setting an MRM diagnosis should be extended with at least one perimenstrual window to obtain sufficient information for statistical processing. © 2014 American Headache Society.
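The mid-p correction to Fisher's exact test mentioned above can be computed directly from the hypergeometric distribution: the one-sided p-value counts tables more extreme than the observed one in full, but the observed table itself only at half weight. A minimal sketch for a 2×2 table [[a, b], [c, d]]; the diary counts below are illustrative, not taken from the paper:

```python
from math import comb

def fisher_midp_greater(a, b, c, d):
    """One-sided (greater) Fisher exact test with mid-p correction
    for the 2x2 table [[a, b], [c, d]].

    Sums hypergeometric null probabilities of tables whose first
    cell exceeds a, plus half the probability of the observed
    table (the mid-p correction, which reduces conservatism)."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def p_table(k):
        # P(first cell = k) under the hypergeometric null
        return comb(row1, k) * comb(n - row1, col1 - k) / denom

    k_max = min(row1, col1)
    tail = sum(p_table(k) for k in range(a + 1, k_max + 1))
    return tail + 0.5 * p_table(a)

# Hypothetical headache diary: migraine on 8 of 10 perimenstrual
# days versus 10 of 40 other days
p = fisher_midp_greater(8, 2, 10, 30)
```

For a perfectly balanced table the one-sided mid-p is exactly 0.5, which is the calibration property that motivates the correction.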
Laurin, E; Thakur, K K; Gardner, I A; Hick, P; Moody, N J G; Crane, M S J; Ernst, I
2018-05-01
Design and reporting quality of diagnostic accuracy studies (DAS) are important metrics for assessing utility of tests used in animal and human health. Following standards for designing DAS will assist in appropriate test selection for specific testing purposes and minimize the risk of reporting biased sensitivity and specificity estimates. To examine the benefits of recommending standards, design information from published DAS literature was assessed for 10 finfish, seven mollusc, nine crustacean and two amphibian diseases listed in the 2017 OIE Manual of Diagnostic Tests for Aquatic Animals. Of the 56 DAS identified, 41 were based on field testing, eight on experimental challenge studies and seven on both. Also, we adapted human and terrestrial-animal standards and guidelines for DAS structure for use in aquatic animal diagnostic research. Through this process, we identified and addressed important metrics for consideration at the design phase: study purpose, targeted disease state, selection of appropriate samples and specimens, laboratory analytical methods, statistical methods and data interpretation. These recommended design standards for DAS are presented as a checklist including risk-of-failure points and actions to mitigate bias at each critical step. Adherence to standards when designing DAS will also facilitate future systematic review and meta-analyses of DAS research literature. © 2018 John Wiley & Sons Ltd.
Kuretzki, Carlos Henrique; Campos, Antônio Carlos Ligocki; Malafaia, Osvaldo; Soares, Sandramara Scandelari Kusano de Paula; Tenório, Sérgio Bernardo; Timi, Jorge Rufino Ribas
2016-03-01
The use of information technology is widespread in healthcare. For scientific research, SINPE(c) - Integrated Electronic Protocols was created as a tool to support researchers by standardizing clinical data. Until then, SINPE(c) lacked the ability to run statistical tests automatically. The aim was to add to SINPE(c) features for automatic execution of the main statistical methods used in medicine. The study was divided into four stages: checking users' interest in the implementation of the tests; surveying the frequency of their use in health care; carrying out the implementation; and validating the results with researchers and their protocols. It was applied to a group of users of this software working on their master's and doctoral theses in a stricto sensu postgraduate program in surgery. To assess the reliability of the statistics, the data obtained automatically by SINPE(c) were compared with those computed manually by a statistician experienced with this type of study. There was interest in the use of automatic statistical tests, with good acceptance. The chi-square, Mann-Whitney, Fisher and Student's t tests were identified as those most frequently used by participants in medical studies. These methods were implemented and subsequently approved as expected. The automatic statistical analysis incorporated into SINPE(c) was shown to be reliable and to match the manual analysis, validating its use as a tool for medical research.

General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
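As a rough illustration of the score-statistic aggregation described above, a fixed-effects burden meta-analysis can combine per-study summary statistics without pooling individual-level genotype data. The function below is a minimal sketch under that assumption, not the authors' implementation, and the study summary values are hypothetical.

```python
import math

def meta_burden_z(scores, variances):
    """Combine per-study burden score statistics U_j and their
    variances V_j into a single meta-analysis z statistic:
    z = (sum_j U_j) / sqrt(sum_j V_j).  Illustrative sketch only."""
    u = sum(scores)
    v = sum(variances)
    return u / math.sqrt(v)

# Hypothetical summary statistics from three studies
z = meta_burden_z([4.1, 2.3, 3.0], [3.2, 2.5, 2.8])
```

Because only sums of study-level statistics enter the combined test, each cohort needs to share just its score and variance, which is what makes the summary-statistic approach essentially as powerful as joint analysis.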
Standards for reporting fish toxicity tests
Cope, O.B.
1961-01-01
The growing impetus of studies on fish and pesticides focuses attention on the need for standardized reporting procedures. Good methods have been developed for laboratory and field procedures in testing programs and in statistical features of assay experiments; and improvements are being made on methods of collecting and preserving fish, invertebrates, and other materials exposed to economic poisons. On the other hand, the reporting of toxicity data in a complete manner has lagged behind, and today's literature is little improved over yesterday's with regard to completeness and susceptibility to interpretation.
Analysis of Statistical Methods Currently used in Toxicology Journals.
Na, Jihye; Yang, Hyeri; Bae, SeungJin; Lim, Kyung-Min
2014-09-01
Statistical methods are frequently used in toxicology, yet it is not clear whether the methods employed by these studies are used consistently and conducted on sound statistical grounds. The purpose of this paper is to describe the statistical methods used in top toxicology journals. More specifically, we sampled 30 papers published in 2014 from Toxicology and Applied Pharmacology, Archives of Toxicology, and Toxicological Sciences and described the methodologies used to provide descriptive and inferential statistics. One hundred thirteen endpoints were observed in those 30 papers, and most studies had a sample size of less than 10, with the median and the mode being 6 and 3 & 6, respectively. The mean (105/113, 93%) was predominantly used to measure central tendency, and the standard error of the mean (64/113, 57%) and the standard deviation (39/113, 34%) were used to measure dispersion, although few studies provided justification for why these methods were selected. Inferential statistics were frequently conducted (93/113, 82%), with one-way ANOVA being the most popular (52/93, 56%), yet few studies conducted either a normality or an equal variance test. These results suggest that more consistent and appropriate use of statistical methods is necessary, which may enhance the role of toxicology in public health.
Development and Validation of Instruments to Measure Learning of Expert-Like Thinking
NASA Astrophysics Data System (ADS)
Adams, Wendy K.; Wieman, Carl E.
2011-06-01
This paper describes the process for creating and validating an assessment test that measures the effectiveness of instruction by probing how well that instruction causes students in a class to think like experts about specific areas of science. The design principles and process are laid out and it is shown how these align with professional standards that have been established for educational and psychological testing and the elements of assessment called for in a recent National Research Council study on assessment. The importance of student interviews for creating and validating the test is emphasized, and the appropriate interview procedures are presented. The relevance and use of standard psychometric statistical tests are discussed. Additionally, techniques for effective test administration are presented.
NASA Technical Reports Server (NTRS)
Hughitt, Brian; Generazio, Edward (Principal Investigator); Nichols, Charles; Myers, Mika (Principal Investigator); Spencer, Floyd (Principal Investigator); Waller, Jess (Principal Investigator); Wladyka, Jordan (Principal Investigator); Aldrin, John; Burke, Eric; Cerecerez, Laura;
2016-01-01
NASA-STD-5009 requires that successful flaw detection by NDE methods be statistically qualified for use on fracture critical metallic components, but does not standardize practices. This task works towards standardizing calculations and record retention with a web-based tool, the NNWG POD Standards Library or NPSL. Test methods will also be standardized with an appropriately flexible appendix to -5009 identifying best practices. Additionally, this appendix will describe how specimens used to qualify NDE systems will be cataloged, stored and protected from corrosion, damage, or loss.
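Statistical qualification of flaw detection is commonly expressed as a probability of detection (POD) demonstrated at a stated confidence level; one widely used point-estimate approach requires a run of consecutive detections with no misses. The sketch below illustrates only the underlying binomial reasoning; it is an assumption for illustration, not taken from NASA-STD-5009 or the NPSL tool.

```python
def min_trials_for_pod(pod=0.90, confidence=0.95):
    """Smallest n such that n consecutive flaw detections (no misses)
    demonstrate the target POD at the given confidence: a detector
    with true POD at the target passes all n trials with probability
    pod**n, so we need pod**n <= 1 - confidence."""
    n = 1
    while pod ** n > 1 - confidence:
        n += 1
    return n

n = min_trials_for_pod()  # yields the familiar "29 of 29" demonstration
```

With the defaults this reproduces the well-known requirement of 29 consecutive detections to claim 90% POD at 95% confidence.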
Design and Test of Pseudorandom Number Generator Using a Star Network of Lorenz Oscillators
NASA Astrophysics Data System (ADS)
Cho, Kenichiro; Miyano, Takaya
We have recently developed a chaos-based stream cipher based on augmented Lorenz equations as a star network of Lorenz subsystems. In our method, the augmented Lorenz equations are used as a pseudorandom number generator. In this study, we propose a new method based on the augmented Lorenz equations for generating binary pseudorandom numbers and evaluate its security using the statistical tests of SP800-22 published by the National Institute of Standards and Technology, in comparison with the performances of other chaotic dynamical models used as binary pseudorandom number generators. We further propose a faster version of the proposed method and evaluate its security using the statistical tests of TestU01 published by L’Ecuyer and Simard.
Tests of Mediation: Paradoxical Decline in Statistical Power as a Function of Mediator Collinearity
Beasley, T. Mark
2013-01-01
Increasing the correlation between the independent variable and the mediator (the a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation due to increases in a at some point outweighs the increase in the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches, because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed. PMID:24954952
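The tradeoff described above can be illustrated with a normal-theory (Sobel-type) power approximation for a fully standardized mediation model with no direct effect. The variance formulas below are textbook large-sample approximations assumed for illustration, not the article's exact derivation; note how power first rises and then falls as a grows with b fixed.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sobel_power(a, b, n, crit=1.959964):
    """Approximate two-sided power of the Sobel test for the indirect
    effect a*b in a standardized mediation model with no direct effect.
    Assumes se_a^2 = (1-a^2)/n and se_b^2 = (1-b^2)/(n*(1-a^2)); the
    1/(1-a^2) factor in se_b^2 is the collinearity-driven variance
    inflation discussed in the abstract."""
    se2 = a**2 * (1 - b**2) / (n * (1 - a**2)) + b**2 * (1 - a**2) / n
    z = a * b / math.sqrt(se2)
    return norm_cdf(z - crit) + norm_cdf(-z - crit)

# Power first rises with a, then falls once variance inflation dominates
p_mid = sobel_power(0.5, 0.3, n=100)
p_high = sobel_power(0.95, 0.3, n=100)
```

Even though the effect size ab is nearly twice as large at a = 0.95 as at a = 0.5, the inflated standard error leaves the test with far less power.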
Austin, Peter C; Goldwasser, Meredith A
2008-03-01
We examined the impact on statistical inference when a chi-square test is used to compare the proportion of successes in the level of a categorical variable that has the highest observed proportion of successes with the proportion of successes in all other levels of the categorical variable combined. We used Monte Carlo simulations and a case study examining the association between astrological sign and hospitalization for heart failure. A standard chi-square test results in an inflation of the type I error rate, with the type I error rate increasing as the number of levels of the categorical variable increases. Using a standard chi-square test, the hospitalization rate for Pisces was statistically significantly different from that of the other 11 astrological signs combined (P=0.026). After accounting for the fact that the selection of Pisces was based on it having the highest observed proportion of heart failure hospitalizations, subjects born under the sign of Pisces no longer had a significantly higher rate of heart failure hospitalization compared to the other residents of Ontario (P=0.152). Post hoc comparisons of the proportions of successes across different levels of a categorical variable can result in incorrect inferences.
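The inflation can be reproduced with a small simulation in the spirit of the study's Monte Carlo design (the level count, sample sizes, and success probability below are hypothetical choices, not the paper's settings): pick the level with the highest observed success proportion, test it against the rest with a standard chi-square test, and count how often the null is rejected when all levels in fact share the same true proportion.

```python
import math
import random

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_chi2_df1(x):
    """Upper-tail p-value for a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def max_bin_type1_rate(levels=12, n=100, p=0.1, sims=2000, seed=1):
    """Type I error rate of testing the level with the highest observed
    success proportion against all other levels combined, when the true
    proportion is identical in every level."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        succ = [sum(rng.random() < p for _ in range(n)) for _ in range(levels)]
        k = max(range(levels), key=lambda i: succ[i])
        a, b = succ[k], n - succ[k]            # selected level
        c = sum(succ) - a                      # all other levels combined
        d = n * (levels - 1) - c
        if p_chi2_df1(chi2_2x2(a, b, c, d)) < 0.05:
            rejections += 1
    return rejections / sims

rate = max_bin_type1_rate()  # far above the nominal 0.05
```

Because the tested level is selected after looking at the data, the realized type I error rate greatly exceeds the nominal 5% level, mirroring the Pisces example.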
NASA Astrophysics Data System (ADS)
Reynolds, J. G.; Sandstrom, M. M.; Brown, G. W.; Warner, K. F.; Phillips, J. J.; Shelley, T. J.; Reyes, J. A.; Hsu, P. C.
2014-05-01
One of the first steps in establishing safe handling procedures for explosives is small-scale safety and thermal (SSST) testing. To better understand the response of improvised materials or homemade explosives (HMEs) to SSST testing, 16 HME materials were compared to three standard military explosives in a proficiency-type round robin study among five laboratories (two DoD and three DOE) sponsored by DHS. The testing matrix has been designed to address problems encountered with improvised materials: powder mixtures, liquid suspensions, partially wetted solids, immiscible liquids, and reactive materials. More than 30 issues have been identified that indicate standard test methods may require modification when applied to HMEs to derive the accurate sensitivity assessments needed for developing safe handling and storage practices. This paper presents a generalized comparison of the results among the testing participants, a comparison of friction results from BAM (German Bundesanstalt für Materialprüfung) and ABL (Allegany Ballistics Laboratory) designed testing equipment, and an overview of the statistical results from the RDX (1,3,5-Trinitroperhydro-1,3,5-triazine) standard tested throughout the proficiency test.
NASA Technical Reports Server (NTRS)
Nguyen, Truong X.; Ely, Jay J.; Koppen, Sandra V.
2001-01-01
This paper describes the implementation of the mode-stirred method for susceptibility testing according to the current DO-160D standard. Test results on an Engine Data Processor using the implemented procedure, and comparisons with standard anechoic test results, are presented. The comparison shows experimentally that the susceptibility thresholds found with the mode-stirred method are consistently higher than the anechoic ones. This is consistent with the recent statistical analysis finding by NIST that the current calibration procedure overstates field strength by a fixed amount. Once the test results are adjusted for this value, the comparisons with the anechoic results are excellent. The results also show that the test method has excellent chamber-to-chamber repeatability. Several areas for improvement to the current procedure are also identified and implemented.
Statistical tests for power-law cross-correlated processes
NASA Astrophysics Data System (ADS)
Podobnik, Boris; Jiang, Zhi-Qiang; Zhou, Wei-Xing; Stanley, H. Eugene
2011-12-01
For stationary time series, the cross-covariance and the cross-correlation as functions of time lag n serve to quantify the similarity of two time series. The latter measure is also used to assess whether the cross-correlations are statistically significant. For nonstationary time series, the analogous measures are detrended cross-correlation analysis (DCCA) and the recently proposed detrended cross-correlation coefficient, ρDCCA(T,n), where T is the total length of the time series and n the window size. For ρDCCA(T,n), we numerically verified the Cauchy inequality -1≤ρDCCA(T,n)≤1. Here we derive -1≤ρDCCA(T,n)≤1 for a standard variance-covariance approach and for a detrending approach. For overlapping windows, we find the range of ρDCCA within which the cross-correlations become statistically significant. For overlapping windows we numerically determine, and for nonoverlapping windows we derive, that the standard deviation of ρDCCA(T,n) tends with increasing T to 1/T. Using ρDCCA(T,n) we show that the Chinese financial market's tendency to follow the U.S. market is extremely weak. We also propose an additional statistical test that can be used to quantify the existence of cross-correlations between two power-law correlated time series.
NASA Astrophysics Data System (ADS)
Adams, T.; Batra, P.; Bugel, L.; Camilleri, L.; Conrad, J. M.; de Gouvêa, A.; Fisher, P. H.; Formaggio, J. A.; Jenkins, J.; Karagiorgi, G.; Kobilarcik, T. R.; Kopp, S.; Kyle, G.; Loinaz, W. A.; Mason, D. A.; Milner, R.; Moore, R.; Morfín, J. G.; Nakamura, M.; Naples, D.; Nienaber, P.; Olness, F. I.; Owens, J. F.; Pate, S. F.; Pronin, A.; Seligman, W. G.; Shaevitz, M. H.; Schellman, H.; Schienbein, I.; Syphers, M. J.; Tait, T. M. P.; Takeuchi, T.; Tan, C. Y.; van de Water, R. G.; Yamamoto, R. K.; Yu, J. Y.
We extend the physics case for a new high-energy, ultra-high statistics neutrino scattering experiment, NuSOnG (Neutrino Scattering On Glass), to address a variety of issues including precision QCD measurements, extraction of structure functions, and the derived Parton Distribution Functions (PDFs). This experiment uses a Tevatron-based neutrino beam to obtain a sample of Deep Inelastic Scattering (DIS) events which is over two orders of magnitude larger than past samples. We outline an innovative method for fitting the structure functions using a parametrized energy shift which yields reduced systematic uncertainties. High statistics measurements, in combination with improved systematics, will enable NuSOnG to perform discerning tests of fundamental Standard Model parameters as we search for deviations which may hint at "Beyond the Standard Model" physics.
NCES Finds States Lowered "Proficiency" Bar
ERIC Educational Resources Information Center
Viadero, Debra
2009-01-01
With 2014 approaching as the deadline by which states must get all their students up to "proficient" levels on state tests, a study released last week by the U.S. Department of Education's top statistics agency suggests that some states may have lowered student-proficiency standards on such tests in recent years. For the 47-state study,…
40 CFR 1048.510 - What transient duty cycles apply for laboratory testing?
Code of Federal Regulations, 2013 CFR
2013-07-01
... model year, measure emissions by testing the engine on a dynamometer with the duty cycle described in Appendix II to determine whether it meets the transient emission standards in § 1048.101(a). (b) Calculate cycle statistics and compare with the established criteria as specified in 40 CFR 1065.514 to confirm...
40 CFR 1048.510 - What transient duty cycles apply for laboratory testing?
Code of Federal Regulations, 2011 CFR
2011-07-01
... model year, measure emissions by testing the engine on a dynamometer with the duty cycle described in Appendix II to determine whether it meets the transient emission standards in § 1048.101(a). (b) Calculate cycle statistics and compare with the established criteria as specified in 40 CFR 1065.514 to confirm...
40 CFR 1048.510 - What transient duty cycles apply for laboratory testing?
Code of Federal Regulations, 2014 CFR
2014-07-01
... model year, measure emissions by testing the engine on a dynamometer with the duty cycle described in Appendix II to determine whether it meets the transient emission standards in § 1048.101(a). (b) Calculate cycle statistics and compare with the established criteria as specified in 40 CFR 1065.514 to confirm...
40 CFR 1048.510 - What transient duty cycles apply for laboratory testing?
Code of Federal Regulations, 2012 CFR
2012-07-01
... model year, measure emissions by testing the engine on a dynamometer with the duty cycle described in Appendix II to determine whether it meets the transient emission standards in § 1048.101(a). (b) Calculate cycle statistics and compare with the established criteria as specified in 40 CFR 1065.514 to confirm...
The Performance of Methods to Test Upper-Level Mediation in the Presence of Nonnormal Data
ERIC Educational Resources Information Center
Pituch, Keenan A.; Stapleton, Laura M.
2008-01-01
A Monte Carlo study compared the statistical performance of standard and robust multilevel mediation analysis methods to test indirect effects for a cluster randomized experimental design under various departures from normality. The performance of these methods was examined for an upper-level mediation process, where the indirect effect is a fixed…
For Tests That Are Predictively Powerful and without Social Prejudice
ERIC Educational Resources Information Center
Soares, Joseph A.
2012-01-01
In Philip Pullman's His Dark Materials trilogy, there is a golden compass that in the hands of the right person is predictively powerful; the same was supposed to be true of the SAT/ACT, the statistically indistinguishable standardized tests for college admissions. They were intended to be reliable mechanisms for identifying future trajectories,…
ERIC Educational Resources Information Center
DiLuzio, Geneva J.; And Others
This document accompanies Conceptual Learning and Development Assessment Series II: Cutting Tool, a test constructed to chart the conceptual development of individuals. As a technical manual, it contains information on the rationale, development, standardization, and reliability of the test, as well as essential information and statistical data…
ERIC Educational Resources Information Center
DiLuzio, Geneva J.; And Others
This document accompanies the Conceptual Learning and Development Assessment Series III: Tree, a test constructed to chart the conceptual development of individuals. As a technical manual, it contains information on the rationale, development, standardization, and reliability of the test, as well as essential information and statistical data for…
ERIC Educational Resources Information Center
DiLuzio, Geneva J.; And Others
This document accompanies the Conceptual Learning and Development Assessment Series IV: Noun, a test constructed to chart the conceptual development of individuals. As a technical manual, it contains information on the rationale, development, standardization, and reliability of the test, as well as essential information and statistical data for…
Statistical inference and Aristotle's Rhetoric.
Macdonald, Ranald R
2004-11-01
Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes: incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way, in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference, and there is a good reason for calling this form of statistical inference classical.
Saliba, Georges; Saleh, Rawad; Zhao, Yunliang; Presto, Albert A; Lambe, Andrew T; Frodin, Bruce; Sardar, Satya; Maldonado, Hector; Maddox, Christine; May, Andrew A; Drozd, Greg T; Goldstein, Allen H; Russell, Lynn M; Hagen, Fabian; Robinson, Allen L
2017-06-06
Recent increases in the Corporate Average Fuel Economy standards have led to widespread adoption of vehicles equipped with gasoline direct-injection (GDI) engines. Changes in engine technologies can alter emissions. To quantify these effects, we measured gas- and particle-phase emissions from 82 light-duty gasoline vehicles recruited from the California in-use fleet tested on a chassis dynamometer using the cold-start unified cycle. The fleet included 15 GDI vehicles, including 8 GDIs certified to the most-stringent emissions standard, superultra-low-emission vehicles (SULEV). We quantified the effects of engine technology, emission certification standards, and cold-start on emissions. For vehicles certified to the same emissions standard, there is no statistical difference of regulated gas-phase pollutant emissions between PFIs and GDIs. However, GDIs had, on average, a factor of 2 higher particulate matter (PM) mass emissions than PFIs due to higher elemental carbon (EC) emissions. SULEV certified GDIs have a factor of 2 lower PM mass emissions than GDIs certified as ultralow-emission vehicles (3.0 ± 1.1 versus 6.3 ± 1.1 mg/mi), suggesting improvements in engine design and calibration. Comprehensive organic speciation revealed no statistically significant differences in the composition of the volatile organic compounds emissions between PFI and GDIs, including benzene, toluene, ethylbenzene, and xylenes (BTEX). Therefore, the secondary organic aerosol and ozone formation potential of the exhaust does not depend on engine technology. Cold-start contributes a larger fraction of the total unified cycle emissions for vehicles meeting more-stringent emission standards. Organic gas emissions were the most sensitive to cold-start compared to the other pollutants tested here. There were no statistically significant differences in the effects of cold-start on GDIs and PFIs. 
For our test fleet, the measured 14.5% decrease in CO2 emissions from GDIs was much greater than the potential climate forcing associated with their higher black carbon emissions. Thus, switching from PFI to GDI vehicles will likely lead to a reduction in net global warming.
Study of statistical coding for digital TV
NASA Technical Reports Server (NTRS)
Gardenhire, L. W.
1972-01-01
The results are presented for a detailed study to determine a pseudo-optimum statistical code to be installed in a digital TV demonstration test set. Studies of source encoding were undertaken, using redundancy removal techniques in which the picture is reproduced within a preset tolerance. A method of source encoding, which preliminary studies show to be encouraging, is statistical encoding. A pseudo-optimum code was defined and the associated performance of the code was determined. The format was fixed at 525 lines per frame, 30 frames per second, as per commercial standards.
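Statistical encoding of the kind described assigns short codewords to frequently occurring symbols, and Huffman coding is the classic construction of such a pseudo-optimum code. The sketch below computes optimal code lengths for an illustrative symbol-frequency table (the symbols and frequencies are hypothetical, not from the study).

```python
import heapq

def huffman_lengths(freqs):
    """Codeword lengths of an optimal (Huffman) statistical code for
    the given symbol frequencies.  Each heap entry carries the subtree
    weight, a tiebreak id, and a map from symbol to current depth;
    merging the two lightest subtrees deepens their symbols by one."""
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in {**c1, **c2}.items()}
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

lengths = huffman_lengths({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5})
```

With these frequencies the average codeword length is 2.24 bits per symbol, versus 3 bits for a fixed-length code over six symbols, which is the redundancy-removal gain a statistical code buys.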
Placek, Sarah B; Franklin, Brenton R; Haviland, Sarah M; Wagner, Mercy D; O'Donnell, Mary T; Cryer, Chad T; Trinca, Kristen D; Silverman, Elliott; Matthew Ritter, E
2017-06-01
Using previously established mastery learning standards, this study compares outcomes of training on standard FLS equipment with training on an ergonomically different (ED-FLS), but more portable, lower cost platform. Subjects completed a pre-training FLS skills test on the standard platform and were then randomized to train on the standard FLS training platform (n = 20) or the ED-FLS platform (n = 19). A post-training FLS skills test was administered to both groups on the standard FLS platform. Group performance on the pretest was similar. Fifty percent of FLS and 32% of ED-FLS subjects completed the entire curriculum. 100% of subjects completing the curriculum achieved passing scores on the post-training test. There was no statistically discernible difference in scores on the final FLS exam (FLS 93.4, ED-FLS 93.3, p = 0.98) or in the number of training sessions required to complete the curriculum (FLS 7.4, ED-FLS 9.8, p = 0.13). These results show that when applying mastery learning theory to an ergonomically different platform, skill transfer occurs at a high level and prepares subjects to pass the standard FLS skills test.
An accurate test for homogeneity of odds ratios based on Cochran's Q-statistic.
Kulinskaya, Elena; Dollinger, Michael B
2015-06-10
A frequently used statistic for testing homogeneity in a meta-analysis of K independent studies is Cochran's Q. For a standard test of homogeneity the Q statistic is referred to a chi-square distribution with K-1 degrees of freedom. For the situation in which the effects of the studies are logarithms of odds ratios, the chi-square distribution is much too conservative for moderate size studies, although it may be asymptotically correct as the individual studies become large. Using a mixture of theoretical results and simulations, we provide formulas to estimate the shape and scale parameters of a gamma distribution to fit the distribution of Q. Simulation studies show that the gamma distribution is a good approximation to the distribution for Q. Use of the gamma distribution instead of the chi-square distribution for Q should eliminate inaccurate inferences in assessing homogeneity in a meta-analysis. (A computer program for implementing this test is provided.) This hypothesis test is competitive with the Breslow-Day test both in accuracy of level and in power.
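Cochran's Q itself is simple to compute from study effects and standard errors; the accuracy question the paper addresses concerns only the reference distribution. The sketch below computes Q with inverse-variance weights; the log odds ratios and standard errors are hypothetical values for illustration.

```python
def cochrans_q(effects, std_errors):
    """Cochran's Q for K study effects (e.g. log odds ratios) with
    inverse-variance weights: Q = sum_i w_i * (theta_i - theta_bar)^2,
    where theta_bar is the weighted mean.  The standard homogeneity
    test refers Q to chi-square with K-1 df; the paper argues a fitted
    gamma distribution is more accurate for moderate-size studies."""
    w = [1.0 / se**2 for se in std_errors]
    theta_bar = sum(wi * t for wi, t in zip(w, effects)) / sum(w)
    return sum(wi * (t - theta_bar) ** 2 for wi, t in zip(w, effects))

# Hypothetical log odds ratios and standard errors from three studies
q = cochrans_q([0.2, 0.5, -0.1], [0.2, 0.25, 0.3])
```

Here K = 3, so the standard test would compare q against a chi-square distribution with 2 degrees of freedom, while the paper's proposal would use a gamma distribution with moment-matched shape and scale.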
Liu, Yan; Yang, Dong; Xiong, Fen; Yu, Lan; Ji, Fei; Wang, Qiu-Ju
2015-09-01
Hearing loss affects more than 27 million people in mainland China. It would be helpful to develop a portable and self-testing audiometer for the timely detection of hearing loss so that the optimal clinical therapeutic schedule can be determined. The objective of this study was to develop a software-based hearing self-testing system. The software-based self-testing system consisted of a notebook computer, an external sound card, and a pair of 10-Ω insert earphones. The system could be used to test the hearing thresholds by individuals themselves in an interactive manner using software. The reliability and validity of the system at octave frequencies of 0.25 to 8.0 kHz were analyzed in three series of experiments. Thirty-seven normal-hearing participants (74 ears) were enrolled in experiment 1. Forty individuals (80 ears) with sensorineural hearing loss (SNHL) participated in experiment 2. Thirteen normal-hearing participants (26 ears) and 37 participants (74 ears) with SNHL were enrolled in experiment 3. Each participant was enrolled in only one of the three experiments. In all experiments, pure-tone audiometry in a sound insulation room (standard test) was regarded as the gold standard. SPSS for Windows, version 17.0, was used for statistical analysis. The paired t-test was used to compare the hearing thresholds between the standard test and software-based self-testing (self-test) in experiments 1 and 2. In experiment 3 (main study), one-way analysis of variance and post hoc comparisons were used to compare the hearing thresholds among the standard test and two rounds of the self-test. Linear correlation analysis was carried out for the self-tests performed twice. The concordance was analyzed between the standard test and the self-test using the kappa method. p < 0.05 was considered statistically significant.
Experiments 1 and 2: The hearing thresholds determined by the two methods were not significantly different at frequencies of 250, 500, or 8000 Hz (p > 0.05) but were significantly different at frequencies of 1000, 2000, and 4000 Hz (p < 0.05), except for 1000 Hz in the right ear in experiment 2. Experiment 3: The hearing thresholds determined by the standard test and self-tests repeated twice were not significantly different at any frequency (p > 0.05). The overall sensitivity of the self-test method was 97.6%, and the specificity was 98.3%. The sensitivity was 97.6% and the specificity was 97% for the patients with SNHL. The self-test had significant concordance with the standard test (kappa value = 0.848, p < 0.001). This portable hearing self-testing system based on a notebook personal computer is a reliable and sensitive method for hearing threshold assessment and monitoring.
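The kappa concordance reported above corrects observed agreement for the agreement expected by chance. A minimal sketch follows, using a hypothetical 2x2 pass/fail agreement table between self-test and standard test (not the study's data).

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table: the observed
    agreement po, corrected for chance agreement pe computed from
    the row and column marginal proportions."""
    n = sum(sum(row) for row in table)
    k = len(table)
    po = sum(table[i][i] for i in range(k)) / n
    pe = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(k)
    ) / n**2
    return (po - pe) / (1 - pe)

# Hypothetical agreement: rows = self-test result, columns = standard test
kappa = cohens_kappa([[40, 5], [3, 52]])
```

A kappa near 0.85, as in the study, indicates almost perfect agreement on conventional benchmarks, well above the raw agreement rate once chance is discounted.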
The use of analysis of variance procedures in biological studies
Williams, B.K.
1987-01-01
The analysis of variance (ANOVA) is widely used in biological studies, yet there remains considerable confusion among researchers about the interpretation of hypotheses being tested. Ambiguities arise when statistical designs are unbalanced, and in particular when not all combinations of design factors are represented in the data. This paper clarifies the relationship among hypothesis testing, statistical modelling and computing procedures in ANOVA for unbalanced data. A simple two-factor fixed effects design is used to illustrate three common parametrizations for ANOVA models, and some associations among these parametrizations are developed. Biologically meaningful hypotheses for main effects and interactions are given in terms of each parametrization, and procedures for testing the hypotheses are described. The standard statistical computing procedures in ANOVA are given along with their corresponding hypotheses. Throughout the development unbalanced designs are assumed and attention is given to problems that arise with missing cells.
Assessment of capillary suction time (CST) test methodologies.
Sawalha, O; Scholz, M
2007-12-01
The capillary suction time (CST) test is a commonly used method to measure the filterability and the ease of removing moisture from slurry and sludge in numerous environmental and industrial applications. This study assessed several novel alterations of both the test methodology and the current standard capillary suction time (CST) apparatus. Twelve different papers including the standard Whatman No. 17 chromatographic paper were tested. The tests were run using four different types of sludge including a synthetic sludge, which was specifically developed for benchmarking purposes. The standard apparatus was altered by the introduction of a novel rectangular funnel instead of a standard circular one. A stirrer was also introduced to solve the problem of test inconsistency (e.g. high CST variability), particularly for heavy types of sludge. Results showed that several alternative papers, which are cheaper than the standard paper, can be used to estimate CST values accurately, and that the test repeatability can be improved in many cases and for different types of sludge. The introduction of the rectangular funnel demonstrated an obvious enhancement of test repeatability. The use of a stirrer to avoid sedimentation of heavy sludge did not have a statistically significant impact on the CST values or the corresponding data variability. The application of synthetic sludge can support the testing of experimental methodologies and should be used for subsequent benchmarking purposes.
Comparison of Breast Density Between Synthesized Versus Standard Digital Mammography.
Haider, Irfanullah; Morgan, Matthew; McGow, Anna; Stein, Matthew; Rezvani, Maryam; Freer, Phoebe; Hu, Nan; Fajardo, Laurie; Winkler, Nicole
2018-06-12
To evaluate perceptual difference in breast density classification using synthesized mammography (SM) compared with standard or full-field digital mammography (FFDM) for screening. This institutional review board-approved, retrospective, multireader study evaluated breast density in 200 patients who underwent a baseline screening mammogram during which both SM and FFDM were obtained contemporaneously from June 1, 2016, through November 30, 2016. Qualitative breast density was independently assigned by seven readers initially evaluating FFDM alone. Then, in a separate session, the same readers assigned breast density using synthetic views alone on the same 200 patients. The readers were again blinded to each other's assignments. Qualitative density assessment was based on BI-RADS fifth edition. Interreader agreement was evaluated with the κ statistic using 95% confidence intervals. Testing for homogeneity in paired proportions was performed using McNemar's test with a level of significance of .05. Across the SM and standard 2-D data sets, McNemar's test (P = 0.32) showed that the minimal density transitions between FFDM and SM were not statistically significant. Taking clinical significance into account, only 8 of 200 (4%) patients had a clinically significant transition (dense versus not dense). There was substantial interreader agreement, with an overall κ in FFDM of 0.71 (minimum 0.53, maximum 0.81) and an overall SM κ average of 0.63 (minimum 0.56, maximum 0.87). Overall, subjective breast density assignment by radiologists on SM is similar to density assignment on standard 2-D mammograms. Copyright © 2018 American College of Radiology. Published by Elsevier Inc. All rights reserved.
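The paired-proportions test used in the study above compares only the discordant readings (patients classified "dense" on one modality but "not dense" on the other). A minimal sketch of the exact (binomial) variant of McNemar's test follows, with hypothetical counts; the study itself may have used the asymptotic chi-square form:

```python
from math import comb

def mcnemar_exact_p(b, c):
    """Exact two-sided McNemar p-value from the discordant-pair counts:
    b readings moved 'not dense' -> 'dense', c moved the other way.
    Under H0 each discordant pair falls either way with probability 0.5."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # two-sided p: double the smaller binomial tail, capped at 1
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical: of 200 paired density readings, 5 shifted one way, 3 the other.
p = mcnemar_exact_p(5, 3)
print(round(p, 3))  # 0.727
```

Only the 8 discordant pairs enter the statistic; the 192 concordant readings contribute nothing, which is why even a large paired study can be insensitive to rare density shifts.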
ERIC Educational Resources Information Center
Crain, Robert L.; Hawley, Willis D.
1982-01-01
Criticizes James Coleman's study, "Public and Private Schools," and points out methodological weaknesses in sampling, testing, data reliability, and statistical methods. Questions assumptions which have led to conclusions justifying federal support, especially tuition tax credits, to private schools. Raises the issue of ethical standards…
ERIC Educational Resources Information Center
Federal Trade Commission, Washington, DC. Bureau of Consumer Protection.
The effect of commercial coaching on Scholastic Aptitude Test (SAT) scores was analyzed, using 1974-1977 test results of 2,500 non-coached students and 1,568 enrollees in two coaching schools. (The Stanley H. Kaplan Educational Center, Inc., and the Test Preparation Center, Inc.). Multiple regression analysis was used to control for student…
Integrating Formal Methods and Testing 2002
NASA Technical Reports Server (NTRS)
Cukic, Bojan
2002-01-01
Traditionally, qualitative program verification methodologies and program testing are studied in separate research communities. Neither alone is powerful and practical enough to provide sufficient confidence in ultra-high reliability assessment when used exclusively. Significant advances can be made by accounting not only for formal verification and program testing, but also for the impact of many other standard V&V techniques, in a unified software reliability assessment framework. The first year of this research resulted in a statistical framework that, given the assumptions on the success of the qualitative V&V and QA procedures, significantly reduces the amount of testing needed to confidently assess reliability at so-called high and ultra-high levels (10^-4 or higher). The coming years shall address methodologies to realistically estimate the impacts of various V&V techniques on system reliability and include the impact of operational risk in reliability assessment. The objectives are to: A) combine formal correctness verification, process and product metrics, and other standard qualitative software assurance methods with statistical testing, with the aim of gaining higher confidence in software reliability assessment for high-assurance applications; B) quantify the impact of these methods on software reliability; C) demonstrate that accounting for the effectiveness of these methods reduces the number of tests needed to attain a given confidence level; and D) quantify and justify the reliability estimate for systems developed using various methods.
NASA Astrophysics Data System (ADS)
Prejean-Harris, Rose M.
Over the last decade, accountability has been the driving force for many changes in education in the United States. One major educational reform effort is the standards-based movement, which focuses on combining a number of processes that align curriculum, instruction, assessment and feedback to specific standards that are measurable and indicative of student achievement. The purpose of this study is to determine whether the type of report card is a possible predictor of third grade student achievement on the 2012 Criterion-Referenced Competency Test (CRCT) in mathematics and science. The results indicated that the difference in test scores in mathematics and science between students in the traditional report card group and students in the standards-based report card group was not statistically significant when controlling for poverty level, school locale, and school district. However, students in the traditional report card group scored an average of 1.01 points higher in mathematics and 2.27 points higher in science than students in the standards-based report card group.
Krefeld-Schwalb, Antonia; Witte, Erich H.; Zenker, Frank
2018-01-01
In psychology as elsewhere, the main statistical inference strategy to establish empirical effects is null-hypothesis significance testing (NHST). The recent failure to replicate allegedly well-established NHST-results, however, implies that such results lack sufficient statistical power, and thus feature unacceptably high error-rates. Using data-simulation to estimate the error-rates of NHST-results, we advocate the research program strategy (RPS) as a superior methodology. RPS integrates Frequentist with Bayesian inference elements, and leads from a preliminary discovery against a (random) H0-hypothesis to a statistical H1-verification. Not only do RPS-results feature significantly lower error-rates than NHST-results, RPS also addresses key-deficits of a “pure” Frequentist and a standard Bayesian approach. In particular, RPS aggregates underpowered results safely. RPS therefore provides a tool to regain the trust the discipline had lost during the ongoing replicability-crisis. PMID:29740363
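The error-rate estimation by data simulation that the abstract describes can be illustrated with a minimal sketch (all settings hypothetical: a two-group t-test with n = 10 per group and a true standardized effect of d = 0.5 when H1 holds). It shows why a lone "significant" result from an underpowered design carries little evidential weight: the hit rate under H1 is only a few times the false-positive rate under H0.

```python
import random, statistics

random.seed(4)

N = 10                   # per-group sample size (hypothetical, underpowered)
T_CRIT = 2.101           # two-sided 5% critical value of Student's t, df = 18
D_TRUE = 0.5             # standardized effect size when H1 is true

def significant(delta):
    # One simulated two-group study; returns True if p < .05 (two-sided).
    a = [random.gauss(0.0, 1.0) for _ in range(N)]
    b = [random.gauss(delta, 1.0) for _ in range(N)]
    sp = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    t = (statistics.fmean(b) - statistics.fmean(a)) / (sp * (2 / N) ** 0.5)
    return abs(t) > T_CRIT

reps = 10000
alpha_hat = sum(significant(0.0) for _ in range(reps)) / reps     # Type I rate
power_hat = sum(significant(D_TRUE) for _ in range(reps)) / reps  # ~0.2 here
print(round(alpha_hat, 3), round(power_hat, 3))
```

With power near 0.2, many true effects go undetected and a nontrivial share of significant findings come from true nulls, which is the deficit that aggregation strategies such as RPS aim to repair.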
Exact goodness-of-fit tests for Markov chains.
Besag, J; Mondal, D
2013-06-01
Goodness-of-fit tests are useful in assessing whether a statistical model is consistent with available data. However, the usual χ² asymptotics often fail, either because of the paucity of the data or because a nonstandard test statistic is of interest. In this article, we describe exact goodness-of-fit tests for first- and higher order Markov chains, with particular attention given to time-reversible ones. The tests are obtained by conditioning on the sufficient statistics for the transition probabilities and are implemented by simple Monte Carlo sampling or by Markov chain Monte Carlo. They apply both to single and to multiple sequences and allow a free choice of test statistic. Three examples are given. The first concerns multiple sequences of dry and wet January days for the years 1948-1983 at Snoqualmie Falls, Washington State, and suggests that standard analysis may be misleading. The second one is for a four-state DNA sequence and lends support to the original conclusion that a second-order Markov chain provides an adequate fit to the data. The last one is six-state atomistic data arising in molecular conformational dynamics simulation of solvated alanine dipeptide and points to strong evidence against a first-order reversible Markov chain at 6 picosecond time steps. © 2013, The International Biometric Society.
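The paper's exact tests condition on the sufficient statistics for the transition probabilities; as a simplified illustration only, the sketch below calibrates a Pearson-type statistic for a two-state wet/dry chain by plain Monte Carlo under a fully specified null (a parametric-bootstrap approximation, not the exact conditional test described above). The transition matrix and the "observed" sequence are hypothetical.

```python
import random

random.seed(1)

def transition_counts(seq, k=2):
    n = [[0] * k for _ in range(k)]
    for a, b in zip(seq, seq[1:]):
        n[a][b] += 1
    return n

def chi2_stat(seq, p, k=2):
    # Pearson X^2 comparing observed transition counts with those expected
    # under the null transition matrix p (expected = row total * p[a][b]).
    n = transition_counts(seq, k)
    x2 = 0.0
    for a in range(k):
        row = sum(n[a])
        for b in range(k):
            if p[a][b] > 0:
                e = row * p[a][b]
                x2 += (n[a][b] - e) ** 2 / e
    return x2

def simulate(p, length, k=2):
    s = [0]
    for _ in range(length - 1):
        s.append(random.choices(range(k), weights=p[s[-1]])[0])
    return s

# H0: a first-order chain with this hypothetical wet/dry transition matrix.
p0 = [[0.7, 0.3], [0.4, 0.6]]
observed = simulate(p0, 200)          # stands in for a real record
t_obs = chi2_stat(observed, p0)
# Monte Carlo reference distribution under H0, with the usual +1 correction.
t_ref = [chi2_stat(simulate(p0, 200), p0) for _ in range(999)]
p_value = (1 + sum(t >= t_obs for t in t_ref)) / (1 + len(t_ref))
print(round(p_value, 3))
```

The Monte Carlo p-value is valid for any choice of test statistic and any sequence length, which is the practical advantage over χ² asymptotics that the abstract emphasizes; the exact version would additionally resample only sequences sharing the observed sufficient statistics.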
Obuchowski, Nancy A; Buckler, Andrew; Kinahan, Paul; Chen-Mayer, Heather; Petrick, Nicholas; Barboriak, Daniel P; Bullen, Jennifer; Barnhart, Huiman; Sullivan, Daniel C
2016-04-01
A major initiative of the Quantitative Imaging Biomarker Alliance is to develop standards-based documents called "Profiles," which describe one or more technical performance claims for a given imaging modality. The term "actor" denotes any entity (device, software, or person) whose performance must meet certain specifications for the claim to be met. The objective of this paper is to present the statistical issues in testing actors' conformance with the specifications. In particular, we present the general rationale and interpretation of the claims, the minimum requirements for testing whether an actor achieves the performance requirements, the study designs used for testing conformity, and the statistical analysis plan. We use three examples to illustrate the process: apparent diffusion coefficient in solid tumors measured by MRI, change in Perc 15 as a biomarker for the progression of emphysema, and percent change in solid tumor volume by computed tomography as a biomarker for lung cancer progression. Copyright © 2016 The Association of University Radiologists. All rights reserved.
Harnessing Multivariate Statistics for Ellipsoidal Data in Structural Geology
NASA Astrophysics Data System (ADS)
Roberts, N.; Davis, J. R.; Titus, S.; Tikoff, B.
2015-12-01
Most structural geology articles do not state significance levels, report confidence intervals, or perform regressions to find trends. This is, in part, because structural data tend to include directions, orientations, ellipsoids, and tensors, which are not treatable by elementary statistics. We describe a full procedural methodology for the statistical treatment of ellipsoidal data. We use a reconstructed dataset of deformed ooids in Maryland from Cloos (1947) to illustrate the process. Normalized ellipsoids have five degrees of freedom and can be represented by a second order tensor. This tensor can be permuted into a five dimensional vector that belongs to a vector space and can be treated with standard multivariate statistics. Cloos made several claims about the distribution of deformation in the South Mountain fold, Maryland, and we reexamine two particular claims using hypothesis testing: 1) octahedral shear strain increases towards the axial plane of the fold; 2) finite strain orientation varies systematically along the trend of the axial trace as it bends with the Appalachian orogen. We then test the null hypothesis that the southern segment of South Mountain is the same as the northern segment. This test illustrates the application of ellipsoidal statistics, which combine both orientation and shape. We report confidence intervals for each test, and graphically display our results with novel plots. This poster illustrates the importance of statistics in structural geology, especially when working with noisy or small datasets.
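The tensor-to-vector embedding mentioned above can be sketched as follows. The coordinate choice here is ours for illustration (not necessarily the authors'), but it is norm-preserving, so Euclidean distances between 5-vectors equal Frobenius distances between tensors; the ooid axial ratios are hypothetical.

```python
import math

# A unit-volume ellipsoid can be encoded as a symmetric, trace-free 3x3
# log-shape tensor, which has five degrees of freedom.
def to_vector5(T):
    # T: symmetric 3x3 with zero trace; the scaling factors make the map
    # an isometry for the Frobenius inner product.
    return [
        (T[0][0] - T[1][1]) / math.sqrt(2),
        T[2][2] * math.sqrt(1.5),
        T[0][1] * math.sqrt(2),
        T[0][2] * math.sqrt(2),
        T[1][2] * math.sqrt(2),
    ]

def ooid_tensor(log_axes):
    # Hypothetical input: log semi-axis lengths of a deformed ooid,
    # normalized to unit volume (zero trace).
    m = sum(log_axes) / 3.0
    l = [x - m for x in log_axes]
    return [[l[0], 0, 0], [0, l[1], 0], [0, 0, l[2]]]

v1 = to_vector5(ooid_tensor([math.log(2.0), math.log(1.0), math.log(0.6)]))
v2 = to_vector5(ooid_tensor([math.log(1.8), math.log(1.1), math.log(0.55)]))
# v1, v2 now live in an ordinary vector space: means, covariances, and
# standard multivariate hypothesis tests apply directly.
print(round(math.dist(v1, v2), 4))
```

Once each measured ellipsoid is a point in this 5-dimensional space, the hypothesis tests the poster describes (e.g. "northern segment = southern segment") reduce to standard multivariate comparisons of two samples of vectors.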
Intercorrelations of Anthropometric Measurements: A Source Book for USA Data
1978-05-01
and most important of the statistical measures after the arithmetic mean and standard deviation. The coefficient was devised and developed by Francis Galton and Karl Pearson in the last decades of the nineteenth century as a measure of the degree of interrelationship or concomitant variation of a ... paragraphs--in a wide variety of formulas such as ones for tests of statistical significance and for discriminant functions. Correlation coefficients are
Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha
2017-01-01
The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts, limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Rochon, Justine; Kieser, Meinhard
2011-11-01
Student's one-sample t-test is a commonly used method when inference about the population mean is made. As advocated in textbooks and articles, the assumption of normality is often checked by a preliminary goodness-of-fit (GOF) test. In a paper recently published by Schucany and Ng it was shown that, for the uniform distribution, screening of samples by a pretest for normality leads to a more conservative conditional Type I error rate than application of the one-sample t-test without preliminary GOF test. In contrast, for the exponential distribution, the conditional level is even more elevated than the Type I error rate of the t-test without pretest. We examine the reasons behind these characteristics. In a simulation study, samples drawn from the exponential, lognormal, uniform, Student's t-distribution with 2 degrees of freedom (t(2) ) and the standard normal distribution that had passed normality screening, as well as the ingredients of the test statistics calculated from these samples, are investigated. For non-normal distributions, we found that preliminary testing for normality may change the distribution of means and standard deviations of the selected samples as well as the correlation between them (if the underlying distribution is non-symmetric), thus leading to altered distributions of the resulting test statistics. It is shown that for skewed distributions the excess in Type I error rate may be even more pronounced when testing one-sided hypotheses. ©2010 The British Psychological Society.
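The conditional Type I error rates discussed above can be explored with a small simulation sketch. All settings here are hypothetical: a crude sample-skewness screen stands in for a formal GOF pretest, the cutoff 0.8 is arbitrary, and n = 20 is fixed so the t critical value can be hardcoded.

```python
import random, statistics

random.seed(2)
N = 20
T_CRIT = 2.093          # two-sided 5% critical value of Student's t, df = 19

def skewness(xs):
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def t_test_rejects(xs, mu0):
    t = (statistics.fmean(xs) - mu0) / (statistics.stdev(xs) / N ** 0.5)
    return abs(t) > T_CRIT

# Draw Exp(1) samples (true mean 1, so every rejection is a Type I error),
# keep only those passing the crude symmetry screen, and record the
# conditional rejection rate of the t-test among the survivors.
passed = rejected = 0
for _ in range(20000):
    xs = [random.expovariate(1.0) for _ in range(N)]
    if abs(skewness(xs)) < 0.8:      # hypothetical screening cutoff
        passed += 1
        rejected += t_test_rejects(xs, 1.0)
print(passed, round(rejected / passed, 4))
```

The mechanism matches the paper's explanation: for a skewed parent distribution, the samples that survive a symmetry/normality screen are a selected subset whose means, standard deviations, and their correlation differ from those of unscreened samples, so the conditional level of the subsequent t-test is distorted.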
Evaluation of "e-rater"® for the "Praxis I"®Writing Test. Research Report. ETS RR-15-03
ERIC Educational Resources Information Center
Ramineni, Chaitanya; Trapani, Catherine S.; Williamson, David M.
2015-01-01
Automated scoring models were trained and evaluated for the essay task in the "Praxis I"® writing test. Prompt-specific and generic "e-rater"® scoring models were built, and evaluation statistics, such as quadratic weighted kappa, Pearson correlation, and standardized differences in mean scores, were examined to evaluate the…
Advanced Combat Helmet Technical Assessment
2013-05-29
Lastly, we assessed the participation of various stakeholders and industry experts such as active ACH manufacturers and test facilities. Findings... industrially accepted American National Standards Institute (ANSI Z1.4-2008) sampling... statistically principled approach, and the lot acceptance test protocol adopts a widely established and industrially accepted sampling procedure. We
BTS statistical standards manual
DOT National Transportation Integrated Search
2005-10-01
The Bureau of Transportation Statistics (BTS), like other federal statistical agencies, establishes professional standards to guide the methods and procedures for the collection, processing, storage, and presentation of statistical data. Standards an...
Quantitative imaging biomarkers: a review of statistical methods for computer algorithm comparisons.
Obuchowski, Nancy A; Reeves, Anthony P; Huang, Erich P; Wang, Xiao-Feng; Buckler, Andrew J; Kim, Hyun J Grace; Barnhart, Huiman X; Jackson, Edward F; Giger, Maryellen L; Pennello, Gene; Toledano, Alicia Y; Kalpathy-Cramer, Jayashree; Apanasovich, Tatiyana V; Kinahan, Paul E; Myers, Kyle J; Goldgof, Dmitry B; Barboriak, Daniel P; Gillies, Robert J; Schwartz, Lawrence H; Sullivan, Daniel C
2015-02-01
Quantitative biomarkers from medical images are becoming important tools for clinical diagnosis, staging, monitoring, treatment planning, and development of new therapies. While there is a rich history of the development of quantitative imaging biomarker (QIB) techniques, little attention has been paid to the validation and comparison of the computer algorithms that implement the QIB measurements. In this paper we provide a framework for QIB algorithm comparisons. We first review and compare various study designs, including designs with the true value (e.g. phantoms, digital reference images, and zero-change studies), designs with a reference standard (e.g. studies testing equivalence with a reference standard), and designs without a reference standard (e.g. agreement studies and studies of algorithm precision). The statistical methods for comparing QIB algorithms are then presented for various study types using both aggregate and disaggregate approaches. We propose a series of steps for establishing the performance of a QIB algorithm, identify limitations in the current statistical literature, and suggest future directions for research. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Loring, David W; Larrabee, Glenn J
2006-06-01
The Halstead-Reitan Battery has been instrumental in the development of neuropsychological practice in the United States. Although Reitan administered both the Wechsler-Bellevue Intelligence Scale and Halstead's test battery when evaluating Halstead's theory of biologic intelligence, the relative sensitivity of each test battery to brain damage continues to be an area of controversy. Because Reitan did not perform direct parametric analysis to contrast group performances, we reanalyze Reitan's original validation data from both Halstead (Reitan, 1955) and Wechsler batteries (Reitan, 1959a) and calculate effect sizes and probability levels using traditional parametric approaches. Eight of the 10 tests comprising Halstead's original Impairment Index, as well as the Impairment Index itself, statistically differentiated patients with unequivocal brain damage from controls. In addition, 13 of 14 Wechsler measures including Full-Scale IQ also differed statistically between groups (Brain Damage Full-Scale IQ = 96.2; Control Group Full Scale IQ = 112.6). We suggest that differences in the statistical properties of each battery (e.g., raw scores vs. standardized scores) likely contribute to classification characteristics including test sensitivity and specificity.
Statistical power comparisons at 3T and 7T with a GO / NOGO task.
Torrisi, Salvatore; Chen, Gang; Glen, Daniel; Bandettini, Peter A; Baker, Chris I; Reynolds, Richard; Yen-Ting Liu, Jeffrey; Leshin, Joseph; Balderston, Nicholas; Grillon, Christian; Ernst, Monique
2018-07-15
The field of cognitive neuroscience is weighing evidence about whether to move from standard field strength to ultra-high field (UHF). The present study contributes to the evidence by comparing a cognitive neuroscience paradigm at 3 Tesla (3T) and 7 Tesla (7T). The goal was to test and demonstrate the practical effects of field strength on a standard GO/NOGO task using accessible preprocessing and analysis tools. Two independent matched healthy samples (N = 31 each) were analyzed at 3T and 7T. Results show gains at 7T in statistical strength, the detection of smaller effects and group-level power. With an increased availability of UHF scanners, these gains may be exploited by cognitive neuroscientists and other neuroimaging researchers to develop more efficient or comprehensive experimental designs and, given the same sample size, achieve greater statistical power at 7T. Published by Elsevier Inc.
Velocity bias in the distribution of dark matter halos
NASA Astrophysics Data System (ADS)
Baldauf, Tobias; Desjacques, Vincent; Seljak, Uroš
2015-12-01
The standard formalism for the coevolution of halos and dark matter predicts that any initial halo velocity bias rapidly decays to zero. We argue that, when the purpose is to compute statistics like power spectra, the coupling in the momentum conservation equation for the biased tracers must be modified. Our new formulation predicts the constancy in time of any statistical halo velocity bias present in the initial conditions, in agreement with peak theory. We test this prediction by studying the evolution of a conserved halo population in N-body simulations. We establish that the initial simulated halo density and velocity statistics show distinct features of the peak model and, thus, deviate from the simple local Lagrangian bias. We demonstrate, for the first time, that the time evolution of their velocity is in tension with the rapid decay expected in the standard approach.
Explorations in statistics: the log transformation.
Curran-Everett, Douglas
2018-06-01
Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This thirteenth installment of Explorations in Statistics explores the log transformation, an established technique that rescales the actual observations from an experiment so that the assumptions of some statistical analysis are better met. A general assumption in statistics is that the variability of some response Y is homogeneous across groups or across some predictor variable X. If the variability (the standard deviation) varies in rough proportion to the mean value of Y, a log transformation can equalize the standard deviations. Moreover, if the actual observations from an experiment conform to a skewed distribution, then a log transformation can make the theoretical distribution of the sample mean more consistent with a normal distribution. This is important: the results of a one-sample t test are meaningful only if the theoretical distribution of the sample mean is roughly normal. If we log-transform our observations, then we want to confirm the transformation was useful. We can do this if we use the Box-Cox method, if we bootstrap the sample mean and the statistic t itself, and if we assess the residual plots from the statistical model of the actual and transformed sample observations.
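The variance-equalizing effect described above is easy to demonstrate with a small sketch (hypothetical data: two groups with multiplicative, lognormal noise, so the standard deviation grows roughly in proportion to the mean):

```python
import math, random, statistics

random.seed(3)

# Two hypothetical groups with multiplicative noise: the standard deviation
# grows in rough proportion to the mean, violating homogeneity of variance.
low  = [10 * random.lognormvariate(0, 0.5) for _ in range(500)]
high = [100 * random.lognormvariate(0, 0.5) for _ in range(500)]

sd_ratio_raw = statistics.stdev(high) / statistics.stdev(low)
sd_ratio_log = (statistics.stdev([math.log(x) for x in high])
                / statistics.stdev([math.log(x) for x in low]))
# On the raw scale the standard deviations differ roughly 10-fold; after
# the log transformation they are nearly equal.
print(round(sd_ratio_raw, 2), round(sd_ratio_log, 2))
```

The log transform also symmetrizes these skewed samples, which is the second benefit the abstract mentions: the sampling distribution of the mean of the transformed data is closer to normal, so the one-sample t test is better justified.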
Hoyer, Annika; Kuss, Oliver
2018-05-01
Meta-analysis of diagnostic studies is still a rapidly developing area of biostatistical research. Especially, there is an increasing interest in methods to compare different diagnostic tests to a common gold standard. Restricting to the case of two diagnostic tests, in these meta-analyses the parameters of interest are the differences of sensitivities and specificities (with their corresponding confidence intervals) between the two diagnostic tests while accounting for the various associations across single studies and between the two tests. We propose statistical models with a quadrivariate response (where sensitivity of test 1, specificity of test 1, sensitivity of test 2, and specificity of test 2 are the four responses) as a sensible approach to this task. Using a quadrivariate generalized linear mixed model naturally generalizes the common standard bivariate model of meta-analysis for a single diagnostic test. If information on several thresholds of the tests is available, the quadrivariate model can be further generalized to yield a comparison of full receiver operating characteristic (ROC) curves. We illustrate our model by an example where two screening methods for the diagnosis of type 2 diabetes are compared.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khachatryan, Vardan
2015-05-13
The standard model of particle physics describes the fundamental particles and their interactions via the strong, electromagnetic and weak forces. It provides precise predictions for measurable quantities that can be tested experimentally. The probabilities, or branching fractions, of the strange B meson (B(s)(0)) and the B(0) meson decaying into two oppositely charged muons (μ+ and μ−) are especially interesting because of their sensitivity to theories that extend the standard model. The standard model predicts that the B(s)(0) → μ+μ− and B(0) → μ+μ− decays are very rare, with about four of the former occurring for every billion B(s)(0) mesons produced, and one of the latter occurring for every ten billion B(0) mesons. A difference in the observed branching fractions with respect to the predictions of the standard model would provide a direction in which the standard model should be extended. Before the Large Hadron Collider (LHC) at CERN started operating, no evidence for either decay mode had been found. Upper limits on the branching fractions were an order of magnitude above the standard model predictions. The CMS (Compact Muon Solenoid) and LHCb (Large Hadron Collider beauty) collaborations have performed a joint analysis of the data from proton–proton collisions that they collected in 2011 at a centre-of-mass energy of seven teraelectronvolts and in 2012 at eight teraelectronvolts. Here we report the first observation of the B(s)(0) → μ+μ− decay, with a statistical significance exceeding six standard deviations, and the best measurement so far of its branching fraction. We then obtained evidence for the B(0) → μ+μ− decay with a statistical significance of three standard deviations. Both measurements are statistically compatible with standard model predictions and allow stringent constraints to be placed on theories beyond the standard model.
The LHC experiments will resume taking data in 2015, recording proton–proton collisions at a centre-of-mass energy of 13 teraelectronvolts, which will approximately double the production rates of B(s)(0) and B(0) mesons and lead to further improvements in the precision of these crucial tests of the standard model.
Observation of the rare B(s)(0) →µ+µ− decay from the combined analysis of CMS and LHCb data.
2015-06-04
The standard model of particle physics describes the fundamental particles and their interactions via the strong, electromagnetic and weak forces. It provides precise predictions for measurable quantities that can be tested experimentally. The probabilities, or branching fractions, of the strange B meson (B(s)(0)) and the B0 meson decaying into two oppositely charged muons (μ+ and μ−) are especially interesting because of their sensitivity to theories that extend the standard model. The standard model predicts that the B(s)(0) →µ+µ− and B(0) →µ+µ− decays are very rare, with about four of the former occurring for every billion mesons produced, and one of the latter occurring for every ten billion B0 mesons. A difference in the observed branching fractions with respect to the predictions of the standard model would provide a direction in which the standard model should be extended. Before the Large Hadron Collider (LHC) at CERN started operating, no evidence for either decay mode had been found. Upper limits on the branching fractions were an order of magnitude above the standard model predictions. The CMS (Compact Muon Solenoid) and LHCb (Large Hadron Collider beauty) collaborations have performed a joint analysis of the data from proton–proton collisions that they collected in 2011 at a centre-of-mass energy of seven teraelectronvolts and in 2012 at eight teraelectronvolts. Here we report the first observation of the B(s)(0) → µ+µ− decay, with a statistical significance exceeding six standard deviations, and the best measurement so far of its branching fraction. Furthermore, we obtained evidence for the B(0) → µ+µ− decay with a statistical significance of three standard deviations. Both measurements are statistically compatible with standard model predictions and allow stringent constraints to be placed on theories beyond the standard model. 
The LHC experiments will resume taking data in 2015, recording proton–proton collisions at a centre-of-mass energy of 13 teraelectronvolts, which will approximately double the production rates of B(s)(0) and B0 mesons and lead to further improvements in the precision of these crucial tests of the standard model.
Powerful Statistical Inference for Nested Data Using Sufficient Summary Statistics
Dowding, Irene; Haufe, Stefan
2018-01-01
Hierarchically-organized data arise naturally in many psychology and neuroscience studies. As the standard assumption of independent and identically distributed samples does not hold for such data, two important problems are to accurately estimate group-level effect sizes, and to obtain powerful statistical tests against group-level null hypotheses. A common approach is to summarize subject-level data by a single quantity per subject, which is often the mean or the difference between class means, and treat these as samples in a group-level t-test. This “naive” approach is, however, suboptimal in terms of statistical power, as it ignores information about the intra-subject variance. To address this issue, we review several approaches to deal with nested data, with a focus on methods that are easy to implement. With what we call the sufficient-summary-statistic approach, we highlight a computationally efficient technique that can improve statistical power by taking into account within-subject variances, and we provide step-by-step instructions on how to apply this approach to a number of frequently-used measures of effect size. The properties of the reviewed approaches and the potential benefits over a group-level t-test are quantitatively assessed on simulated data and demonstrated on EEG data from a simulated-driving experiment. PMID:29615885
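The power gain of the sufficient-summary-statistic approach over the "naive" group-level t-test comes from inverse-variance (precision) weighting of the per-subject estimates. The sketch below shows generic precision weighting of subject-level means; the function name and the z-statistic construction are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def precision_weighted_mean(means, variances):
    """Pool per-subject effect estimates into one group-level statistic,
    weighting each subject by the inverse of its within-subject variance."""
    means = np.asarray(means, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)  # precision weights
    pooled = np.sum(w * means) / np.sum(w)        # weighted group mean
    se = np.sqrt(1.0 / np.sum(w))                 # standard error of the pooled mean
    z = pooled / se                               # z-statistic against H0: true mean = 0
    return pooled, se, z

# A subject measured with smaller within-subject variance gets a larger weight:
pooled, se, z = precision_weighted_mean([0.2, 0.5, 0.3], [0.01, 0.04, 0.02])
```

Subjects measured more precisely contribute more to the pooled estimate, which is where the gain in statistical power over an unweighted comparison of subject means comes from.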
Gauging Skills of Hospital Security Personnel: a Statistically-driven, Questionnaire-based Approach
Rinkoo, Arvind Vashishta; Mishra, Shubhra; Rahesuddin; Nabi, Tauqeer; Chandra, Vidha; Chandra, Hem
2013-01-01
Objectives This study aims to gauge the technical and soft skills of the hospital security personnel so as to enable prioritization of their training needs. Methodology A cross sectional questionnaire based study was conducted in December 2011. Two separate predesigned and pretested questionnaires were used for gauging soft skills and technical skills of the security personnel. Extensive statistical analysis, including Multivariate Analysis (Pillai-Bartlett trace along with Multi-factorial ANOVA) and Post-hoc Tests (Bonferroni Test) was applied. Results The 143 participants performed better on the soft skills front with an average score of 6.43 and standard deviation of 1.40. The average technical skills score was 5.09 with a standard deviation of 1.44. The study avowed a need for formal hands on training with greater emphasis on technical skills. Multivariate analysis of the available data further helped in identifying 20 security personnel who should be prioritized for soft skills training and a group of 36 security personnel who should receive maximum attention during technical skills training. Conclusion This statistically driven approach can be used as a prototype by healthcare delivery institutions worldwide, after situation specific customizations, to identify the training needs of any category of healthcare staff. PMID:23559904
SPSS macros to compare any two fitted values from a regression model.
Weaver, Bruce; Dubois, Sacha
2012-12-01
In regression models with first-order terms only, the coefficient for a given variable is typically interpreted as the change in the fitted value of Y for a one-unit increase in that variable, with all other variables held constant. Therefore, each regression coefficient represents the difference between two fitted values of Y. But the coefficients represent only a fraction of the possible fitted value comparisons that might be of interest to researchers. For many fitted value comparisons that are not captured by any of the regression coefficients, common statistical software packages do not provide the standard errors needed to compute confidence intervals or carry out statistical tests, particularly in more complex models that include interactions, polynomial terms, or regression splines. We describe two SPSS macros that implement a matrix algebra method for comparing any two fitted values from a regression model. The !OLScomp and !MLEcomp macros are for use with models fitted via ordinary least squares and maximum likelihood estimation, respectively. The output from the macros includes the standard error of the difference between the two fitted values, a 95% confidence interval for the difference, and a corresponding statistical test with its p-value.
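The matrix-algebra idea behind the macros can be illustrated outside SPSS: for an OLS fit, the variance of the difference between two fitted values is d'Cov(β̂)d, where d is the difference of the two covariate vectors. The numpy analogue below is a hypothetical sketch, not the macros themselves.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x, x ** 2])      # design with a polynomial term
y = 1.0 + 0.5 * x + rng.normal(size=n)

beta, *_ = np.linalg.lstsq(X, y, rcond=None)      # OLS estimates
resid = y - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])         # residual variance estimate
cov_beta = sigma2 * np.linalg.inv(X.T @ X)        # Cov(beta-hat)

# Compare fitted values at x = 1 versus x = -1:
d = np.array([1.0, 1.0, 1.0]) - np.array([1.0, -1.0, 1.0])
diff = d @ beta                                      # difference of the two fitted values
se_diff = np.sqrt(d @ cov_beta @ d)                  # its standard error
ci = (diff - 1.96 * se_diff, diff + 1.96 * se_diff)  # approximate 95% CI
```

Because the comparison vector d can mix interaction and polynomial columns, this covers fitted-value contrasts that no single regression coefficient captures.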
Mousa, Mohammad F.; Cubbidge, Robert P.; Al-Mansouri, Fatima; Bener, Abdulbari
2014-02-01
Purpose Multifocal visual evoked potential (mfVEP) is a newly introduced method used for objective visual field assessment. Several analysis protocols have been tested to identify early visual field losses in glaucoma patients using the mfVEP technique; some were successful in detection of field defects, which were comparable to the standard automated perimetry (SAP) visual field assessment, and others were not very informative and needed more adjustment and research work. In this study we implemented a novel analysis approach and evaluated its validity and whether it could be used effectively for early detection of visual field defects in glaucoma. Methods Three groups were tested in this study; normal controls (38 eyes), glaucoma patients (36 eyes) and glaucoma suspect patients (38 eyes). All subjects had two standard Humphrey Field Analyzer (HFA) 24-2 tests and a single mfVEP test undertaken in one session. Analysis of the mfVEP results was done using the new analysis protocol; the hemifield sector analysis (HSA) protocol. Analysis of the HFA was done using the standard grading system. Results Analysis of mfVEP results showed that there was a statistically significant difference between the three groups in the mean signal to noise ratio (ANOVA test, p < 0.001 with a 95% confidence interval). The difference between superior and inferior hemispheres in all subjects were statistically significant in the glaucoma patient group in all 11 sectors (t-test, p < 0.001), partially significant in 5 / 11 (t-test, p < 0.01), and no statistical difference in most sectors of the normal group (1 / 11 sectors was significant, t-test, p < 0.9). Sensitivity and specificity of the HSA protocol in detecting glaucoma was 97% and 86%, respectively, and for glaucoma suspect patients the values were 89% and 79%, respectively. Conclusions The new HSA protocol used in the mfVEP testing can be applied to detect glaucomatous visual field defects in both glaucoma and glaucoma suspect patients. Using this protocol can provide information about focal visual field differences across the horizontal midline, which can be utilized to differentiate between glaucoma and normal subjects. Sensitivity and specificity of the mfVEP test showed very promising results and correlated with other anatomical changes in glaucoma field loss. PMID:24511212
Riedl, Verena; Agatz, Annika; Benstead, Rachel; Ashauer, Roman
2018-04-01
Chemical impacts on the environment are routinely assessed in single-species tests. They are employed to measure direct effects on nontarget organisms, but indirect effects on ecological interactions can only be detected in multispecies tests. Micro- and mesocosms are more complex and environmentally realistic, yet they are less frequently used for environmental risk assessment because resource demand is high, whereas repeatability and statistical power are often low. Test systems fulfilling regulatory needs (i.e., standardization, repeatability, and replication) and the assessment of impacts on species interactions and indirect effects are lacking. In the present study we describe the development of the TriCosm, a repeatable aquatic multispecies test with 3 trophic levels and increased statistical power. High repeatability of community dynamics of 3 interacting aquatic populations (algae, Ceriodaphnia, and Hydra) was found with an average coefficient of variation of 19.5% and the ability to determine small effect sizes. The TriCosm combines benefits of both single-species tests (fulfillment of regulatory requirements) and complex multispecies tests (ecological relevance) and can be used, for instance, at an intermediate tier in environmental risk assessment. Furthermore, comparatively quickly generated population and community toxicity data can be useful for the development and testing of mechanistic effect models. Environ Toxicol Chem 2018;37:1051-1060. © 2017 SETAC.
Effect of Accreditation on Accuracy of Diagnostic Tests in Medical Laboratories.
Jang, Mi Ae; Yoon, Young Ahn; Song, Junghan; Kim, Jeong Ho; Min, Won Ki; Lee, Ji Sung; Lee, Yong Wha; Lee, You Kyoung
2017-05-01
Medical laboratories play a central role in health care. Many laboratories are taking a more focused and stringent approach to quality system management. In Korea, laboratory standardization efforts undertaken by the Korean Laboratory Accreditation Program (KLAP) and the Korean External Quality Assessment Scheme (KEQAS) may have facilitated an improvement in laboratory performance, but there are no fundamental studies demonstrating that laboratory standardization is effective. We analyzed the results of the KEQAS to identify significant differences between laboratories with or without KLAP and to determine the impact of laboratory standardization on the accuracy of diagnostic tests. We analyzed KEQAS participant data on clinical chemistry tests such as albumin, ALT, AST, and glucose from 2010 to 2013. As a statistical parameter to assess performance bias between laboratories, we compared 4-yr variance index score (VIS) between the two groups with or without KLAP. Compared with the group without KLAP, the group with KLAP exhibited significantly lower geometric means of 4-yr VIS for all clinical chemistry tests (P<0.0001); this difference justified a high level of confidence in standardized services provided by accredited laboratories. Confidence intervals for the mean of each test in the two groups (accredited and non-accredited) did not overlap, suggesting that the means of the groups are significantly different. These results confirmed that practice standardization is strongly associated with the accuracy of test results. Our study emphasizes the necessity of establishing a system for standardization of diagnostic testing. © The Korean Society for Laboratory Medicine
Estimation of the geochemical threshold and its statistical significance
Miesch, A.T.
1981-01-01
A statistic is proposed for estimating the geochemical threshold and its statistical significance, or it may be used to identify a group of extreme values that can be tested for significance by other means. The statistic is the maximum gap between adjacent values in an ordered array after each gap has been adjusted for the expected frequency. The values in the ordered array are geochemical values transformed by either ln(x − α) or ln(β − x) and then standardized so that the mean is zero and the variance is unity. The expected frequency is taken from a fitted normal curve with unit area. The midpoint of an adjusted gap that exceeds the corresponding critical value may be taken as an estimate of the geochemical threshold, and the associated probability indicates the likelihood that the threshold separates two geochemical populations. The adjusted gap test may fail to identify threshold values if the variation tends to be continuous from background values to the higher values that reflect mineralized ground. However, the test will serve to identify other anomalies that may be too subtle to have been noted by other means. © 1981.
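One possible reading of the adjusted-gap statistic can be sketched as follows. The density-based adjustment shown here is an assumption about how "adjusted for the expected frequency" is operationalized; the paper's exact adjustment and its critical values may differ.

```python
import numpy as np
from scipy.stats import norm

def adjusted_gaps(values):
    """Standardize the (already transformed) values, order them, and scale
    each gap between neighbours by the fitted normal density at the gap
    midpoint, so that ordinary sparseness in the tails is discounted.
    Returns the midpoint of the largest adjusted gap as the threshold
    estimate, together with that gap. Illustrative sketch only."""
    v = np.sort(np.asarray(values, dtype=float))
    z = (v - v.mean()) / v.std()        # standardize: mean 0, variance 1
    gaps = np.diff(z)                   # gaps between adjacent ordered values
    mids = (z[:-1] + z[1:]) / 2.0
    adj = gaps * norm.pdf(mids)         # adjust for the expected normal frequency
    k = int(np.argmax(adj))
    return mids[k], adj[k]

# Background population plus a small high-value (anomalous) population:
rng = np.random.default_rng(1)
vals = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(6.0, 0.5, 10)])
threshold, max_gap = adjusted_gaps(vals)
```

On data like this, the largest adjusted gap tends to fall between the background values and the high-value group, which is the separation the threshold is meant to capture.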
Identification of differentially expressed genes and false discovery rate in microarray studies.
Gusnanto, Arief; Calza, Stefano; Pawitan, Yudi
2007-04-01
To highlight the development in microarray data analysis for the identification of differentially expressed genes, particularly via control of false discovery rate. The emergence of high-throughput technology such as microarrays raises two fundamental statistical issues: multiplicity and sensitivity. We focus on the biological problem of identifying differentially expressed genes. First, multiplicity arises due to testing tens of thousands of hypotheses, rendering the standard P value meaningless. Second, known optimal single-test procedures such as the t-test perform poorly in the context of highly multiple tests. The standard approach of dealing with multiplicity is too conservative in the microarray context. The false discovery rate concept is fast becoming the key statistical assessment tool replacing the P value. We review the false discovery rate approach and argue that it is more sensible for microarray data. We also discuss some methods to take into account additional information from the microarrays to improve the false discovery rate. There is growing consensus on how to analyse microarray data using the false discovery rate framework in place of the classical P value. Further research is needed on the preprocessing of the raw data, such as the normalization step and filtering, and on finding the most sensitive test procedure.
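The false discovery rate framework discussed above is most often operationalized with the Benjamini-Hochberg step-up procedure, which takes only a few lines. This is the generic textbook procedure, not a method from the review itself.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up: reject the hypotheses with the k smallest
    p-values, where k is the largest i such that p_(i) <= (i/m) * q."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresh = q * (np.arange(1, m + 1) / m)   # per-rank BH bounds
    below = p[order] <= thresh
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])     # largest rank meeting its bound
        rejected[order[: k + 1]] = True
    return rejected

# Example: two strong signals among eight tests
rej = benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2, 0.5, 0.8, 0.9])
```

Unlike a Bonferroni-style family-wise correction, the per-rank bounds grow with i, which is what makes the procedure far less conservative for tens of thousands of genes.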
Accurate Computation of Survival Statistics in Genome-Wide Studies
Vandin, Fabio; Papoutsaki, Alexandra; Raphael, Benjamin J.; Upfal, Eli
2015-01-01
A key challenge in genomics is to identify genetic variants that distinguish patients with different survival time following diagnosis or treatment. While the log-rank test is widely used for this purpose, nearly all implementations of the log-rank test rely on an asymptotic approximation that is not appropriate in many genomics applications. This is because: the two populations determined by a genetic variant may have very different sizes; and the evaluation of many possible variants demands highly accurate computation of very small p-values. We demonstrate this problem for cancer genomics data where the standard log-rank test leads to many false positive associations between somatic mutations and survival time. We develop and analyze a novel algorithm, Exact Log-rank Test (ExaLT), that accurately computes the p-value of the log-rank statistic under an exact distribution that is appropriate for any size populations. We demonstrate the advantages of ExaLT on data from published cancer genomics studies, finding significant differences from the reported p-values. We analyze somatic mutations in six cancer types from The Cancer Genome Atlas (TCGA), finding mutations with known association to survival as well as several novel associations. In contrast, standard implementations of the log-rank test report dozens-hundreds of likely false positive associations as more significant than these known associations. PMID:25950620
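For contrast with ExaLT's exact computation, the standard asymptotic two-sample log-rank test criticized above can be written directly from its definition. This is a generic sketch of the conventional chi-square approximation, not the paper's implementation.

```python
import numpy as np
from scipy.stats import chi2

def logrank_test(time, event, group):
    """Asymptotic two-sample log-rank test: at each distinct event time,
    compare observed events in group 1 with the hypergeometric expectation,
    then refer the squared standardized sum to chi-square with 1 df."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=int)        # 0 or 1
    obs_minus_exp = 0.0
    var = 0.0
    for t in np.unique(time[event]):            # distinct event times
        at_risk = time >= t
        n = at_risk.sum()                       # total at risk
        n1 = (at_risk & (group == 1)).sum()     # at risk in group 1
        d = (event & (time == t)).sum()         # events at t (both groups)
        d1 = (event & (time == t) & (group == 1)).sum()
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:                               # hypergeometric variance term
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    stat = obs_minus_exp ** 2 / var
    return stat, chi2.sf(stat, df=1)            # asymptotic p-value

stat, p = logrank_test([1, 2, 3, 4, 5, 6], [1, 1, 1, 1, 1, 1], [0, 0, 0, 1, 1, 1])
```

It is exactly this chi-square reference distribution that becomes unreliable when the two groups are very unbalanced and the p-values of interest are very small.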
Tang, Liang; Feng, Shiqing; Gao, Ruixiao; Han, Chenfu; Sun, Xiaochen; Bao, Yucheng; Zhang, Wenlong
2017-12-01
The aim of the present study was to compare the efficacy of the commercial Xpert Mycobacterium tuberculosis/rifampin (MTB/RIF) test for evaluating different types of spinal tuberculosis (TB) tissue specimens. Pus, granulation tissue, and caseous necrotic tissue specimens from 223 patients who were diagnosed with spinal TB and who underwent curettage were collected for bacterial culture and the Xpert MTB/RIF assay to calculate the positive rate. Bacterial culture and phenotypic drug sensitivity testing (pDST) were adopted as the gold standards to calculate the sensitivity and specificity of the Xpert bacterial detection and drug resistance (DR) test. The positive rate (68.61% ± 7.35%) from the Xpert MTB/RIF assays of spinal TB patients' tissue specimens was higher compared with bacterial culture (44.39% ± 6.51%, Z = 5.1642, p < 0.01), and the positive rates from Xpert MTB/RIF assays on the three types of specimens were all higher than those of bacterial culture, with statistically significant results for pus and granulation tissue specimens. The positive rates for pus using the two bacteriological tests were higher than those for granulation tissue but were not statistically significant. However, the positive rates obtained from granulation tissue were statistically significantly higher than those obtained from caseous necrotic tissue. With bacterial culture and pDST as the gold standards, the sensitivity of Xpert MTB/RIF assays for MTB was 96.97%, while the sensitivity and specificity of the DR test also remained relatively high. For efficient and accurate diagnosis of spinal TB and DR and timely provision of effective treatment, multiple specimens, especially the pus of spinal TB patients, should be collected for Xpert MTB/RIF assays.
Wiuf, Carsten; Schaumburg-Müller Pallesen, Jonatan; Foldager, Leslie; Grove, Jakob
2016-08-01
In many areas of science it is customary to perform many, potentially millions, of tests simultaneously. To gain statistical power it is common to group tests based on a priori criteria such as predefined regions or sliding windows. However, it is not straightforward to choose grouping criteria and the results might depend on the chosen criteria. Methods that summarize, or aggregate, test statistics or p-values, without relying on a priori criteria, are therefore desirable. We present a simple method to aggregate a sequence of stochastic variables, such as test statistics or p-values, into fewer variables without assuming a priori defined groups. We provide different ways to evaluate the significance of the aggregated variables based on theoretical considerations and resampling techniques, and show that under certain assumptions the family-wise error rate (FWER) is controlled in the strong sense. Validity of the method was demonstrated using simulations and real data analyses. Our method may be a useful supplement to standard procedures relying on evaluation of test statistics individually. Moreover, by being agnostic and not relying on predefined selected regions, it might be a practical alternative to conventionally used methods of aggregation of p-values over regions. The method is implemented in Python and freely available online (through GitHub, see the Supplementary information).
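A classical example of the a-priori-group aggregation that this agnostic method is positioned against is Fisher's method for combining p-values over a predefined group of tests; it is shown here only as the conventional baseline, not as the paper's method.

```python
import numpy as np
from scipy.stats import chi2

def fisher_combine(pvals):
    """Fisher's method: under the global null with independent tests,
    X = -2 * sum(log p_i) is chi-square distributed with 2m degrees of
    freedom, giving one combined p-value for the whole group."""
    p = np.asarray(pvals, dtype=float)
    stat = -2.0 * np.log(p).sum()
    return stat, chi2.sf(stat, df=2 * p.size)

stat, p = fisher_combine([0.01, 0.02, 0.03])     # several small p-values combine strongly
stat2, p2 = fisher_combine([0.5, 0.6, 0.7])      # unremarkable p-values do not
```

The catch, as the abstract notes, is that the group of tests must be chosen in advance (a region, a sliding window), and the conclusion can depend on that choice.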
The Effects of Using Space to Teach Standard Elementary School Curriculum
NASA Technical Reports Server (NTRS)
Ewell, Robert N.
1996-01-01
This brief report and recommendation for further research brings to a formal close this effort, the original purpose of which is described in detail in The effects of using space to teach standard elementary school curriculum, Volume 1, included here as the Appendix. Volume 1 describes the project as a 3-year research program to determine the effectiveness of using space to teach. The research design is quasi experimental using standardized test data on students from Aldrin Elementary School and a District-identified 'control' school, which shall be referred to as 'School B.' Students now in fourth through sixth grades will be compared now (after one year at Aldrin) and tracked at least until the present sixth graders are through the eighth grade. Appropriate statistical tests will be applied to standardized test scores to see if Aldrin students are 'better' than School B students in areas such as: Overall academic performance; Performance in math/science; and Enrollments in math/science in middle school.
Robust regression for large-scale neuroimaging studies.
Fritsch, Virgile; Da Mota, Benoit; Loth, Eva; Varoquaux, Gaël; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Brühl, Rüdiger; Butzek, Brigitte; Conrod, Patricia; Flor, Herta; Garavan, Hugh; Lemaitre, Hervé; Mann, Karl; Nees, Frauke; Paus, Tomas; Schad, Daniel J; Schümann, Gunter; Frouin, Vincent; Poline, Jean-Baptiste; Thirion, Bertrand
2015-05-01
Multi-subject datasets used in neuroimaging group studies have a complex structure, as they exhibit non-stationary statistical properties across regions and display various artifacts. While studies with small sample sizes can rarely be shown to deviate from standard hypotheses (such as the normality of the residuals) due to the poor sensitivity of normality tests with low degrees of freedom, large-scale studies (e.g. >100 subjects) exhibit more obvious deviations from these hypotheses and call for more refined models for statistical inference. Here, we demonstrate the benefits of robust regression as a tool for analyzing large neuroimaging cohorts. First, we use an analytic test based on robust parameter estimates; based on simulations, this procedure is shown to provide an accurate statistical control without resorting to permutations. Second, we show that robust regression yields more detections than standard algorithms using as an example an imaging genetics study with 392 subjects. Third, we show that robust regression can avoid false positives in a large-scale analysis of brain-behavior relationships with over 1500 subjects. Finally we embed robust regression in the Randomized Parcellation Based Inference (RPBI) method and demonstrate that this combination further improves the sensitivity of tests carried out across the whole brain. Altogether, our results show that robust procedures provide important advantages in large-scale neuroimaging group studies. Copyright © 2015 Elsevier Inc. All rights reserved.
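The robust-regression idea can be illustrated with a minimal Huber-weighted iteratively-reweighted-least-squares fit. This is a generic sketch (the tuning constant, MAD scale estimate, and iteration count are conventional choices), not the study's pipeline, which additionally calibrates its tests analytically and by permutation.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50):
    """Robust linear fit: repeatedly solve a weighted least-squares problem,
    downweighting observations whose residuals are large relative to a
    robust (MAD-based) scale estimate."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]    # start from OLS
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r)) / 0.6745 or 1.0   # robust scale; guard against 0
        u = np.abs(r / s)
        w = np.where(u <= c, 1.0, c / u)           # Huber weights: outliers shrink
        Xw = X * w[:, None]
        beta = np.linalg.solve(X.T @ Xw, Xw.T @ y) # weighted normal equations
    return beta

# A single gross outlier barely moves the robust slope:
x = np.arange(20.0)
X = np.column_stack([np.ones(20), x])
y = 1.0 + 2.0 * x
y[0] += 100.0                                      # artifact-like outlier
beta = huber_irls(X, y)
```

Because single-subject artifacts enter the group model as exactly this kind of gross outlier, downweighting them is what yields the extra detections and fewer false positives reported above.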
Jung, Bo Kyeung; Kim, Jeeyong; Cho, Chi Hyun; Kim, Ju Yeon; Nam, Myung Hyun; Shin, Bong Kyung; Rho, Eun Youn; Kim, Sollip; Sung, Heungsup; Kim, Shinyoung; Ki, Chang Seok; Park, Min Jung; Lee, Kap No; Yoon, Soo Young
2017-04-01
The National Health Information Standards Committee was established in 2004 in Korea. The practical subcommittee for laboratory test terminology was placed in charge of standardizing laboratory medicine terminology in Korean. We aimed to establish a standardized Korean laboratory terminology database, Korea-Logical Observation Identifier Names and Codes (K-LOINC) based on former products sponsored by this committee. The primary product was revised based on the opinions of specialists. Next, we mapped the electronic data interchange (EDI) codes that were revised in 2014, to the corresponding K-LOINC. We established a database of synonyms, including the laboratory codes of three reference laboratories and four tertiary hospitals in Korea. Furthermore, we supplemented the clinical microbiology section of K-LOINC using an alternative mapping strategy. We investigated other systems that utilize laboratory codes in order to investigate the compatibility of K-LOINC with statistical standards for a number of tests. A total of 48,990 laboratory codes were adopted (21,539 new and 16,330 revised). All of the LOINC synonyms were translated into Korean, and 39,347 Korean synonyms were added. Moreover, 21,773 synonyms were added from reference laboratories and tertiary hospitals. Alternative strategies were established for mapping within the microbiology domain. When we applied these to a smaller hospital, the mapping rate was successfully increased. Finally, we confirmed K-LOINC compatibility with other statistical standards, including a newly proposed EDI code system. This project successfully established an up-to-date standardized Korean laboratory terminology database, as well as an updated EDI mapping to facilitate the introduction of standard terminology into institutions. © 2017 The Korean Academy of Medical Sciences.
NASA Astrophysics Data System (ADS)
Leonardi, Marcelo
The primary purpose of this study was to examine the impact of a scheduling change from a trimester 4x4 block schedule to a modified hybrid schedule on student achievement in ninth grade biology courses. This study examined the impact of the scheduling change on student achievement through teacher-created benchmark assessments in Genetics, DNA, and Evolution and on the California Standardized Test in Biology. The secondary purpose of this study was to examine ninth grade biology teachers' perceptions of ninth grade biology student achievement. Using a mixed methods research approach, data were collected both quantitatively and qualitatively, as aligned to the research questions. Quantitative methods included gathering data from departmental benchmark exams and the California Standardized Test in Biology and conducting multivariate analyses of covariance and analyses of covariance to determine significant differences. Qualitative methods included journal entry questions and focus group interviews. The results revealed a statistically significant increase in scores on both the DNA and Evolution benchmark exams following the change in scheduling format. The scheduling change was responsible for 1.5% of the increase in DNA benchmark scores and 2% of the increase in Evolution benchmark scores. The results revealed a statistically significant decrease in scores on the Genetics benchmark exam as a result of the scheduling change. The scheduling change was responsible for 1% of the decrease in Genetics benchmark scores. The results also revealed a statistically significant increase in scores on the CST Biology exam. The scheduling change was responsible for 0.7% of the increase in CST Biology scores. Results of the focus group discussions indicated that all teachers preferred the modified hybrid schedule over the trimester schedule and felt that it improved student achievement.
The European Standard Series and its additions: are they of any use in 2013?
Castelain, Michel; Assier, Haudrey; Baeck, Marie; Bara, Corina; Barbaud, Annick; Castelain, Florence; Felix, Brigitte; Ferrie Le Bouedec, Marie Christine; Frick, Christian; Girardin, Pascal; Jacobs, Marie Claude; Jelen, Gilbert; Lartigaud, Isabelle; Raison-Peyron, Nadia; Tennstedt, Dominique; Tetard, Florence; Vigan, Martine; Waton, Julie
2014-01-01
This study had two purposes: (i) to determine whether the European standard series is still the key reference for contact dermatitis, i.e., whether its components are still the most frequently involved allergens in contact dermatitis nowadays; and (ii) to assess the results of the European standard series among French and Belgian dermatologists/allergists, as, so far, most of them have failed to provide statistical data within the European community of allergists/dermatologists. Eighteen participants from 2 dermatology and allergy centres in Belgium and 11 centres in France collected their results from 3,073 patients tested in 2011. They assessed the relevance of some tests, as well as that of the standard series and additional series, to establish an etiological diagnosis of contact dermatitis. These results, together with the history of the European standard series, have shown that some allergens are obsolete and that others should be included in a new standard series, for which we make a few suggestions.
Gaibazzi, Nicola; Petrucci, Nicola; Ziacchi, Vigilio
2004-03-01
Previous work showed a strong inverse association between 1-min heart rate recovery (HRR) after exercising on a treadmill and all-cause mortality. The aim of this study was to determine whether the results could be replicated in a wide population of real-world exercise ECG candidates in our center, using a standard bicycle exercise test. Between 1991 and 1997, 1420 consecutive patients underwent ECG exercise testing performed according to our standard cycloergometer protocol. Three pre-specified cut-point values of 1-min HRR, derived from previous studies in the medical literature, were tested to see whether they could identify a higher-risk group for all-cause mortality; furthermore, we tested the possible association between 1-min HRR as a continuous variable and mortality using logistic regression. Both methods showed a lack of a statistically significant association between 1-min HRR and all-cause mortality. A weak trend toward an inverse association, although not statistically significant, could not be excluded. We could not validate the clear-cut results from some previous studies performed using the treadmill exercise test. The results in our study may only "not exclude" a mild inverse association between 1-min HRR measured after cycloergometer exercise testing and all-cause mortality. The 1-min HRR measured after cycloergometer exercise testing was not clinically useful as a prognostic marker.
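The cut-point analysis described above reduces to a 2x2 table (low vs. normal 1-min HRR against died vs. survived). As an illustrative sketch of such a test, here is the Pearson chi-square statistic for a 2x2 table; the function and the sanity-check tables are ours, not the study's data:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction) for the
    2x2 table [[a, b], [c, d]]:  n * (ad - bc)^2 / ((a+b)(c+d)(a+c)(b+d))."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    if den == 0:
        raise ValueError("table has an empty margin")
    return n * (a * d - b * c) ** 2 / den

# Sanity checks: no association gives 0; a perfectly diagonal table gives n.
print(chi_square_2x2(10, 10, 10, 10))  # 0.0
print(chi_square_2x2(20, 0, 0, 20))    # 40.0
```

The statistic is referred to the chi-square distribution with 1 degree of freedom; values above 3.841 are significant at the 5% level.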
The effect of using graphic organizers in the teaching of standard biology
NASA Astrophysics Data System (ADS)
Pepper, Wade Louis, Jr.
This study was conducted to determine if the use of graphic organizers in the teaching of standard biology would increase student achievement, involvement and quality of activities. The subjects were 10th grade standard biology students in a large southern inner city high school. The study was conducted over a six-week period in an instructional setting using action research as the investigative format. After calculation of the homogeneity between classes, random selection was used to determine the graphic organizer class and the control class. The graphic organizer class was taught unit material through a variety of instructional methods along with the use of teacher generated graphic organizers. The control class was taught the same unit material using the same instructional methods, but without the use of graphic organizers. Data for the study were gathered from in-class written assignments, teacher-generated tests and text-generated tests, and rubric scores of an out-of-class written assignment and project. Also, data were gathered from student reactions, comments, observations and a teacher's research journal. Results were analyzed using descriptive statistics and qualitative interpretation. By comparing statistical results, it was determined that the use of graphic organizers did not make a statistically significant difference in the understanding of biological concepts and retention of factual information. Furthermore, the use of graphic organizers did not make a significant difference in motivating students to fulfill all class assignments with quality efforts and products. However, based upon student reactions and comments along with observations by the researcher, graphic organizers were viewed by the students as a favorable and helpful instructional tool. Notwithstanding the statistical results, student gains from instructional activities using graphic organizers were positive and merit the continuation of their use as an instructional tool.
Statistics for Radiology Research.
Obuchowski, Nancy A; Subhas, Naveen; Polster, Joshua
2017-02-01
Biostatistics is an essential component in most original research studies in imaging. In this article we discuss five key statistical concepts for study design and analyses in modern imaging research: statistical hypothesis testing, particularly focusing on noninferiority studies; imaging outcomes, especially when there is no reference standard; dealing with the multiplicity problem without spending all your study power; relevance of confidence intervals in reporting and interpreting study results; and finally tools for assessing quantitative imaging biomarkers. These concepts are presented first as examples of conversations between investigator and biostatistician, and then more detailed discussions of the statistical concepts follow. Three skeletal radiology examples are used to illustrate the concepts.
Insights from analysis for harmful and potentially harmful constituents (HPHCs) in tobacco products.
Oldham, Michael J; DeSoi, Darren J; Rimmer, Lonnie T; Wagner, Karl A; Morton, Michael J
2014-10-01
A total of 20 commercial cigarette and 16 commercial smokeless tobacco products were assayed for 96 compounds listed as harmful and potentially harmful constituents (HPHCs) by the US Food and Drug Administration. For each product, a single lot was used for all testing. Both International Organization for Standardization and Health Canada smoking regimens were used for cigarette testing. For those HPHCs detected, measured levels were consistent with levels reported in the literature; however, substantial assay variability (measured as average relative standard deviation) was found for most results. Using an abbreviated list of HPHCs, statistically significant differences for most of these HPHCs occurred when results were obtained 4-6 months apart (i.e., temporal variability). The assay variability and temporal variability demonstrate the need for standardized analytical methods with defined repeatability and reproducibility for each HPHC using certified reference standards. Temporal variability also means that simple conventional comparisons, such as two-sample t-tests, are inappropriate for comparing products tested at different points in time from the same laboratory or from different laboratories. Until capable laboratories use standardized assays with established repeatability, reproducibility, and certified reference standards, the resulting HPHC data will be unreliable for product comparisons or other decision making in regulatory science. Copyright © 2014 Elsevier Inc. All rights reserved.
40 CFR 1065.1005 - Symbols, abbreviations, acronyms, and units of measure.
Code of Federal Regulations, 2012 CFR
2012-07-01
… least squares regression; β: ratio of diameters, meter per meter (m/m); β: atomic oxygen to carbon ratio, mole per mole; … fuel consumption, gram per kilowatt-hour (g/(kW·hr)); F: F-test statistic; f: frequency, hertz (Hz, i.e. s−1); … standard deviation; S: Sutherland constant, kelvin (K); SEE: standard estimate of error; T: absolute temperature …
40 CFR 1065.1005 - Symbols, abbreviations, acronyms, and units of measure.
Code of Federal Regulations, 2013 CFR
2013-07-01
… least squares regression; β: ratio of diameters, meter per meter (m/m); β: atomic oxygen to carbon ratio, mole per mole; … fuel consumption, gram per kilowatt-hour (g/(kW·hr)); F: F-test statistic; f: frequency, hertz (Hz, i.e. s−1); … standard deviation; S: Sutherland constant, kelvin (K); SEE: standard estimate of error; T: absolute temperature …
Academic Outcome Measures of a Dedicated Education Unit Over Time: Help or Hinder?
Smyer, Tish; Gatlin, Tricia; Tan, Rhigel; Tejada, Marianne; Feng, Du
2015-01-01
Critical thinking, nursing process, quality and safety measures, and standardized RN exit examination scores were compared between students (n = 144) placed in a dedicated education unit (DEU) and those in a traditional clinical model. Standardized test scores showed that differences between the clinical groups were not statistically significant. This study shows that the DEU model is one approach to clinical education that can enhance students' academic outcomes.
Chen, Xiang-Wu; Zhao, Ying-Xi
2017-01-01
AIM To compare the diagnostic performance of isolated-check visual evoked potential (icVEP) and standard automated perimetry (SAP), in order to evaluate the value of icVEP in the detection of early glaucoma. METHODS In total, 144 subjects (288 eyes) were enrolled in this study. icVEP testing was performed with the Neucodia visual electrophysiological diagnostic system. A 15% positive-contrast (bright) condition pattern was used in this device to differentiate between glaucoma patients and healthy control subjects. Signal-to-noise ratios (SNR) were derived based on a multivariate statistic, and an eye was judged abnormal if the test yielded an SNR≤1. SAP testing was performed with the Humphrey Field Analyzer II. A visual field was deemed abnormal if the glaucoma hemifield test result was outside normal limits; or if the pattern standard deviation had P<0.05; or if a cluster of three or more non-edge points on the pattern deviation plot in a single hemifield had P<0.05, one of which had P<0.01. Disc photographs were graded as either glaucomatous optic neuropathy or normal by two experts who were masked to all other patient information. Moorfields regression analysis (MRA), used as a separate diagnostic classification, was performed by Heidelberg retina tomograph (HRT). RESULTS When the disc photograph grader was used as the diagnostic standard, the sensitivity for SAP and icVEP was 32.3% and 38.5%, respectively, and the specificity was 82.3% and 77.8%, respectively. When the MRA classifier was used as the diagnostic standard, the sensitivity for SAP and icVEP was 48.6% and 51.4%, respectively, and the specificity was 84.1% and 78.0%, respectively. When the combined structural assessment was used as the diagnostic standard, the sensitivity for SAP and icVEP was 59.2% and 53.1%, respectively, and the specificity was 84.2% and 84.6%, respectively. There was no statistically significant difference between SAP and icVEP in either sensitivity or specificity, regardless of which diagnostic standard was used. CONCLUSION The diagnostic performance of icVEP is not better than that of SAP in the detection of early glaucoma. PMID:28503434
Higher certainty of the laser-induced damage threshold test with a redistributing data treatment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jensen, Lars; Mrohs, Marius; Gyamfi, Mark
2015-10-15
As a consequence of its statistical nature, the measurement of the laser-induced damage threshold always carries a risk of over- or underestimating the real threshold value. For the established S-on-1 (and 1-on-1) measurement procedures outlined in the corresponding ISO standard 21254, the results depend on the number of data points and their distribution over the fluence scale. With the limited space on a test sample, as well as the requirements on test site separation and beam sizes, the amount of data from one test is restricted. This paper reports on a way to treat damage test data in order to reduce the statistical error, and therefore the measurement uncertainty. Three simple assumptions allow for the assignment of one data point to multiple data bins, thereby virtually increasing the available data base.
Furlan, Leonardo; Sterr, Annette
2018-01-01
Motor learning studies face the challenge of differentiating between real changes in performance and random measurement error. While the traditional p-value-based analyses of difference (e.g., t-tests, ANOVAs) provide information on the statistical significance of a reported change in performance scores, they do not inform as to the likely cause or origin of that change, that is, the contribution of both real modifications in performance and random measurement error to the reported change. One way of differentiating between real change and random measurement error is through the utilization of the statistics of standard error of measurement (SEM) and minimal detectable change (MDC). SEM is estimated from the standard deviation of a sample of scores at baseline and a test-retest reliability index of the measurement instrument or test employed. MDC, in turn, is estimated from SEM and a degree of confidence, usually 95%. The MDC value might be regarded as the minimum amount of change that needs to be observed for it to be considered a real change, that is, a change to which the contribution of real modifications in performance is likely to be greater than that of random measurement error. A computer-based motor task was designed to illustrate the applicability of SEM and MDC to motor learning research. Two studies were conducted with healthy participants. Study 1 assessed the test-retest reliability of the task and Study 2 consisted of a typical motor learning study, in which participants practiced the task for five consecutive days. In Study 2, the data were analyzed with a traditional p-value-based analysis of difference (ANOVA) and also with SEM and MDC.
The findings showed good test-retest reliability for the task, and that the p-value-based analysis alone identified statistically significant improvements in performance over time even when the observed changes could in fact have been smaller than the MDC, and thereby caused mostly by random measurement error rather than by learning. We suggest, therefore, that motor learning studies complement their p-value-based analyses of difference with statistics such as SEM and MDC in order to inform as to the likely cause or origin of any reported changes in performance.
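The two statistics named above have simple closed forms: SEM = SD_baseline x sqrt(1 - reliability), and the MDC at 95% confidence is 1.96 x sqrt(2) x SEM. A minimal sketch (the SD and ICC values in the example are illustrative, not the studies' data):

```python
import math

def sem(sd_baseline, reliability):
    """Standard error of measurement: baseline SD times sqrt(1 - test-retest reliability)."""
    return sd_baseline * math.sqrt(1.0 - reliability)

def mdc95(sem_value):
    """Minimal detectable change at 95% confidence: 1.96 * sqrt(2) * SEM."""
    return 1.96 * math.sqrt(2.0) * sem_value

s = sem(10.0, 0.91)        # illustrative values: baseline SD = 10, ICC = 0.91
print(round(s, 2))         # 3.0
print(round(mdc95(s), 2))  # 8.32
```

A reported improvement smaller than the MDC cannot be confidently distinguished from measurement error, whatever its p-value.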
ERIC Educational Resources Information Center
Juan, Wu Xiao; Abidin, Mohamad Jafre Zainol; Eng, Lin Siew
2013-01-01
This survey studies the relationship between English vocabulary threshold and the word guessing strategy used in reading comprehension learning among 80 pre-university Chinese students in Malaysia. The t-test is the main statistical test for this research, and the collected data are analysed using SPSS. From the standard deviation test…
JAN transistor and diode characterization test program, JANTX diode 1N5623
NASA Technical Reports Server (NTRS)
Takeda, H.
1977-01-01
A statistical summary of the electrical characterization of diodes and transistors is presented. Each parameter is presented with test conditions, mean, standard deviation, lowest reading, 10% point (where 10% of all readings are equal to or less than the indicated reading), 90% point (where 90% of all readings are equal to or less than the indicated reading), and the highest reading.
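A per-parameter summary of this kind is straightforward to reproduce with standard library tools; the percentage-point helper below follows the parenthetical definitions in the abstract, with the exact rounding rule for small samples being our assumption:

```python
import math
import statistics

def characterize(readings):
    """Summary in the style of the report: the k% point is the reading such that
    k% of all readings are equal to or less than it."""
    xs = sorted(readings)
    n = len(xs)
    def pct_point(k):
        # smallest reading with at least k% of readings at or below it
        return xs[max(1, math.ceil(k / 100 * n)) - 1]
    return {
        "mean": statistics.mean(xs),
        "std_dev": statistics.stdev(xs),
        "lowest": xs[0],
        "10% point": pct_point(10),
        "90% point": pct_point(90),
        "highest": xs[-1],
    }

print(characterize(range(1, 11)))
```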
ERIC Educational Resources Information Center
Paek, Insu
2010-01-01
Conservative bias in rejection of a null hypothesis from using the continuity correction in the Mantel-Haenszel (MH) procedure was examined through simulation in a differential item functioning (DIF) investigation context, in which statistical testing uses a prespecified level α for the decision on an item with respect to DIF. The standard MH…
Validation of Physics Standardized Test Items
NASA Astrophysics Data System (ADS)
Marshall, Jill
2008-10-01
The Texas Physics Assessment Team (TPAT) examined the Texas Assessment of Knowledge and Skills (TAKS) to determine whether it is a valid indicator of physics preparation for future course work and employment, and of the knowledge and skills needed to act as an informed citizen in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000-student sample of item-level results from the 2004 11th grade exam using standard statistical methods employed by test developers (factor analysis and Item Response Theory). Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation, and we make recommendations for increasing the validity of standardized physics testing.
NASA Astrophysics Data System (ADS)
Clifford, Betsey A.
The Massachusetts Department of Elementary and Secondary Education (DESE) released proposed Science and Technology/Engineering standards in 2013 outlining the concepts that should be taught at each grade level. Previously, the standards were organized in grade spans, and each district determined the method of implementation. There are two different methods used to teach middle school science: integrated and discipline-based. In the proposed standards, the Massachusetts DESE uses grade-by-grade standards with an integrated approach. It was not known whether there is a statistically significant difference in student achievement on the 8th grade science MCAS assessment between students taught with an integrated approach and those taught with a discipline-based approach. The results on the 8th grade science MCAS test from six public school districts from 2010-2013 were collected and analyzed. The methodology used was quantitative. Results of an ANOVA showed that there was no statistically significant difference in overall student achievement between the two curriculum models. Furthermore, there was no statistically significant difference for the various domains: Earth and Space Science, Life Science, Physical Science, and Technology/Engineering. This information is useful for districts hesitant to make the change from a discipline-based approach to an integrated approach. More research should be conducted on this topic with a larger sample size to better support the results.
Can Scientifically Useful Hypotheses Be Tested with Correlations?
ERIC Educational Resources Information Center
Bentler, Peter M.
2007-01-01
Historically, interesting psychological theories have been phrased in terms of correlation coefficients, which are standardized covariances, and various statistics derived from them. Methodological practice over the last 40 years, however, has suggested it is necessary to transform such theories into hypotheses on covariances and statistics…
Corron, Louise; Marchal, François; Condemi, Silvana; Chaumoître, Kathia; Adalian, Pascal
2017-01-01
Juvenile age estimation methods used in forensic anthropology generally lack methodological consistency and/or statistical validity. Considering this, a standardized approach using nonparametric Multivariate Adaptive Regression Splines (MARS) models was tested to predict age from iliac biometric variables of male and female juveniles from Marseilles, France, aged 0-12 years. Models using unidimensional (length and width) and bidimensional iliac data (module and surface) were constructed on a training sample of 176 individuals and validated on an independent test sample of 68 individuals. Results show that MARS prediction models using iliac width, module and area give overall better and statistically valid age estimates. These models integrate punctual nonlinearities of the relationship between age and osteometric variables. By constructing valid prediction intervals whose size increases with age, MARS models take into account the normal increase in individual variability. MARS models can thus qualify as a practical and standardized approach for juvenile age estimation. © 2016 American Academy of Forensic Sciences.
Feiler, Ute; Ratte, Monika; Arts, Gertie; Bazin, Christine; Brauer, Frank; Casado, Carmen; Dören, Laszlo; Eklund, Britta; Gilberg, Daniel; Grote, Matthias; Gonsior, Guido; Hafner, Christoph; Kopf, Willi; Lemnitzer, Bernd; Liedtke, Anja; Matthias, Uwe; Okos, Ewa; Pandard, Pascal; Scheerbaum, Dirk; Schmitt-Jansen, Mechthild; Stewart, Kathleen; Teodorovic, Ivana; Wenzel, Andrea; Pluta, Hans-Jürgen
2014-03-01
A whole-sediment toxicity test with Myriophyllum aquaticum has been developed by the German Federal Institute of Hydrology and standardized within the International Organization for Standardization (ISO; ISO 16191). An international ring test was performed to evaluate the precision of the test method. Four sediments (artificial, natural) were tested. Test duration was 10 d, and the test endpoint was inhibition of growth rate (r) based on fresh weight data. Eighteen of 21 laboratories met the validity criterion of r ≥ 0.09 d(-1) in the control. Results from 4 tests that did not conform to test-performance criteria were excluded from statistical evaluation. The inter-laboratory variability of growth rates (20.6%-25.0%) and inhibition (26.6%-39.9%) was comparable with the variability of other standardized bioassays. The mean test-internal variability of the controls was low (7% [control], 9.7% [solvent control]), yielding a high discriminatory power of the given test design (median minimum detectable difference [MDD] 13% to 15%). To ensure these MDDs, an additional validity criterion of CV ≤ 15% of the growth rate in the controls was recommended. As a positive control, 90 mg 3,5-dichlorophenol/kg sediment dry mass was tested. The range of the expected growth inhibition was proposed to be 35 ± 15%. The ring test results demonstrated the reliability of the ISO 16191 toxicity test and its suitability as a tool to assess the toxicity of sediment and dredged material. © 2013 SETAC.
Sequi, Marco; Campi, Rita; Clavenna, Antonio; Bonati, Maurizio
2013-03-01
To evaluate the quality of data reporting and statistical methods performed in drug utilization studies in the pediatric population. Drug utilization studies evaluating all drug prescriptions to children and adolescents published between January 1994 and December 2011 were retrieved and analyzed. For each study, information on measures of exposure/consumption, the covariates considered, descriptive and inferential analyses, statistical tests, and methods of data reporting was extracted. An overall quality score was created for each study using a 12-item checklist that took into account the presence of outcome measures, covariates of measures, descriptive measures, statistical tests, and graphical representation. A total of 22 studies were reviewed and analyzed. Of these, 20 studies reported at least one descriptive measure. The mean was the most commonly used measure (18 studies), but only five of these also reported the standard deviation. Statistical analyses were performed in 12 studies, with the chi-square test being the most commonly performed test. Graphs were presented in 14 papers. Sixteen papers reported the number of drug prescriptions and/or packages, and ten reported the prevalence of the drug prescription. The mean quality score was 8 (median 9). Only seven of the 22 studies received a score of ≥10, while four studies received a score of <6. Our findings document that only a few of the studies reviewed applied statistical methods and reported data in a satisfactory manner. We therefore conclude that the methodology of drug utilization studies needs to be improved.
Vlieg-Boerstra, Berber J; Bijleveld, Charles M A; van der Heide, Sicco; Beusekamp, Berta J; Wolt-Plompen, Saskia A A; Kukler, Jeanet; Brinkman, Joep; Duiverman, Eric J; Dubois, Anthony E J
2004-02-01
The use of double-blind, placebo-controlled food challenges (DBPCFCs) is considered the gold standard for the diagnosis of food allergy. Despite this, materials and methods used in DBPCFCs have not been standardized. The purpose of this study was to develop and validate recipes for use in DBPCFCs in children by using allergenic foods, preferably in their usual edible form. Recipes containing milk, soy, cooked egg, raw whole egg, peanut, hazelnut, and wheat were developed. For each food, placebo and active test food recipes were developed that met the requirements of acceptable taste, allowance of a challenge dose high enough to elicit reactions in an acceptable volume, optimal matrix ingredients, and good matching of sensory properties of placebo and active test food recipes. Validation was conducted on the basis of sensory tests for difference by using the triangle test and the paired comparison test. Recipes were first tested by volunteers from the hospital staff and subsequently by a professional panel of food tasters in a food laboratory designed for sensory testing. Recipes were considered to be validated if no statistically significant differences were found. Twenty-seven recipes were developed and found to be valid by the volunteer panel. Of these 27 recipes, 17 could be validated by the professional panel. Sensory testing with appropriate statistical analysis allows for objective validation of challenge materials. We recommend the use of professional tasters in the setting of a food laboratory for best results.
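For the triangle test used in sensory validation, a taster with no ability to discriminate picks the odd sample correctly with probability 1/3, so a panel result can be assessed with a one-sided binomial tail. A sketch under that standard assumption (the example counts are illustrative, not the study's data):

```python
from math import comb

def triangle_test_p(n_correct, n_trials):
    """One-sided binomial p-value: probability of at least n_correct correct
    identifications in n_trials triangle tests under the chance rate of 1/3."""
    p = 1 / 3
    return sum(comb(n_trials, k) * p**k * (1 - p)**(n_trials - k)
               for k in range(n_correct, n_trials + 1))

# Illustrative panel: 9 of 18 tasters pick the odd sample.
print(round(triangle_test_p(9, 18), 3))  # 0.108: not significantly above chance
```

Under this criterion a recipe pair would be considered well matched (validated) when the number of correct identifications is not significantly above the chance rate.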
Bayesian modelling of lung function data from multiple-breath washout tests.
Mahar, Robert K; Carlin, John B; Ranganathan, Sarath; Ponsonby, Anne-Louise; Vuillermin, Peter; Vukcevic, Damjan
2018-05-30
Paediatric respiratory researchers have widely adopted the multiple-breath washout (MBW) test because it allows assessment of lung function in unsedated infants and is well suited to longitudinal studies of lung development and disease. However, a substantial proportion of MBW tests in infants fail current acceptability criteria. We hypothesised that a model-based approach to analysing the data, in place of traditional simple empirical summaries, would enable more efficient use of these tests. We therefore developed a novel statistical model for infant MBW data and applied it to 1197 tests from 432 individuals from a large birth cohort study. We focus on Bayesian estimation of the lung clearance index, the most commonly used summary of lung function from MBW tests. Our results show that the model provides an excellent fit to the data and shed further light on statistical properties of the standard empirical approach. Furthermore, the modelling approach enables the lung clearance index to be estimated by using tests with different degrees of completeness, something not possible with the standard approach. Our model therefore allows previously unused data to be used rather than discarded, as well as routine use of shorter tests without significant loss of precision. Beyond our specific application, our work illustrates a number of important aspects of Bayesian modelling in practice, such as the importance of hierarchical specifications to account for repeated measurements and the value of model checking via posterior predictive distributions. Copyright © 2018 John Wiley & Sons, Ltd.
Revised standards for statistical evidence.
Johnson, Valen E
2013-11-26
Recent advances in Bayesian hypothesis testing have led to the development of uniformly most powerful Bayesian tests, which represent an objective, default class of Bayesian hypothesis tests that have the same rejection regions as classical significance tests. Based on the correspondence between these two classes of tests, it is possible to equate the size of classical hypothesis tests with evidence thresholds in Bayesian tests, and to equate P values with Bayes factors. An examination of these connections suggests that recent concerns over the lack of reproducibility of scientific studies can be attributed largely to the conduct of significance tests at unjustifiably high levels of significance. To correct this problem, evidence thresholds required for the declaration of a significant finding should be increased to 25-50:1, and to 100-200:1 for the declaration of a highly significant finding. In terms of classical hypothesis tests, these evidence standards mandate the conduct of tests at the 0.005 or 0.001 level of significance.
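For a one-sided z-test, the correspondence described above takes a closed form: the uniformly most powerful Bayesian test with evidence threshold gamma has the same rejection region as the classical test of size alpha when gamma = exp(z_alpha^2 / 2). A sketch reproducing the abstract's thresholds:

```python
import math
from statistics import NormalDist

def evidence_threshold(alpha):
    """Bayes-factor evidence threshold gamma for which the uniformly most
    powerful Bayesian test matches a classical one-sided z-test of size alpha:
    gamma = exp(z_alpha**2 / 2)."""
    z = NormalDist().inv_cdf(1 - alpha)
    return math.exp(z * z / 2)

for alpha in (0.05, 0.005, 0.001):
    print(alpha, round(evidence_threshold(alpha), 1))
# 0.05 -> 3.9:1, 0.005 -> 27.6:1, 0.001 -> 118.5:1
```

The computed values show why 0.005 and 0.001 map onto the 25-50:1 and 100-200:1 evidence standards, while the conventional 0.05 corresponds to only about 4:1 evidence.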
Aminopenicillin-associated exanthem: lymphocyte transformation testing revisited.
Trautmann, A; Seitz, C S; Stoevesandt, J; Kerstan, A
2014-12-01
The lymphocyte transformation test (LTT) has been promoted as an in-vitro test for the diagnosis of drug hypersensitivity. For determination of statistical LTT sensitivity, series of patients with clinically uniform reactions followed by complete drug hypersensitivity work-up are mandatory. Assessment of LTT specificity requires control patients who tolerated exposure to the drug studied. The aim was to prospectively determine the diagnostic value of the LTT in a clinically and diagnostically well-defined series of patients. Patients with exanthematous skin eruptions after ampicillin (AMP) intake were included in this study. After exclusion or confirmation of delayed-onset allergic AMP hypersensitivity by skin and provocation testing, two independent LTTs were performed: one standard LTT and a modified LTT with additional anti-CD3/anti-CD28 monoclonal antibody stimulation. By testing, delayed-onset allergic AMP hypersensitivity was diagnosed in 11 patients and definitely ruled out in 26. The standard LTT reached a diagnostic sensitivity of 54.5%, while the modified LTT yielded 72.7%. However, this methodological modification resulted in a decline in specificity from 92.3% (standard LTT) to 76.9%. In cases of AMP-associated exanthems, the diagnostic value of the LTT compared with routine allergy testing is limited. When evaluating such exanthems, provocation testing remains the gold standard. Delayed reading of intradermal skin tests remains most useful to avoid positive provocation reactions. © 2014 John Wiley & Sons Ltd.
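The sensitivity and specificity figures above are simple count ratios. In the sketch below, the counts are inferred from the reported percentages and group sizes (11 allergic, 26 tolerant patients), not taken from the paper's tables:

```python
def sens_spec(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Standard LTT, inferred counts: 6 of 11 allergic patients test positive,
# 24 of 26 tolerant patients test negative.
sens, spec = sens_spec(tp=6, fn=5, tn=24, fp=2)
print(round(100 * sens, 1), round(100 * spec, 1))  # 54.5 92.3
```

The same function with tp=8, fn=3, tn=20, fp=6 reproduces the modified LTT's 72.7% sensitivity and 76.9% specificity, illustrating the sensitivity/specificity trade-off the abstract reports.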
Fully Bayesian tests of neutrality using genealogical summary statistics.
Drummond, Alexei J; Suchard, Marc A
2008-10-31
Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome. Here we harness posterior predictive simulation to exploit summary statistics of both the data and model parameters to test the goodness-of-fit of standard models of evolution. We apply the method to test the selective neutrality of molecular evolution in non-recombining gene genealogies and demonstrate its utility on four real data sets, identifying significant departures from neutrality in human influenza A virus, even after controlling for variation in population size. Importantly, by employing a full model-based Bayesian analysis, our method separates the effects of demography from the effects of selection. The method also allows multiple summary statistics to be used in concert, thus potentially increasing sensitivity. Furthermore, our method remains useful in situations where analytical expectations and variances of summary statistics are not available. This aspect has great potential for the analysis of temporally spaced data, an expanding area previously neglected owing to limited availability of theory and methods.
Visualizing statistical significance of disease clusters using cartograms.
Kronenfeld, Barry J; Wong, David W S
2017-05-15
Health officials and epidemiological researchers often use maps of disease rates to identify potential disease clusters. Because these maps exaggerate the prominence of low-density districts and hide potential clusters in urban (high-density) areas, many researchers have used density-equalizing maps (cartograms) as a basis for epidemiological mapping. However, no existing guidelines cover visual assessment of statistical uncertainty. To address this shortcoming, we develop techniques for visual determination of the statistical significance of clusters spanning one or more districts on a cartogram. We developed the techniques within a geovisual analytics framework that does not rely on automated significance testing, and can therefore facilitate visual analysis to detect clusters that automated techniques might miss. On a cartogram of the at-risk population, the statistical significance of a disease cluster can be determined from the rate, area and shape of the cluster under standard hypothesis testing scenarios. We develop formulae to determine, for a given rate, the area required for statistical significance of a priori and a posteriori designated regions under certain test assumptions. Uniquely, our approach enables dynamic inference for aggregate regions formed by combining individual districts. The method is implemented in interactive tools that provide choropleth mapping, automated legend construction and dynamic search tools to facilitate cluster detection and assessment of the validity of tested assumptions. A case study of leukemia incidence analysis in California demonstrates the ability to visually distinguish between statistically significant and insignificant regions.
Our research prompts a broader discussion of the role of geovisual exploratory analyses in disease mapping and the appropriate framework for visually assessing the statistical significance of spatial clusters.
Valid statistical inference methods for a case-control study with missing data.
Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun
2018-04-01
The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independence under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.
Innovative approach to teaching communication skills to nursing students.
Zavertnik, Jean Ellen; Huff, Tanya A; Munro, Cindy L
2010-02-01
This study assessed the effectiveness of a learner-centered simulation intervention designed to improve the communication skills of preprofessional sophomore nursing students. An innovative teaching strategy, in which communication skills are taught to nursing students by using trained actors who served as standardized family members in a clinical learning laboratory setting, was evaluated using a two-group posttest design. In addition to current standard education, the intervention group received a formal training session presenting a framework for communication and a 60-minute practice session with the standardized family members. Four domains of communication (introduction, gathering of information, imparting information, and clarifying goals and expectations) were evaluated in the control and intervention groups in individual testing sessions with a standardized family member. The intervention group performed better than the control group in all four tested domains related to communication skills, and the difference was statistically significant in the domain of gathering information (p = 0.0257). Copyright 2010, SLACK Incorporated.
An examination of the challenges influencing science instruction in Florida elementary classrooms
NASA Astrophysics Data System (ADS)
North, Stephanie Gwinn
It has been shown that the mechanical properties of thin films tend to differ from their bulk counterparts. Specifically, the bulge and microtensile testing of thin films used in MEMS have revealed that these films demonstrate an inverse relationship between thickness and strength. A film dimension is not a material property, but it evidently does affect the mechanical performance of materials at very small thicknesses. A hypothetical explanation for this phenomenon is that as the thickness dimension of the film decreases, it is statistically less likely that imperfections exist in the material. It would require a very small thickness (or volume) to limit imperfections in a material, which is why this phenomenon is seen in films with thicknesses on the order of 100 nm to a few microns. Another hypothesized explanation is that the surface tension that exists in bulk material also exists in thin films but has a greater impact at such a small scale. The goal of this research is to identify a theoretical prediction of the strength of thin films based on their microstructural properties such as grain size and film thickness. This would minimize the need for expensive and complicated tests such as the bulge and microtensile tests. In this research, data were collected from the bulge and microtensile testing of copper, aluminum, gold, and polysilicon free-standing thin films. Statistical testing of these data revealed a definitive inverse relationship between thickness and strength, as well as between grain size and strength, as expected. However, due to the lack of a standardized method for either test, there were significant variations in the data. This research compares and analyzes the methods used by other researchers to develop a suggested set of instructions for a standardized bulge test and standardized microtensile test. 
The most important parameters to be controlled in each test were found to be strain rate, temperature, film deposition method, film length, and strain measurement.
Robust Mean and Covariance Structure Analysis through Iteratively Reweighted Least Squares.
ERIC Educational Resources Information Center
Yuan, Ke-Hai; Bentler, Peter M.
2000-01-01
Adapts robust schemes to mean and covariance structures, providing an iteratively reweighted least squares approach to robust structural equation modeling. Each case is weighted according to its distance, based on first and second order moments. Test statistics and standard error estimators are given. (SLD)
The Real World Significance of Performance Prediction
ERIC Educational Resources Information Center
Pardos, Zachary A.; Wang, Qing Yang; Trivedi, Shubhendu
2012-01-01
In recent years, the educational data mining and user modeling communities have been aggressively introducing models for predicting student performance on external measures such as standardized tests as well as within-tutor performance. While these models have brought statistically reliable improvement to performance prediction, the real world…
Modeling the Test-Retest Statistics of a Localization Experiment in the Full Horizontal Plane.
Morsnowski, André; Maune, Steffen
2016-10-01
Two approaches to modeling the test-retest statistics of a localization experiment, one based on the Gaussian distribution and one on surrogate data, are introduced. Their efficiency is investigated using different measures describing directional hearing ability. A localization experiment in the full horizontal plane is a challenging task for hearing impaired patients. In clinical routine, we use this experiment to evaluate the progress of our cochlear implant (CI) recipients. Listening and time effort limit the reproducibility. The localization experiment consists of a circle of 12 loudspeakers placed in an anechoic room, a "camera silens". In darkness, HSM sentences are presented at 65 dB pseudo-randomly from all 12 directions with five repetitions. This experiment is modeled by a set of Gaussian distributions with different standard deviations added to a perfect estimator, as well as by surrogate data. Five repetitions per direction are used to produce surrogate data distributions for the sensation directions. To investigate the statistics, we retrospectively use the data of 33 CI patients with 92 pairs of test-retest measurements from the same day. The first model does not take inversions into account (i.e., permutations of the direction from back to front and vice versa are not considered), although they are common for hearing impaired persons, particularly in the rear hemisphere. The second model considers these inversions but does not work with all measures. The introduced models successfully describe the test-retest statistics of directional hearing. However, since they perform differently across the investigated measures, no general recommendation can be provided. The presented test-retest statistics enable pair test comparisons for localization experiments.
NASA Technical Reports Server (NTRS)
Ziff, Howard L; Rathert, George A; Gadeberg, Burnett L
1953-01-01
Standard air-to-air-gunnery tracking runs were conducted with F-51H, F8F-1, F-86A, and F-86E airplanes equipped with fixed gunsights. The tracking performances were documented over the normal operating range of altitude, Mach number, and normal acceleration factor for each airplane. The sources of error were studied by statistical analyses of the aim wander.
NASA Astrophysics Data System (ADS)
Maffucci, Irene; Hu, Xiao; Fumagalli, Valentina; Contini, Alessandro
2018-03-01
Nwat-MMGBSA is a variant of MM-PB/GBSA based on the inclusion of a number of explicit water molecules that are the closest to the ligand in each frame of a molecular dynamics trajectory. This method demonstrated improved correlations between calculated and experimental binding energies in both protein-protein interactions and ligand-receptor complexes, in comparison to the standard MM-GBSA. A protocol optimization, aimed at maximizing efficacy and efficiency, is discussed here considering penicillopepsin, HIV1-protease, and BCL-XL as test cases. Calculations were performed in triplicate on both classic HPC environments and on standard workstations equipped with a GPU card, with no statistical differences in the results. Likewise, no relevant differences in correlation with experiment were observed when performing Nwat-MMGBSA calculations on 4 ns or 1 ns long trajectories. A fully automatic workflow for structure-based virtual screening, performing everything from library set-up to docking and Nwat-MMGBSA rescoring, has then been developed. The protocol has been tested against no rescoring or standard MM-GBSA rescoring within a retrospective virtual screening of inhibitors of AmpC β-lactamase and of the Rac1-Tiam1 protein-protein interaction. In both cases, Nwat-MMGBSA rescoring provided a statistically significant increase in the ROC AUCs of between 20% and 30%, compared to docking scoring or to standard MM-GBSA rescoring.
Effect of multizone refractive multifocal contact lenses on standard automated perimetry.
Madrid-Costa, David; Ruiz-Alcocer, Javier; García-Lázaro, Santiago; Albarrán-Diego, César; Ferrer-Blasco, Teresa
2012-09-01
The aim of this study was to evaluate whether the creation of 2 foci (distance and near) provided by multizone refractive multifocal contact lenses (CLs) for presbyopia correction affects the measurements on Humphrey 24-2 Swedish interactive threshold algorithm (SITA) standard automated perimetry (SAP). In this crossover study, 30 subjects were fitted in random order with either a multifocal CL or a monofocal CL. After 1 month, a Humphrey 24-2 SITA standard strategy was performed. The visual field global indices (the mean deviation [MD] and pattern standard deviation [PSD]), reliability indices, test duration, and number of depressed points deviating at P<5%, P<2%, P<1%, and P<0.5% on pattern deviation probability plots were determined and compared between multifocal and monofocal CLs. Thirty eyes of 30 subjects were included in this study. There were no statistically significant differences in reliability indices or test duration. There was a statistically significant reduction in the MD with the multifocal CL compared with the monofocal CL (P=0.001). Differences were not found in either the PSD or the number of depressed points deviating at P<5%, P<2%, P<1%, and P<0.5% in the pattern deviation probability maps studied. The results of this study suggest that the multizone refractive lens produces a generalized depression in threshold sensitivity as measured by the Humphrey 24-2 SITA SAP.
Sabour, Siamak
2018-03-08
The purpose of this letter, in response to Hall, Mehta, and Fackrell (2017), is to provide important knowledge about methodology and statistical issues in assessing the reliability and validity of an audiologist-administered tinnitus loudness matching test and a patient-reported tinnitus loudness rating. The author uses reference textbooks and published articles regarding scientific assessment of the validity and reliability of a clinical test to discuss the statistical test and the methodological approach in assessing validity and reliability in clinical research. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess reliability and validity. The qualitative variables of sensitivity, specificity, positive predictive value, negative predictive value, false positive and false negative rates, likelihood ratio positive and likelihood ratio negative, as well as odds ratio (i.e., ratio of true to false results), are the most appropriate estimates to evaluate validity of a test compared to a gold standard. In the case of quantitative variables, depending on distribution of the variable, Pearson r or Spearman rho can be applied. Diagnostic accuracy (validity) and diagnostic precision (reliability or agreement) are two completely different methodological issues. Depending on the type of the variable (qualitative or quantitative), well-known statistical tests can be applied to assess validity.
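All of the qualitative-variable validity estimates listed above follow from a single 2x2 table of test results against the gold standard. A minimal sketch with purely hypothetical counts (not taken from any study discussed here):

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Validity estimates for a test versus a gold standard (2x2 table)."""
    sens = tp / (tp + fn)              # sensitivity: true positive rate
    spec = tn / (tn + fp)              # specificity: true negative rate
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": tp / (tp + fp),         # positive predictive value
        "npv": tn / (tn + fn),         # negative predictive value
        "lr_pos": sens / (1 - spec),   # likelihood ratio positive
        "lr_neg": (1 - sens) / spec,   # likelihood ratio negative
        "odds_ratio": (tp * tn) / (fp * fn),  # ratio of true to false results
    }

# Hypothetical counts for illustration only:
print(diagnostic_metrics(tp=90, fp=5, fn=10, tn=95))
```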
A Policymaker's Primer on Testing and Assessment. Assessment Policy. Info Brief. Number 42
ERIC Educational Resources Information Center
Laitsch, Dan
2005-01-01
Standardized testing plays an increasingly important role in the lives of today's students and educators. The U.S. No Child Left Behind Act (NCLB) requires assessment in math and literacy in grades 3-8 and 10 and, as of 2007-08, in science once in grades 3-5, 6-9, and 10-12. Based on National Center for Education Statistics enrollment projections,…
United States Air Force Statistical Digest, Fiscal Year 1959. Fourteenth Edition
1959-09-30
Support Forces for fiscal year 1959 consist of Air Refueling; Strategic Support; Airborne Early Warning and Control; Radar Evaluation; Helicopter...missions. (EI) Test: Aircraft assigned to evaluate the aircraft and/or its components installed as standard equipment. (EH) Test Support: Aircraft... consumables under two AF-GEN sub-projects: PROJECT SEAWEED and PROJECT NIGHT LIFE & FLYAWAY KITS. Also included in this section is data on ammunition
A quality assessment of randomized controlled trial reports in endodontics.
Lucena, C; Souza, E M; Voinea, G C; Pulgar, R; Valderrama, M J; De-Deus, G
2017-03-01
To assess the quality of the randomized clinical trial (RCT) reports published in Endodontics between 1997 and 2012. Retrieval of RCTs in Endodontics was based on a search of the Thomson Reuters Web of Science (WoS) database (March 2013). Quality evaluation was performed using a checklist based on the Jadad criteria, CONSORT (Consolidated Standards of Reporting Trials) statement and SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials). Descriptive statistics were used for frequency distribution of data. Student's t-test and Welch test were used to identify the influence of certain trial characteristics upon report quality (α = 0.05). A total of 89 RCTs were evaluated, and several methodological flaws were found: only 45% had random sequence generation at low risk of bias, 75% did not provide information on allocation concealment, and 19% were nonblinded designs. Regarding statistics, only 55% of the RCTs performed adequate sample size estimations, only 16% presented confidence intervals, and 25% did not provide the exact P-value. Also, 2% of the articles used no statistical tests, and in 87% of the RCTs, the information provided was insufficient to determine whether the statistical methodology applied was appropriate or not. Significantly higher scores were observed for multicentre trials (P = 0.023), RCTs signed by more than 5 authors (P = 0.03), articles belonging to journals ranked above the JCR median (P = 0.03), and articles complying with the CONSORT guidelines (P = 0.000). The quality of RCT reports in key areas for internal validity of the study was poor. Several measures, such as compliance with the CONSORT guidelines, are important in order to raise the quality of RCTs in Endodontics. © 2016 International Endodontic Journal. Published by John Wiley & Sons Ltd.
Hao, Chun; Huan, Xiping; Yan, Hongjing; Yang, Haitao; Guan, Wenhui; Xu, Xiaoqin; Zhang, Min; Wang, Na; Tang, Weiming; Gu, Jing; Lau, Joseph T F
2012-07-01
The randomized controlled trial investigated the relative efficacy of an enhanced (EVCT) versus standard (SVCT) voluntary counseling and testing in reducing unprotected anal intercourse (UAI) among men who have sex with men (MSM) in China. 295 participants recruited by respondent-driven sampling were randomly allocated to the two arms. In addition to the SVCT, the EVCT group watched a theory-based video narrated by an HIV-positive MSM, received enhanced counseling and a reminder gift. As compared to the SVCT group, the EVCT group reported a lower prevalence of UAI with any male sex partners (48.4% versus 66.7%, RR = 0.7, ARR = -18.3%, p = 0.010) and with regular male sex partners (52.2% versus 68.9%, RR = 0.8, ARR = -16.7%, p = 0.043) at Month 6, whilst baseline between-group differences were statistically non-significant. Between-group differences in HIV/syphilis incidence were statistically non-significant. Translational research should be conducted to integrate non-intensive enhancements such as the EVCT into regular testing services.
Cohn, T.A.; England, J.F.; Berenbrock, C.E.; Mason, R.R.; Stedinger, J.R.; Lamontagne, J.R.
2013-01-01
The Grubbs-Beck test is recommended by the federal guidelines for detection of low outliers in flood flow frequency computation in the United States. This paper presents a generalization of the Grubbs-Beck test for normal data (similar to the Rosner (1983) test; see also Spencer and McCuen (1996)) that can provide a consistent standard for identifying multiple potentially influential low flows. In cases where low outliers have been identified, they can be represented as “less-than” values, and a frequency distribution can be developed using censored-data statistical techniques, such as the Expected Moments Algorithm. This approach can improve the fit of the right-hand tail of a frequency distribution and provide protection from lack-of-fit due to unimportant but potentially influential low flows (PILFs) in a flood series, thus making the flood frequency analysis procedure more robust.
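The single-low-outlier screen underlying this generalization can be sketched as follows. This is not the multiple-outlier procedure itself; the K_N approximation (the 10%-significance Bulletin 17B form) and the leave-one-out mean/standard deviation are assumptions for illustration:

```python
import math
import statistics

def grubbs_beck_k(n: int) -> float:
    """10%-significance critical value K_N (Bulletin 17B approximation; assumed)."""
    log_n = math.log10(n)
    return -0.9043 + 3.345 * math.sqrt(log_n) - 0.4046 * log_n

def smallest_is_low_outlier(flows) -> bool:
    """Flag the smallest flow if its log falls below mean - K_N * sd
    of the remaining log-flows (leave-one-out form of the sweep)."""
    logs = sorted(math.log10(q) for q in flows)
    candidate, rest = logs[0], logs[1:]
    cutoff = statistics.mean(rest) - grubbs_beck_k(len(flows)) * statistics.stdev(rest)
    return candidate < cutoff
```

A flagged observation would then be recoded as a "less-than" value and the distribution refit with censored-data techniques such as the Expected Moments Algorithm, as described above.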
NASA Astrophysics Data System (ADS)
Sturrock, P. A.
2008-01-01
Using the chi-square statistic, one may conveniently test whether a series of measurements of a variable are consistent with a constant value. However, that test is predicated on the assumption that the appropriate probability distribution function (pdf) is normal in form. This requirement is usually not satisfied by experimental measurements of the solar neutrino flux. This article presents an extension of the chi-square procedure that is valid for any form of the pdf. This procedure is applied to the GALLEX-GNO dataset, and it is shown that the results are in good agreement with the results of Monte Carlo simulations. Whereas application of the standard chi-square test to symmetrized data yields evidence significant at the 1% level for variability of the solar neutrino flux, application of the extended chi-square test to the unsymmetrized data yields only weak evidence (significant at the 4% level) of variability.
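A generic Monte Carlo version of such a test (a sketch, not Sturrock's exact procedure) compares the observed chi-square statistic against a null distribution simulated from whatever error pdf is assumed, normal or not:

```python
import random

def chi2_stat(xs, sigmas):
    """Weighted chi-square of measurements about their weighted mean."""
    w = [1.0 / s ** 2 for s in sigmas]
    xbar = sum(wi * xi for wi, xi in zip(w, xs)) / sum(w)
    return sum(((x - xbar) / s) ** 2 for x, s in zip(xs, sigmas))

def mc_pvalue(xs, sigmas, error_sampler, n_sim=5000, seed=1):
    """Empirical p-value under a constant-signal null whose measurement
    errors are drawn from an arbitrary pdf via error_sampler(rng, sigma)."""
    rng = random.Random(seed)
    s_obs = chi2_stat(xs, sigmas)
    hits = sum(
        chi2_stat([error_sampler(rng, s) for s in sigmas], sigmas) >= s_obs
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)
```

With a Gaussian `error_sampler` this reproduces the standard chi-square p-value; substituting a skewed sampler gives the non-Gaussian analogue that the asymmetric neutrino-flux error bars call for.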
A Vignette (User's Guide) for “An R Package for Statistical ...
StatCharrms is a graphical user front-end for ease of use in analyzing data generated from OCSPP 890.2200, Medaka Extended One Generation Reproduction Test (MEOGRT), and OCSPP 890.2300, Larval Amphibian Gonad Development Assay (LAGDA). The analyses StatCharrms is capable of performing are: Rao-Scott adjusted Cochran-Armitage test for trend By Slices (RSCABS), a standard Cochran-Armitage test for trend By Slices (SCABS), mixed-effects Cox proportional hazards model, Jonckheere-Terpstra step-down trend test, Dunn test, one-way ANOVA, weighted ANOVA, mixed-effects ANOVA, repeated-measures ANOVA, and Dunnett test. This document provides a User's Manual (termed a Vignette by the Comprehensive R Archive Network (CRAN)) for the previously created R-code tool called StatCharrms (Statistical analysis of Chemistry, Histopathology, and Reproduction endpoints using Repeated measures and Multi-generation Studies). The StatCharrms R code has been publicly available directly from EPA staff since the approval of OCSPP 890.2200 and 890.2300, and is now publicly available on CRAN.
From Exploratory Talk to Abstract Reasoning: A Case for Far Transfer?
ERIC Educational Resources Information Center
Webb, Paul; Whitlow, J. W., Jr.; Venter, Danie
2017-01-01
Research has shown improvements in science, mathematics, and language scores when classroom discussion is employed in school-level science and mathematics classes. Studies have also shown statistically and practically significant gains in children's reasoning abilities as measured by the Raven's Standard Progressive Matrices test when employing…
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-05
... determination method (AEDM) for small electric motors, including the statistical requirements to substantiate... restriction to a particular application or type of application; or (2) Standard operating characteristics or... application, and which can be used in most general purpose applications.
School Libraries and Science Achievement: A View from Michigan's Middle Schools
ERIC Educational Resources Information Center
Mardis, Marcia
2007-01-01
If strong school library media centers (SLMCs) positively impact middle school student reading achievement, as measured on standardized tests, are they also beneficial for middle school science achievement? To answer this question, the researcher built upon the statistical analyses used in previous school library impact studies with qualitative…
Hofman, Abe D.; Visser, Ingmar; Jansen, Brenda R. J.; van der Maas, Han L. J.
2015-01-01
We propose and test three statistical models for the analysis of children’s responses to the balance scale task, a seminal task to study proportional reasoning. We use a latent class modelling approach to formulate a rule-based latent class model (RB LCM) following from a rule-based perspective on proportional reasoning and a new statistical model, the Weighted Sum Model, following from an information-integration approach. Moreover, a hybrid LCM using item covariates is proposed, combining aspects of both a rule-based and information-integration perspective. These models are applied to two different datasets, a standard paper-and-pencil test dataset (N = 779), and a dataset collected within an online learning environment that included direct feedback, time-pressure, and a reward system (N = 808). For the paper-and-pencil dataset the RB LCM resulted in the best fit, whereas for the online dataset the hybrid LCM provided the best fit. The standard paper-and-pencil dataset yielded more evidence for distinct solution rules than the online data set in which quantitative item characteristics are more prominent in determining responses. These results shed new light on the discussion on sequential rule-based and information-integration perspectives of cognitive development. PMID:26505905
Cross-validation of Peak Oxygen Consumption Prediction Models From OMNI Perceived Exertion.
Mays, R J; Goss, F L; Nagle, E F; Gallagher, M; Haile, L; Schafer, M A; Kim, K H; Robertson, R J
2016-09-01
This study cross-validated statistical models for the prediction of peak oxygen consumption using ratings of perceived exertion from the Adult OMNI Cycle Scale of Perceived Exertion. 74 participants (men: n=36; women: n=38) completed a graded cycle exercise test. Ratings of perceived exertion for the overall body, legs, and chest/breathing were recorded at each test stage and entered into previously developed 3-stage peak oxygen consumption prediction models. There were no significant differences (p>0.05) between measured and predicted peak oxygen consumption from ratings of perceived exertion for the overall body, legs, and chest/breathing within men (mean±standard deviation: 3.16±0.52 vs. 2.92±0.33 vs. 2.90±0.29 vs. 2.90±0.26 L·min(-1)) or women (2.17±0.29 vs. 2.02±0.22 vs. 2.03±0.19 vs. 2.01±0.19 L·min(-1)). Previously developed statistical models for the prediction of peak oxygen consumption based on subpeak OMNI ratings of perceived exertion yielded estimates similar to measured peak oxygen consumption in a separate group of participants. These findings have practical implications for the use of the original statistical models in standard health-fitness settings. © Georg Thieme Verlag KG Stuttgart · New York.
Efforts to improve international migration statistics: a historical perspective.
Kraly, E P; Gnanasekaran, K S
1987-01-01
During the past decade, the international statistical community has made several efforts to develop standards for the definition, collection and publication of statistics on international migration. This article surveys the history of official initiatives to standardize international migration statistics by reviewing the recommendations of the International Statistical Institute, International Labor Organization, and the UN, and reports a recently proposed agenda for moving toward comparability among national statistical systems. Heightening awareness of the benefits of exchange and creating motivation to implement international standards requires a 3-pronged effort from the international statistical community. 1st, it is essential to continue discussion about the significance of improvement, specifically standardization, of international migration statistics. The move from theory to practice in this area requires ongoing focus by migration statisticians so that conformity to international standards itself becomes a criterion by which national statistical practices are examined and assessed. 2nd, the countries should be provided with technical documentation to support and facilitate the implementation of the recommended statistical systems. Documentation should be developed with an understanding that conformity to international standards for migration and travel statistics must be achieved within existing national statistical programs. 3rd, the call for statistical research in this area requires more efforts by the community of migration statisticians, beginning with the mobilization of bilateral and multilateral resources to undertake the preceding list of activities.
pcr: an R package for quality assessment, analysis and testing of qPCR data
Ahmed, Mahmoud
2018-01-01
Background Real-time quantitative PCR (qPCR) is a broadly used technique in biomedical research. Currently, a few different analysis models are used to determine the quality of data and to quantify the mRNA level across experimental conditions. Methods We developed an R package to implement methods for quality assessment, analysis and testing of qPCR data for statistical significance. Double Delta CT and standard curve models were implemented to quantify the relative expression of target genes from CT values in standard qPCR control-group experiments. In addition, calculation of amplification efficiency and curves from serial dilution qPCR experiments is used to assess the quality of the data. Finally, two-group tests and linear models were used to test for significance of the difference in expression between control groups and conditions of interest. Results Using two datasets from qPCR experiments, we applied the different quality assessment, analysis and statistical testing methods in the pcr package and compared the results to the original published articles. The final relative expression values from the different models, as well as the intermediary outputs, were checked against the expected results in the original papers and were found to be accurate and reliable. Conclusion The pcr package provides an intuitive and unified interface for its main functions to allow biologists to perform all necessary steps of qPCR analysis and produce graphs in a uniform way. PMID:29576953
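The Double Delta CT model mentioned above reduces to one line of arithmetic once CT values are averaged per group. A minimal language-neutral sketch of that formula, not the pcr package's own R code, which also handles replicates and error propagation:

```python
def ddct_fold_change(ct_target_ctrl: float, ct_ref_ctrl: float,
                     ct_target_trt: float, ct_ref_trt: float) -> float:
    """Relative expression (fold change) via 2^-(delta-delta Ct)."""
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl   # delta Ct, control group
    delta_trt = ct_target_trt - ct_ref_trt      # delta Ct, treatment group
    return 2 ** -(delta_trt - delta_ctrl)       # 2^-(delta-delta Ct)

# A target whose delta Ct drops by 2 cycles is ~4-fold up-regulated:
print(ddct_fold_change(30.0, 20.0, 28.0, 20.0))  # → 4.0
```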
75 FR 37245 - 2010 Standards for Delineating Metropolitan and Micropolitan Statistical Areas
Federal Register 2010, 2011, 2012, 2013, 2014
2010-06-28
... Micropolitan Statistical Areas; Notice. Federal Register / Vol. 75, No. 123 / Monday, June 28, 2010... and Micropolitan Statistical Areas AGENCY: Office of Information and Regulatory Affairs, Office of... Statistical Areas. The 2010 standards replace and supersede the 2000 Standards for Defining Metropolitan and...
Test-retest reliability of 3D ultrasound measurements of the thoracic spine.
Fölsch, Christian; Schlögel, Stefanie; Lakemeier, Stefan; Wolf, Udo; Timmesfeld, Nina; Skwara, Adrian
2012-05-01
To explore the reliability of the Zebris CMS 20 ultrasound analysis system with pointer application for measuring end-range flexion, end-range extension, and neutral kyphosis angle of the thoracic spine. The study was performed within the School of Physiotherapy in cooperation with the Orthopedic Department at a University Hospital. The thoracic spines of 28 healthy subjects were measured. Measurements for neutral kyphosis angle, end-range flexion, and end-range extension were taken once at each time point. The bone landmarks were palpated by one examiner and marked with a pointer containing 2 transmitters using a frequency of 40 kHz. A third transmitter was fixed to the pelvis, and 3 microphones were used as receivers. The real angle was calculated by the software. Bland-Altman plots with 95% limits of agreement, intraclass correlations (ICC), standard deviations of mean measurements, and standard errors of measurement were used for statistical analyses. The test-retest reliability in this study was measured within a 24-hour interval. Statistical parameters were used to judge reliability. The mean kyphosis angle was 44.8° with a standard deviation of 17.3° at the first measurement and a mean of 45.8° with a standard deviation of 16.2° the following day. The ICC was high at 0.95 for the neutral kyphosis angle, and the Bland-Altman 95% limits of agreement were within clinically acceptable margins. The ICC was 0.71 for end-range flexion and 0.34 for end-range extension, whereas the Bland-Altman 95% limits of agreement were wider than with the static measurement of kyphosis. Compared with static measurements, the analysis of motion with 3-dimensional ultrasound showed an increased standard deviation for test-retest measurements. The test-retest reliability of ultrasound measurement of the neutral kyphosis angle of the thoracic spine was demonstrated within 24 hours.
Bland-Altman 95% limits of agreement and the standard deviation of differences did not appear to be clinically acceptable for measuring flexion and extension. Copyright © 2012 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Jacobson, Magdalena; Wallgren, Per; Nordengrahn, Ann; Merza, Malik; Emanuelson, Ulf
2011-04-01
Lawsonia intracellularis is a common cause of chronic diarrhoea and poor performance in young growing pigs. Diagnosis of this obligate intracellular bacterium is based on the demonstration of the microbe or microbial DNA in tissue specimens or faecal samples, or the demonstration of L. intracellularis-specific antibodies in sera. The aim of the present study was to evaluate a blocking ELISA in the detection of serum antibodies to L. intracellularis, by comparison with the previously widely used immunofluorescent antibody test (IFAT). Sera were collected from 176 pigs aged 8-12 weeks originating from 24 herds with or without problems with diarrhoea and poor performance in young growing pigs. Sera were analyzed by the blocking ELISA and by IFAT. Bayesian modelling techniques were used to account for the absence of a gold standard test, and the results of the blocking ELISA were modelled against the IFAT test with a "2 dependent tests, 2 populations, no gold standard" model. At the finally selected cut-off value of percent inhibition (PI) 35, the diagnostic sensitivity of the blocking ELISA was 72% and the diagnostic specificity was 93%. The positive predictive value was 0.82 and the negative predictive value was 0.89, at the observed prevalence of 33.5%. The sensitivity and specificity as evaluated by Bayesian statistical techniques differed from those previously reported. Properties of diagnostic tests may well vary between countries, laboratories and among populations of animals. In the absence of a true gold standard, the importance of validating new methods by appropriate statistical methods and with respect to the target population must be emphasized.
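The predictive values quoted above follow from sensitivity, specificity, and prevalence via Bayes' rule. A minimal sketch using the reported figures (small discrepancies from the published 0.82/0.89 are expected, since the published inputs are rounded):

```python
def predictive_values(se, sp, prev):
    """Positive/negative predictive values from sensitivity, specificity,
    and prevalence, via Bayes' rule."""
    ppv = se * prev / (se * prev + (1 - sp) * (1 - prev))
    npv = sp * (1 - prev) / (sp * (1 - prev) + (1 - se) * prev)
    return ppv, npv

# Reported blocking-ELISA figures: Se = 72%, Sp = 93%, prevalence = 33.5%
ppv, npv = predictive_values(se=0.72, sp=0.93, prev=0.335)
```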
Valle, Denis; Lima, Joanna M Tucker; Millar, Justin; Amratia, Punam; Haque, Ubydul
2015-11-04
Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and propose simple Bayesian models to adequately address this issue. A systematic literature review was conducted to determine the proportion of malaria studies that appropriately accounted for false-negatives/false-positives in a logistic regression setting. Inference from the standard logistic regression was also compared with that from three proposed Bayesian models using simulations and malaria data from the western Brazilian Amazon. A systematic literature review suggests that malaria epidemiologists are largely unaware of the problem of using logistic regression to model imperfect diagnostic test results. Simulation results reveal that statistical inference can be substantially improved when using the proposed Bayesian models versus the standard logistic regression. Finally, analysis of original malaria data with one of the proposed Bayesian models reveals that microscopy sensitivity is strongly influenced by how long people have lived in the study region, and an important risk factor (i.e., participation in forest extractivism) is identified that would have been missed by standard logistic regression. Given the numerous diagnostic methods employed by malaria researchers and the ubiquitous use of logistic regression to model the results of these diagnostic tests, this paper provides critical guidelines to improve data analysis practice in the presence of misclassification error. Easy-to-use code that can be readily adapted to WinBUGS is provided, enabling straightforward implementation of the proposed Bayesian models.
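The core adjustment these models make can be stated compactly: the probability of observing a positive test mixes the modelled true status with the test's error rates. A minimal likelihood sketch, not the authors' WinBUGS code (the parameter values are illustrative):

```python
import math

def observed_positive_prob(x, beta0, beta1, se, sp):
    """P(test positive) when true disease status follows a logistic model
    in covariate x and the diagnostic test has sensitivity se and
    specificity sp."""
    p_true = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))  # true disease prob
    return se * p_true + (1.0 - sp) * (1.0 - p_true)       # mix in test error

# With a perfect test (se = sp = 1) this reduces to plain logistic regression
perfect = observed_positive_prob(0.0, 0.0, 1.0, se=1.0, sp=1.0)
```

Standard logistic regression fits the observed positives as if they were true positives; the mixture above is what lets a Bayesian model separate the two.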
Flow Chamber System for the Statistical Evaluation of Bacterial Colonization on Materials
Menzel, Friederike; Conradi, Bianca; Rodenacker, Karsten; Gorbushina, Anna A.; Schwibbert, Karin
2016-01-01
Biofilm formation on materials leads to high costs in industrial processes, as well as in medical applications. This fact has stimulated interest in the development of new materials with improved surfaces to reduce bacterial colonization. Standardized tests relying on statistical evidence are indispensable to evaluate the quality and safety of these new materials. We describe here a flow chamber system for biofilm cultivation under controlled conditions with a total capacity for testing up to 32 samples in parallel. In order to quantify the surface colonization, bacterial cells were DAPI (4',6-diamidino-2-phenylindole)-stained and examined with epifluorescence microscopy. More than 100 images of each sample were automatically taken and the surface coverage was estimated using the free open-source software G'MIC, followed by a precise statistical evaluation. Overview images of all gathered pictures were generated to dissect the colonization characteristics of the selected model organism Escherichia coli W3310 on different materials (glass and implant steel). With our approach, differences in bacterial colonization on different materials can be quantified in a statistically validated manner. This reliable test procedure will support the design of improved materials for medical, industrial, and environmental (subaquatic or subaerial) applications. PMID:28773891
Statistical distribution of mechanical properties for three graphite-epoxy material systems
NASA Technical Reports Server (NTRS)
Reese, C.; Sorem, J., Jr.
1981-01-01
Graphite-epoxy composites are playing an increasing role as viable alternative materials in structural applications, necessitating thorough investigation into the predictability and reproducibility of their material strength properties. This investigation was concerned with tension, compression, and short beam shear coupon testing of large samples from three different material suppliers to determine their statistical strength behavior. Statistical results indicate that a two-parameter Weibull distribution model provides better overall characterization of material behavior for the graphite-epoxy systems tested than does the standard Normal distribution model that is employed for most design work. While either a Weibull or Normal distribution model provides adequate predictions for average strength values, the Weibull model provides better characterization in the lower tail region, where the predictions are of maximum design interest. The two sets of the same material were found to have essentially the same material properties, indicating that repeatability can be achieved.
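The lower-tail comparison rests on the two-parameter Weibull CDF, F(x) = 1 − exp(−(x/λ)^k). A sketch with invented shape and scale values (not the paper's fitted parameters):

```python
import math

def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF: probability of strength falling below x."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def weibull_quantile(p, shape, scale):
    """Strength exceeded with probability 1 - p (e.g. p = 0.01 for a
    first-percentile design allowable)."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)

# Hypothetical fit: shape k = 20, scale = 600 MPa
q01 = weibull_quantile(0.01, shape=20.0, scale=600.0)  # 1st-percentile strength
```

Because the Weibull and Normal models can agree at the mean yet diverge at quantiles like this one, lower-tail behavior is where the choice of model matters for design.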
NASA Astrophysics Data System (ADS)
Powell, P. E.
Educators have recently come to consider inquiry-based instruction as a more effective method of instruction than didactic instruction. Experience-based learning theory suggests that student performance is linked to teaching method. However, research is limited on inquiry teaching and its effectiveness in preparing students to perform well on standardized tests. The purpose of the study was to investigate whether one of these two teaching methodologies was more effective in increasing student performance on standardized science tests. The quasi-experimental quantitative study comprised two stages. Stage 1 used a survey to identify the teaching methods of a convenience sample of 57 teacher participants and determined the level of inquiry used in instruction to place participants into instructional groups (the independent variable). Stage 2 used analysis of covariance (ANCOVA) to compare posttest scores on a standardized exam by teaching method. Additional analyses were conducted to examine the differences in science achievement by ethnicity, gender, and socioeconomic status by teaching methodology. Results demonstrated a statistically significant gain in test scores when taught using inquiry-based instruction. Subpopulation analyses indicated all groups showed improved mean standardized test scores except African American students. The findings benefit teachers and students by presenting data supporting a method of content delivery that increases teacher efficacy and produces students with a greater cognition of science content that meets the school's mission and goals.
Assessment of the beryllium lymphocyte proliferation test using statistical process control.
Cher, Daniel J; Deubner, David C; Kelsh, Michael A; Chapman, Pamela S; Ray, Rose M
2006-10-01
Despite more than 20 years of surveillance and epidemiologic studies using the beryllium blood lymphocyte proliferation test (BeBLPT) as a measure of beryllium sensitization (BeS) and as an aid for diagnosing subclinical chronic beryllium disease (CBD), improvements in specific understanding of the inhalation toxicology of CBD have been limited. Although epidemiologic data suggest that BeS and CBD risks vary by process/work activity, it has proven difficult to reach specific conclusions regarding the dose-response relationship between workplace beryllium exposure and BeS or subclinical CBD. One possible reason for this uncertainty could be misclassification of BeS resulting from variation in BeBLPT testing performance. The reliability of the BeBLPT, a biological assay that measures beryllium sensitization, is unknown. To assess the performance of four laboratories that conducted this test, we used data from a medical surveillance program that offered testing for beryllium sensitization with the BeBLPT. The study population was workers exposed to beryllium at various facilities over a 10-year period (1992-2001). Workers with abnormal results were offered diagnostic workups for CBD. Our analyses used a standard statistical technique, statistical process control (SPC), to evaluate test reliability. The study design involved a repeated measures analysis of BeBLPT results generated from the company-wide, longitudinal testing. Analytical methods included use of (1) statistical process control charts that examined temporal patterns of variation for the stimulation index, a measure of cell reactivity to beryllium; (2) correlation analysis that compared prior perceptions of BeBLPT instability to the statistical measures of test variation; and (3) assessment of the variation in the proportion of missing test results and how time periods with more missing data influenced SPC findings. 
During the period of this study, all laboratories displayed variation in test results that were beyond what would be expected due to chance alone. Patterns of test results suggested that variations were systematic. We conclude that laboratories performing the BeBLPT or other similar biological assays of immunological response could benefit from a statistical approach such as SPC to improve quality management.
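A Shewhart-style control chart of the kind described flags points outside center ± 3σ, with the limits set from an in-control baseline period. A minimal sketch on invented stimulation-index data:

```python
import statistics

def out_of_control(values, center, sigma, k=3.0):
    """Indices of points beyond center +/- k*sigma (basic Shewhart rule).
    Control limits come from a baseline, in-control period, as in SPC."""
    return [i for i, v in enumerate(values) if abs(v - center) > k * sigma]

# Hypothetical baseline for the stimulation index, then a monitored series
baseline = [1.1, 0.9, 1.0, 1.2, 0.8, 1.0, 1.1, 0.9]
center = statistics.fmean(baseline)
sigma = statistics.stdev(baseline)
flagged = out_of_control([1.0, 1.1, 0.9, 6.0, 1.0], center, sigma)
```

Real SPC practice adds supplementary run rules (e.g. consecutive points on one side of the center line) to detect the systematic drifts the study reports, not just gross outliers.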
Measures of accuracy and performance of diagnostic tests.
Drobatz, Kenneth J
2009-05-01
Diagnostic tests are integral to the practice of veterinary cardiology, any other specialty, and general veterinary medicine. Developing and understanding diagnostic tests is one of the cornerstones of clinical research. This manuscript describes diagnostic test properties including sensitivity, specificity, predictive value, likelihood ratio, and the receiver operating characteristic curve. Review of practical book chapters and standard statistics manuscripts. Diagnostics such as sensitivity, specificity, predictive value, likelihood ratio, and the receiver operating characteristic curve are described and illustrated. A basic understanding of how diagnostic tests are developed and interpreted is essential in reviewing clinical scientific papers and understanding evidence-based medicine.
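Of the measures listed, the positive and negative likelihood ratios are simple functions of sensitivity and specificity. A sketch with illustrative numbers (not taken from the manuscript):

```python
def likelihood_ratios(se, sp):
    """LR+ = Se / (1 - Sp): how much a positive result raises the odds.
    LR- = (1 - Se) / Sp: how much a negative result lowers them."""
    return se / (1.0 - sp), (1.0 - se) / sp

# Hypothetical test with 80% sensitivity and 90% specificity
lr_pos, lr_neg = likelihood_ratios(se=0.80, sp=0.90)
```

Unlike predictive values, likelihood ratios do not depend on prevalence, which is why they are often preferred for carrying a test result across patient populations.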
Foerster, Rebecca M.; Poth, Christian H.; Behler, Christian; Botsch, Mario; Schneider, Werner X.
2016-01-01
Neuropsychological assessment of human visual processing capabilities strongly depends on visual testing conditions including room lighting, stimuli, and viewing-distance. This limits standardization, threatens reliability, and prevents the assessment of core visual functions such as visual processing speed. Increasingly available virtual reality devices allow to address these problems. One such device is the portable, light-weight, and easy-to-use Oculus Rift. It is head-mounted and covers the entire visual field, thereby shielding and standardizing the visual stimulation. A fundamental prerequisite to use Oculus Rift for neuropsychological assessment is sufficient test-retest reliability. Here, we compare the test-retest reliabilities of Bundesen’s visual processing components (visual processing speed, threshold of conscious perception, capacity of visual working memory) as measured with Oculus Rift and a standard CRT computer screen. Our results show that Oculus Rift allows to measure the processing components as reliably as the standard CRT. This means that Oculus Rift is applicable for standardized and reliable assessment and diagnosis of elementary cognitive functions in laboratory and clinical settings. Oculus Rift thus provides the opportunity to compare visual processing components between individuals and institutions and to establish statistical norm distributions. PMID:27869220
Yu, Chen; Zhang, Qian; Xu, Peng-Yao; Bai, Yin; Shen, Wen-Bin; Di, Bin; Su, Meng-Xiang
2018-01-01
Quantitative nuclear magnetic resonance (qNMR) is a well-established technique in quantitative analysis. We present a validated 1H-qNMR method for the assay of octreotide acetate, a cyclic octapeptide. Deuterium oxide was used to remove the undesired exchangeable peaks, referred to as proton exchange, in order to isolate the quantitative signals in the crowded spectrum of the peptide and ensure precise quantitative analysis. Gemcitabine hydrochloride was chosen as a suitable internal standard. Experimental conditions, including relaxation delay time, number of scans, and pulse angle, were optimized first. Then method validation was carried out in terms of selectivity, stability, linearity, precision, and robustness. The assay result was compared with that obtained by high performance liquid chromatography, as provided by the Chinese Pharmacopoeia. The statistical F test, Student's t test, and a nonparametric test at the 95% confidence level indicate that there was no significant difference between these two methods. qNMR is a simple and accurate quantitative tool with no need for specific corresponding reference standards. It has potential for the quantitative analysis of other peptide drugs and the standardization of the corresponding reference standards. Copyright © 2017 John Wiley & Sons, Ltd.
Alternative Test Methods for Electronic Parts
NASA Technical Reports Server (NTRS)
Plante, Jeannette
2004-01-01
It is common practice within NASA to test electronic parts at the manufacturing lot level to demonstrate, statistically, that parts from the lot tested will not fail in service using generic application conditions. The test methods and the generic application conditions used have been developed over the years through cooperation between NASA, DoD, and industry in order to establish a common set of standard practices. These common practices, found in MIL-STD-883, MIL-STD-750, military part specifications, EEE-INST-002, and other guidelines are preferred because they are considered to be effective and repeatable and their results are usually straightforward to interpret. These practices can sometimes be unavailable to some NASA projects due to special application conditions that must be addressed, such as schedule constraints, cost constraints, logistical constraints, or advances in the technology that make the historical standards an inappropriate choice for establishing part performance and reliability. Alternate methods have begun to emerge and to be used by NASA programs to test parts individually or as part of a system, especially when standard lot tests cannot be applied. Four alternate screening methods will be discussed in this paper: Highly accelerated life test (HALT), forward voltage drop tests for evaluating wire-bond integrity, burn-in options during or after highly accelerated stress test (HAST), and board-level qualification.
NASA Astrophysics Data System (ADS)
Badini, L.; Grassi, F.; Pignari, S. A.; Spadacini, G.; Bisognin, P.; Pelissou, P.; Marra, S.
2016-05-01
This work presents a theoretical rationale for the substitution of radiated-susceptibility (RS) verifications defined in current aerospace standards with an equivalent conducted-susceptibility (CS) test procedure based on bulk current injection (BCI) up to 500 MHz. Statistics is used to overcome the lack of knowledge about uncontrolled or uncertain setup parameters, with particular reference to the common-mode impedance of equipment. The BCI test level is properly investigated so as to ensure correlation of the currents injected into the equipment under test via CS and RS. In particular, an over-testing probability quantifies the severity of the BCI test with respect to the RS test.
Cenciani de Souza, Camila Prado; Aparecida de Abreu, Cleide; Coscione, Aline Renée; Alberto de Andrade, Cristiano; Teixeira, Luiz Antonio Junqueira; Consolini, Flavia
2018-01-01
Rapid, accurate, and low-cost alternative analytical methods for micronutrient quantification in fertilizers are fundamental in QC. The purpose of this study was to evaluate whether zinc (Zn) and copper (Cu) content in mineral fertilizers and industrial by-products determined by the alternative methods USEPA 3051a, 10% HCl, and 10% H2SO4 are statistically equivalent to the standard method, consisting of hot-plate digestion using concentrated HCl. The commercially marketed Zn and Cu sources in Brazil consisted of oxides, carbonate, and sulfate fertilizers and by-products consisting of galvanizing ash, galvanizing sludge, brass ash, and brass or scrap slag. The contents of the sources ranged from 15 to 82% for Zn and 10 to 45% for Cu. The Zn and Cu contents refer to the variation of the elements found in the different sources evaluated with the concentrated HCl method, as shown in Table 1. A protocol based on the following criteria was used for the statistical assessment of the methods: the F-test modified by Graybill, the t-test for the mean error, and linear correlation coefficient analysis. In terms of equivalence, 10% HCl extraction was equivalent to the standard method for Zn, and the results of the USEPA 3051a and 10% HCl methods indicated that these methods were equivalent for Cu. Therefore, these methods can be considered viable alternatives to the standard method for the determination of Cu and Zn in mineral fertilizers and industrial by-products, pending future research for their complete validation.
Calhelha, Ricardo C; Martínez, Mireia A; Prieto, M A; Ferreira, Isabel C F R
2017-10-23
The development of convenient tools for describing and quantifying the effects of standard and novel therapeutic agents is essential for the research community, to perform more precise evaluations. Although mathematical models and quantification criteria have been exchanged in the last decade between different fields of study, there are relevant methodologies that lack proper mathematical descriptions and standard criteria to quantify their responses. Therefore, part of the relevant information that can be drawn from the experimental results obtained, and the quantification of its statistical reliability, is lost. Despite its relevance, there is no standard form of the in vitro endpoint tumor cell line assay (TCLA) that enables the evaluation of the cytotoxic dose-response effects of anti-tumor drugs. Analyzing all the specific problems associated with the diverse nature of the available TCLAs is unfeasible. However, since most TCLAs share the main objectives and similar operative requirements, we have chosen the sulforhodamine B (SRB) colorimetric assay for cytotoxicity screening of tumor cell lines as an experimental case study. In this work, the common biological and practical non-linear dose-response mathematical models are tested against experimental data and, following several statistical analyses, the model based on the Weibull distribution was confirmed as a convenient approximation to test the cytotoxic effectiveness of anti-tumor compounds. Then, the advantages and disadvantages of all the different parametric criteria derived from the model, which enable the quantification of the dose-response drug effects, are extensively discussed. Therefore, a model and standard criteria for easily performing comparisons between different compounds are established.
The advantages include a simple application, provision of parametric estimations that characterize the response as standard criteria, economization of experimental effort and enabling rigorous comparisons among the effects of different compounds and experimental approaches. In all experimental data fitted, the calculated parameters were always statistically significant, the equations proved to be consistent and the correlation coefficient of determination was, in most of the cases, higher than 0.98.
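One common parameterization of a Weibull dose-response model writes the response so that one parameter is directly the dose giving half the maximal effect; whether this matches the authors' exact form is an assumption, but it illustrates the kind of interpretable parametric criteria discussed:

```python
import math

def weibull_response(dose, K, m, a):
    """Cytotoxic response under a Weibull dose-response model.
    K = maximal response, m = dose giving K/2 (an EC50-like criterion),
    a = shape parameter controlling the steepness of the curve."""
    return K * (1.0 - math.exp(-math.log(2.0) * (dose / m) ** a))

# At dose = m the response is exactly half-maximal, by construction
half = weibull_response(dose=10.0, K=1.0, m=10.0, a=2.0)
```

Parameterizations like this are what make rigorous between-compound comparisons easy: each fitted parameter (maximal effect, half-maximal dose, steepness) is itself a standard criterion.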
Battery Calendar Life Estimator Manual Modeling and Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jon P. Christophersen; Ira Bloom; Ed Thomas
2012-10-01
The Battery Life Estimator (BLE) Manual has been prepared to assist developers in their efforts to estimate the calendar life of advanced batteries for automotive applications. Testing requirements and procedures are defined by the various manuals previously published under the United States Advanced Battery Consortium (USABC). The purpose of this manual is to describe and standardize a method for estimating calendar life based on statistical models and degradation data acquired from typical USABC battery testing.
Battery Life Estimator Manual Linear Modeling and Simulation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jon P. Christophersen; Ira Bloom; Ed Thomas
2009-08-01
The Battery Life Estimator (BLE) Manual has been prepared to assist developers in their efforts to estimate the calendar life of advanced batteries for automotive applications. Testing requirements and procedures are defined by the various manuals previously published under the United States Advanced Battery Consortium (USABC). The purpose of this manual is to describe and standardize a method for estimating calendar life based on statistical models and degradation data acquired from typical USABC battery testing.
Heart Rate Variability Dynamics for the Prognosis of Cardiovascular Risk
Ramirez-Villegas, Juan F.; Lam-Espinosa, Eric; Ramirez-Moreno, David F.; Calvo-Echeverry, Paulo C.; Agredo-Rodriguez, Wilfredo
2011-01-01
Statistical, spectral, multi-resolution and non-linear methods were applied to heart rate variability (HRV) series linked with classification schemes for the prognosis of cardiovascular risk. A total of 90 HRV records were analyzed: 45 from healthy subjects and 45 from cardiovascular risk patients. A total of 52 features from all the analysis methods were evaluated using standard two-sample Kolmogorov-Smirnov test (KS-test). The results of the statistical procedure provided input to multi-layer perceptron (MLP) neural networks, radial basis function (RBF) neural networks and support vector machines (SVM) for data classification. These schemes showed high performances with both training and test sets and many combinations of features (with a maximum accuracy of 96.67%). Additionally, there was a strong consideration for breathing frequency as a relevant feature in the HRV analysis. PMID:21386966
An instrument to assess the statistical intensity of medical research papers.
Nieminen, Pentti; Virtanen, Jorma I; Vähänikkilä, Hannu
2017-01-01
There is widespread evidence that statistical methods play an important role in original research articles, especially in medical research. The evaluation of statistical methods and reporting in journals suffers from a lack of standardized methods for assessing the use of statistics. The objective of this study was to develop and evaluate an instrument to assess the statistical intensity of research articles in a standardized way. A checklist-type measurement scale was developed by selecting and refining items from previous reports about the statistical contents of medical journal articles and from published guidelines for statistical reporting. A total of 840 original medical research articles that were published between 2007 and 2015 in 16 journals were evaluated to test the scoring instrument. The total sum of all items was used to assess the intensity between sub-fields and journals. Inter-rater agreement was examined using a random sample of 40 articles. Four raters read and evaluated the selected articles using the developed instrument. The scale consisted of 66 items. The total summary score adequately discriminated between research articles according to their study design characteristics. The new instrument could also discriminate between journals according to their statistical intensity. The inter-observer agreement measured by the ICC was 0.88 between all four raters. Individual item analysis showed very high agreement between the rater pairs, with percentage agreement ranging from 91.7% to 95.2%. A reliable and applicable instrument for evaluating the statistical intensity of research papers was developed. It is a helpful tool for comparing the statistical intensity between sub-fields and journals. The novel instrument may be applied in manuscript peer review to identify papers in need of additional statistical review.
Dragunsky, Eugenia; Nomura, Tatsuji; Karpinski, Kazimir; Furesz, John; Wood, David J.; Pervikov, Yuri; Abe, Shinobu; Kurata, Takeshi; Vanloocke, Olivier; Karganova, Galina; Taffs, Rolf; Heath, Alan; Ivshina, Anna; Levenbook, Inessa
2003-01-01
OBJECTIVE: Extensive WHO collaborative studies were performed to evaluate the suitability of transgenic mice susceptible to poliovirus (TgPVR mice, strain 21, bred and provided by the Central Institute for Experimental Animals, Japan) as an alternative to monkeys in the neurovirulence test (NVT) of oral poliovirus vaccine (OPV). METHODS: Nine laboratories participated in the collaborative study on testing neurovirulence of 94 preparations of OPV and vaccine derivatives of all three serotypes in TgPVR21 mice. FINDINGS: Statistical analysis of the data demonstrated that the TgPVR21 mouse NVT was of comparable sensitivity and reproducibility to the conventional WHO NVT in simians. A statistical model for acceptance/rejection of OPV lots in the mouse test was developed, validated, and shown to be suitable for all three vaccine types. The assessment of the transgenic mouse NVT is based on clinical evaluation of paralysed mice. Unlike the monkey NVT, histological examination of central nervous system tissue of each mouse offered no advantage over careful and detailed clinical observation. CONCLUSIONS: Based on data from the collaborative studies the WHO Expert Committee for Biological Standardization approved the mouse NVT as an alternative to the monkey test for all three OPV types and defined a standard implementation process for laboratories that wish to use the test. This represents the first successful introduction of transgenic animals into control of biologicals. PMID:12764491
MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data
Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong
2015-01-01
Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill this gap, we develop a new method, MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situations, the involved parameter is very close to the boundary of the parameter space, so the standard large-sample approximation is not suitable for evaluating the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control the false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201
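The boundary issue described is a classic one: under the null, the LRT statistic often behaves like a 50:50 mixture of a point mass at zero and a chi-square distribution with 1 df. MAFsnp fits a more flexible two-parameter mixture; the sketch below implements only the textbook 50:50 case, using P(χ²₁ > t) = erfc(√(t/2)) so that only the standard library is needed:

```python
import math

def boundary_lrt_pvalue(t):
    """P-value for an LRT statistic when the null parameter sits on the
    boundary of the parameter space: the null distribution is a 50:50
    mixture of a point mass at 0 and a chi-square with 1 df."""
    if t <= 0.0:
        return 1.0
    # P(chi2_1 > t) = erfc(sqrt(t / 2)); the mixture halves it
    return 0.5 * math.erfc(math.sqrt(t / 2.0))

p = boundary_lrt_pvalue(3.84)  # 3.84 is the chi2_1 cutoff for alpha = 0.05
```

Using the plain chi-square reference distribution here would double every p-value, which is exactly the kind of miscalibration a fitted mixture avoids.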
Bayesian inference for psychology. Part II: Example applications with JASP.
Wagenmakers, Eric-Jan; Love, Jonathon; Marsman, Maarten; Jamil, Tahira; Ly, Alexander; Verhagen, Josine; Selker, Ravi; Gronau, Quentin F; Dropmann, Damian; Boutin, Bruno; Meerhoff, Frans; Knight, Patrick; Raj, Akash; van Kesteren, Erik-Jan; van Doorn, Johnny; Šmíra, Martin; Epskamp, Sacha; Etz, Alexander; Matzke, Dora; de Jong, Tim; van den Bergh, Don; Sarafoglou, Alexandra; Steingroever, Helen; Derks, Koen; Rouder, Jeffrey N; Morey, Richard D
2018-02-01
Bayesian hypothesis testing presents an attractive alternative to p value hypothesis testing. Part I of this series outlined several advantages of Bayesian hypothesis testing, including the ability to quantify evidence and the ability to monitor and update this evidence as data come in, without the need to know the intention with which the data were collected. Despite these and other practical advantages, Bayesian hypothesis tests are still reported relatively rarely. An important impediment to the widespread adoption of Bayesian tests is arguably the lack of user-friendly software for the run-of-the-mill statistical problems that confront psychologists for the analysis of almost every experiment: the t-test, ANOVA, correlation, regression, and contingency tables. In Part II of this series we introduce JASP ( http://www.jasp-stats.org ), an open-source, cross-platform, user-friendly graphical software package that allows users to carry out Bayesian hypothesis tests for standard statistical problems. JASP is based in part on the Bayesian analyses implemented in Morey and Rouder's BayesFactor package for R. Armed with JASP, the practical advantages of Bayesian hypothesis testing are only a mouse click away.
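For readers who want a feel for Bayes factors without installing JASP or BayesFactor, the sketch below uses the coarse BIC approximation BF10 ≈ exp((BIC0 − BIC1)/2) for a two-sample mean comparison. Note this is not what JASP computes (its t-tests use default priors from the BayesFactor package); it is only a minimal, assumption-laden illustration with made-up data.

```python
import math
import statistics

def bic_bayes_factor_ttest(x, y):
    """Approximate BF10 for a two-sample mean difference using the
    BIC approximation: BF10 ~ exp((BIC_null - BIC_alt) / 2)."""
    n = len(x) + len(y)
    pooled = list(x) + list(y)
    grand = statistics.fmean(pooled)
    rss0 = sum((v - grand) ** 2 for v in pooled)        # null: one common mean
    mx, my = statistics.fmean(x), statistics.fmean(y)
    rss1 = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    bic0 = n * math.log(rss0 / n) + 1 * math.log(n)     # 1 mean parameter
    bic1 = n * math.log(rss1 / n) + 2 * math.log(n)     # 2 mean parameters
    return math.exp((bic0 - bic1) / 2.0)

bf_separated = bic_bayes_factor_ttest([1.0, 1.2, 0.9, 1.1], [3.0, 3.2, 2.9, 3.1])
bf_identical = bic_bayes_factor_ttest([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
```

BF10 > 1 favors the two-mean model; BF10 < 1 favors the common-mean (null) model, illustrating how evidence can point either way rather than only "rejecting".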
Laslett, Mark; McDonald, Barry; Tropp, Hans; Aprill, Charles N; Öberg, Birgitta
2005-01-01
Background The tissue origin of low back pain (LBP) or referred lower extremity symptoms (LES) may be identified in about 70% of cases using advanced imaging, discography and facet or sacroiliac joint blocks. These techniques are invasive and availability varies. A clinical examination is non-invasive and widely available but its validity is questioned. Diagnostic studies usually examine single tests in relation to single reference standards, yet in clinical practice, clinicians use multiple tests and select from a range of possible diagnoses. There is a need for studies that evaluate the diagnostic performance of clinical diagnoses against available reference standards. Methods We compared blinded clinical diagnoses with diagnoses based on available reference standards for known causes of LBP or LES such as discography, facet, sacroiliac or hip joint blocks, epidural injections, advanced imaging studies or any combination of these tests. A prospective, blinded validity design was employed. Physiotherapists examined consecutive patients with chronic lumbopelvic pain and/or referred LES scheduled to receive the reference standard examinations. When diagnoses were in complete agreement regardless of complexity, "exact" agreement was recorded. When the clinical diagnosis was included within the reference standard diagnoses, "clinical agreement" was recorded. The proportional chance criterion (PCC) statistic was used to estimate agreement on multiple diagnostic possibilities because it accounts for the prevalence of individual categories in the sample. The kappa statistic was used to estimate agreement on six pathoanatomic diagnoses. Results In a sample of chronic LBP patients (n = 216) with high levels of disability and distress, 67% received a patho-anatomic diagnosis based on available reference standards, and 10% had more than one tissue origin of pain identified. For 27 diagnostic categories and combinations, chance clinical agreement (PCC) was estimated at 13%. 
"Exact" agreement between clinical and reference standard diagnoses was 32% and "clinical agreement" 51%. For six pathoanatomic categories (disc, facet joint, sacroiliac joint, hip joint, nerve root and spinal stenosis), PCC was 33% with actual agreement 56%. There was no overlap of 95% confidence intervals on any comparison. Diagnostic agreement on the six most common patho-anatomic categories produced a kappa of 0.31. Conclusion Clinical diagnoses agree with reference standards diagnoses more often than chance. Using available reference standards, most patients can have a tissue source of pain identified. PMID:15943873
On Statistical Approaches for Demonstrating Analytical Similarity in the Presence of Correlation.
Yang, Harry; Novick, Steven; Burdick, Richard K
Analytical similarity is the foundation for demonstration of biosimilarity between a proposed product and a reference product. For this assessment, the U.S. Food and Drug Administration (FDA) currently recommends a tiered system in which quality attributes are categorized into three tiers commensurate with their risk, and approaches of varying statistical rigor are subsequently used for the three tiers of quality attributes. Key to the analyses of Tier 1 and Tier 2 quality attributes is the establishment of an equivalence acceptance criterion and a quality range. For particular licensure applications, the FDA has provided advice on statistical methods for demonstration of analytical similarity. For example, for Tier 1 assessment, an equivalence test can be used based on an equivalence margin of 1.5σR, where σR is the reference product variability estimated by the sample standard deviation SR from a sample of reference lots. The quality range for demonstrating Tier 2 analytical similarity is of the form X̄R ± K × σR, where the constant K is appropriately justified. To demonstrate Tier 2 analytical similarity, a large percentage (e.g., 90%) of test product lots must fall within the quality range. In this paper, through both theoretical derivations and simulations, we show that when the reference drug product lots are correlated, the sample standard deviation SR underestimates the true reference product variability σR. As a result, substituting SR for σR in the Tier 1 equivalence acceptance criterion and the Tier 2 quality range inappropriately reduces the statistical power and the ability to declare analytical similarity. Also explored is the impact of correlation among drug product lots on Type I error rate and power. Three methods based on generalized pivotal quantities are introduced, and their performance is compared against a two one-sided tests (TOST) approach. Finally, strategies to mitigate the risk of correlation among the reference product lots are discussed. 
A biosimilar is a generic version of the original biological drug product. A key component of a biosimilar development is the demonstration of analytical similarity between the biosimilar and the reference product. Such demonstration relies on application of statistical methods to establish a similarity margin and appropriate test for equivalence between the two products. This paper discusses statistical issues with demonstration of analytical similarity and provides alternate approaches to potentially mitigate these problems. © PDA, Inc. 2016.
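The tiered criteria described above are straightforward to express in code. The sketch below implements the naive versions, taking S_R at face value; as the paper argues, this is exactly what goes wrong when reference lots are correlated, since S_R then underestimates σR. The data and K = 3.0 are assumed for illustration only.

```python
import statistics

def tier1_margin(reference_lots):
    """Tier 1 equivalence margin of the form 1.5 * S_R, with S_R the
    sample standard deviation of the reference lots."""
    return 1.5 * statistics.stdev(reference_lots)

def tier2_quality_range(reference_lots, k):
    """Tier 2 quality range X_bar_R +/- K * S_R (K must be justified)."""
    m = statistics.fmean(reference_lots)
    s = statistics.stdev(reference_lots)
    return (m - k * s, m + k * s)

def tier2_pass(test_lots, reference_lots, k, required_fraction=0.9):
    """True if the required fraction of test lots falls inside the range."""
    lo, hi = tier2_quality_range(reference_lots, k)
    inside = sum(lo <= v <= hi for v in test_lots) / len(test_lots)
    return inside >= required_fraction

reference = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0]   # made-up attribute values
margin = tier1_margin(reference)
ok = tier2_pass([10.0, 11.0, 9.0, 10.0, 10.0, 12.0, 9.0, 11.0, 10.0, 8.0],
                reference, k=3.0)
```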
Potential errors and misuse of statistics in studies on leakage in endodontics.
Lucena, C; Lopez, J M; Pulgar, R; Abalos, C; Valderrama, M J
2013-04-01
To assess the quality of the statistical methodology used in studies of leakage in Endodontics, and to compare the results found using appropriate versus inappropriate inferential statistical methods. The search strategy used the descriptors 'root filling', 'microleakage', 'dye penetration', 'dye leakage', 'polymicrobial leakage' and 'fluid filtration' for the time interval 2001-2010 in journals within the categories 'Dentistry, Oral Surgery and Medicine' and 'Materials Science, Biomaterials' of the Journal Citation Report. All retrieved articles were reviewed to find potential pitfalls in statistical methodology that may be encountered during study design, data management or data analysis. The database included 209 papers. In all the studies reviewed, the statistical methods used were appropriate for the category attributed to the outcome variable, but in 41% of the cases the chi-square test or parametric methods were subsequently applied inappropriately. In 2% of the papers, no statistical test was used. In 99% of cases, a statistically 'significant' or 'not significant' effect was reported as a main finding, whilst only 1% also presented an estimation of the magnitude of the effect. When the appropriate statistical methods were applied in the studies with originally inappropriate data analysis, the conclusions changed in 19% of the cases. Statistical deficiencies in leakage studies may affect their results and interpretation and might be one of the reasons for the poor agreement amongst the reported findings. Therefore, more effort should be made to standardize statistical methodology. © 2012 International Endodontic Journal.
SOCR: Statistics Online Computational Resource
Dinov, Ivo D.
2011-01-01
The need for hands-on computer laboratory experience in undergraduate and graduate statistics education has been firmly established in the past decade. As a result a number of attempts have been undertaken to develop novel approaches for problem-driven statistical thinking, data analysis and result interpretation. In this paper we describe an integrated educational web-based framework for: interactive distribution modeling, virtual online probability experimentation, statistical data analysis, visualization and integration. Following years of experience in statistical teaching at all college levels using established licensed statistical software packages, like STATA, S-PLUS, R, SPSS, SAS, Systat, etc., we have attempted to engineer a new statistics education environment, the Statistics Online Computational Resource (SOCR). This resource performs many of the standard types of statistical analysis, much like other classical tools. In addition, it is designed in a plug-in object-oriented architecture and is completely platform independent, web-based, interactive, extensible and secure. Over the past 4 years we have tested, fine-tuned and reanalyzed the SOCR framework in many of our undergraduate and graduate probability and statistics courses and have evidence that SOCR resources build students' intuition and enhance their learning. PMID:21451741
NASA Astrophysics Data System (ADS)
Gao, Jike
2018-01-01
Using literature review, instrument measurement, questionnaires and mathematical statistics, this paper analyzed the current situation of mass sports in the Tibetan plateau areas of Gansu Province. Taking experimentally measured data on air pollutants and meteorological indices in these Tibetan areas as its foundation, and comparing them against the relevant national standards and exercise science, the statistical analysis is intended to provide people of the Tibetan plateau of Gansu Province who participate in physical exercise with scientific methods and appropriate times for exercising.
Effects of Simplifying Choice Tasks on Estimates of Taste Heterogeneity in Stated-Choice Surveys
Johnson, F. Reed; Ozdemir, Semra; Phillips, Kathryn A
2011-01-01
Researchers usually employ orthogonal arrays or D-optimal designs with little or no attribute overlap in stated-choice surveys. The challenge is to balance statistical efficiency and respondent burden to minimize the overall error in the survey responses. This study examined whether simplifying the choice task, by using a design with more overlap, provides advantages over standard minimum-overlap methods. We administered two designs for eliciting HIV test preferences to split samples. Surveys were undertaken at four HIV testing locations in San Francisco, California. Personal characteristics had different effects on willingness to pay for the two treatments, and gains in statistical efficiency in the minimal-overlap version more than compensated for possible imprecision from increased measurement error. PMID:19880234
Comparison of a novel fixation device with standard suturing methods for spinal cord stimulators.
Bowman, Richard G; Caraway, David; Bentley, Ishmael
2013-01-01
Spinal cord stimulation is a well-established treatment for chronic neuropathic pain of the trunk or limbs. Currently, the standard method of fixation is to affix the leads of the neuromodulation device to soft tissue, fascia or ligament, through the use of manually tied general sutures. A novel semiautomated device is proposed that may be advantageous to the current standard. Comparison testing in an excised caprine spine and simulated bench top model was performed. Three tests were performed: 1) perpendicular pull from fascia of caprine spine; 2) axial pull from fascia of caprine spine; and 3) axial pull from Mylar film. Six samples of each configuration were tested for each scenario. Standard 2-0 Ethibond was compared with a novel semiautomated device (Anulex fiXate). Upon completion of testing, statistical analysis was performed for each scenario. For perpendicular pull in the caprine spine, the failure load for standard suture was 8.95 lbs with a standard deviation of 1.39, whereas for fiXate the load was 15.93 lbs with a standard deviation of 2.09. For axial pull in the caprine spine, the failure load for standard suture was 6.79 lbs with a standard deviation of 1.55, whereas for fiXate the load was 12.31 lbs with a standard deviation of 4.26. For axial pull in Mylar film, the failure load for standard suture was 10.87 lbs with a standard deviation of 1.56, whereas for fiXate the load was 19.54 lbs with a standard deviation of 2.24. These data suggest a novel semiautomated device offers a method of fixation that may be utilized in lieu of standard suturing methods as a means of securing neuromodulation devices. Data suggest the novel semiautomated device in fact may provide a more secure fixation than standard suturing methods. © 2012 International Neuromodulation Society.
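The abstract reports only group means and standard deviations, which is enough for a quick back-of-the-envelope comparison. The abstract does not say which test the authors used; as one standard option, Welch's t statistic can be computed directly from such summaries. A sketch using the perpendicular-pull numbers above:

```python
import math

def welch_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Welch's t statistic and approximate (Welch-Satterthwaite) degrees of
    freedom computed from group means, standard deviations, and sizes."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = (m2 - m1) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Perpendicular-pull summaries reported above:
# suture 8.95 lbs (SD 1.39) vs. fiXate 15.93 lbs (SD 2.09), n = 6 per group
t_stat, df = welch_t_from_summary(8.95, 1.39, 6, 15.93, 2.09, 6)
```

The resulting t of roughly 6.8 on about 8.7 degrees of freedom indicates a difference far larger than the within-group variability.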
ERIC Educational Resources Information Center
Zientek, Linda; Nimon, Kim; Hammack-Brown, Bryn
2016-01-01
Purpose: Among the gold standards in human resource development (HRD) research are studies that test theoretically developed hypotheses and use experimental designs. A somewhat typical experimental design would involve collecting pretest and posttest data on individuals assigned to a control or experimental group. Data from such a design that…
USDA-ARS's Scientific Manuscript database
Comparing performance of a large number of accessions simultaneously is not always possible. Typically, only subsets of all accessions are tested in separate trials with only some (or none) of the accessions overlapping between subsets. Using standard statistical approaches to combine data from such...
ERIC Educational Resources Information Center
Schachter, Ron
2010-01-01
There are plenty of statistics available for measuring the performance, potential and problems of school districts, from standardized test scores to the number of students eligible for free or reduced-price lunch. Last June, another metric came into sharper focus when the U.S. Census Bureau released its latest state-by-state data on per-pupil…
Teacher Technology Acceptance and Usage for the Middle School Classroom
ERIC Educational Resources Information Center
Stone, Wilton, Jr.
2014-01-01
According to the U.S. Department of Education National Center for Education Statistics, students in the United States routinely perform poorly on international assessments. This study was focused specifically on the problem of the decrease in the number of middle school students meeting the requirements for one state's standardized tests for…
Location tests for biomarker studies: a comparison using simulations for the two-sample case.
Scheinhardt, M O; Ziegler, A
2013-01-01
Gene, protein, or metabolite expression levels are often non-normally distributed, heavy tailed and contain outliers. Standard statistical approaches may fail as location tests in this situation. In three Monte-Carlo simulation studies, we aimed at comparing the type I error levels and empirical power of standard location tests and three adaptive tests [O'Gorman, Can J Stat 1997; 25: 269-279; Keselman et al., Brit J Math Stat Psychol 2007; 60: 267-293; Szymczak et al., Stat Med 2013; 32: 524-537] for a wide range of distributions. We simulated two-sample scenarios using the g-and-k-distribution family to systematically vary tail length and skewness with identical and varying variability between groups. All tests kept the type I error level when groups did not vary in their variability. The standard non-parametric U-test performed well in all simulated scenarios. It was outperformed by the two non-parametric adaptive methods in case of heavy tails or large skewness. Most tests did not keep the type I error level for skewed data in the case of heterogeneous variances. The standard U-test was a powerful and robust location test for most of the simulated scenarios except for very heavy tailed or heavy skewed data, and it is thus to be recommended except for these cases. The non-parametric adaptive tests were powerful for both normal and non-normal distributions under sample variance homogeneity. But when sample variances differed, they did not keep the type I error level. The parametric adaptive test lacks power for skewed and heavy tailed distributions.
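The U-test's robustness as a location test is easy to demonstrate by simulation. A minimal sketch, using an exponential null rather than the paper's g-and-k family and a normal approximation without tie correction: when both groups come from the same skewed distribution, the empirical type I error stays near the nominal 5%.

```python
import math
import random

def mann_whitney_z(x, y):
    """Normal-approximation z for the Mann-Whitney U test
    (assumes continuous data, so no tie correction)."""
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)
    rank = {v: i + 1 for i, v in enumerate(pooled)}   # no ties assumed
    r1 = sum(rank[v] for v in x)
    u = r1 - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mu) / sigma

# Empirical type I error under a skewed (exponential) null, identical groups
random.seed(1)
reps, n, rejections = 2000, 15, 0
for _ in range(reps):
    x = [random.expovariate(1.0) for _ in range(n)]
    y = [random.expovariate(1.0) for _ in range(n)]
    if abs(mann_whitney_z(x, y)) > 1.96:
        rejections += 1
type1 = rejections / reps
```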
Establishing Inter- and Intrarater Reliability for High-Stakes Testing Using Simulation.
Kardong-Edgren, Suzan; Oermann, Marilyn H; Rizzolo, Mary Anne; Odom-Maryon, Tamara
This article reports a standardized training method developed to establish the inter- and intrarater reliability of a group of raters for high-stakes testing. Simulation is used increasingly for high-stakes testing, but without research into the development of inter- and intrarater reliability for raters. Eleven raters were trained using a standardized methodology. Raters scored 28 student videos over a six-week period. Raters then rescored all videos over a two-day period to establish both intra- and interrater reliability. One rater demonstrated poor intrarater reliability; a second rater failed all students. Kappa statistics improved from the moderate to substantial agreement range with the exclusion of the two outlier raters' scores. There may be faculty who, for different reasons, should not be included in high-stakes testing evaluations. All faculty are content experts, but not all are expert evaluators.
Leak Rate Quantification Method for Gas Pressure Seals with Controlled Pressure Differential
NASA Technical Reports Server (NTRS)
Daniels, Christopher C.; Braun, Minel J.; Oravec, Heather A.; Mather, Janice L.; Taylor, Shawn C.
2015-01-01
An enhancement to the pressure decay leak rate method with mass point analysis solved deficiencies in the standard method. By adding a control system, a constant gas pressure differential across the test article was maintained. As a result, the desired pressure condition was met at the onset of the test, and the mass leak rate and measurement uncertainty were computed in real-time. The data acquisition and control system were programmed to automatically stop when specified criteria were met. Typically, the test was stopped when a specified level of measurement uncertainty was attained. Using silicone O-ring test articles, the new method was compared with the standard method that permitted the downstream pressure to be non-constant atmospheric pressure. The two methods recorded comparable leak rates, but the new method recorded leak rates with significantly lower measurement uncertainty, statistical variance, and test duration. Utilizing this new method in leak rate quantification, projects will reduce cost and schedule, improve test results, and ease interpretation between data sets.
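The mass point analysis mentioned above amounts to converting each pressure sample to a gas mass with the ideal gas law and fitting a line to mass versus time. A minimal sketch with assumed values (1 L volume, air, 293 K, a synthetic 10 Pa/s decay), not the instrumented real-time implementation described in the abstract:

```python
def mass_leak_rate(times_s, pressures_pa, volume_m3, temp_k, molar_mass_kg_mol):
    """Mass-point analysis: convert each pressure sample to gas mass via the
    ideal gas law, m = P*V*M/(R*T), then take the least-squares slope of
    mass versus time. Returns a positive rate for decaying pressure."""
    R = 8.314462618  # J/(mol*K)
    masses = [p * volume_m3 * molar_mass_kg_mol / (R * temp_k)
              for p in pressures_pa]
    n = len(times_s)
    tbar = sum(times_s) / n
    mbar = sum(masses) / n
    slope = (sum((t - tbar) * (m - mbar) for t, m in zip(times_s, masses))
             / sum((t - tbar) ** 2 for t in times_s))
    return -slope

# Synthetic decay: 10 Pa/s drop in a 1 L cavity of air at 293 K (assumed values)
times = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
pressures = [101325.0 - 10.0 * t for t in times]
rate = mass_leak_rate(times, pressures, volume_m3=1e-3, temp_k=293.0,
                      molar_mass_kg_mol=0.02897)
```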
Schukken, Y H; Rauch, B J; Morelli, J
2013-04-01
The objective of this paper was to define standardized protocols for determining the efficacy of a postmilking teat disinfectant following experimental exposure of teats to both Staphylococcus aureus and Streptococcus agalactiae. The standardized protocols describe the selection of cows and herds and define the critical points in performing experimental exposure, performing bacterial culture, evaluating the culture results, and finally performing statistical analyses and reporting of the results. The protocols define both negative control and positive control trials. For negative control trials, the protocol states that an efficacy of reducing new intramammary infections (IMI) of at least 40% is required for a teat disinfectant to be considered effective. For positive control trials, noninferiority to a control disinfectant with a published efficacy of reducing new IMI of at least 70% is required. Sample sizes for both negative and positive control trials are calculated. Positive control trials are expected to require a large trial size. Statistical analysis methods are defined and, in the proposed methods, the rate of IMI may be analyzed using generalized linear mixed models. The efficacy of the test product can be evaluated while controlling for important covariates and confounders in the trial. Finally, standards for reporting are defined and reporting considerations are discussed. The use of the defined protocol is shown through presentation of the results of a recent trial of a test product against a negative control. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Wang, Chun; Zheng, Yi; Chang, Hua-Hua
2014-01-01
With the advent of web-based technology, online testing is becoming a mainstream mode in large-scale educational assessments. Most online tests are administered continuously in a testing window, which may pose test security problems because examinees who take the test earlier may share information with those who take the test later. Researchers have proposed various statistical indices to assess test security, and one of the most often used indices is the average test-overlap rate, which was further generalized to the item pooling index (Chang & Zhang, 2002, 2003). These indices, however, are all defined as means (that is, the expected proportion of common items among examinees), and they were originally proposed for computerized adaptive testing (CAT). Recently, multistage testing (MST) has become a popular alternative to CAT. The unique features of MST make it important to report not only the mean, but also the standard deviation (SD) of the test overlap rate, as we advocate in this paper. The standard deviation of the test overlap rate adds important information to the test security profile, because for the same mean, a large SD reflects that certain groups of examinees share more common items than other groups. In this study, we analytically derived the lower bounds of the SD under MST, with the results under CAT as a benchmark. It is shown that when the mean overlap rate is the same between MST and CAT, the SD of test overlap tends to be larger in MST. A simulation study was conducted to provide empirical evidence. We also compared the security of MST under the single-pool versus the multiple-pool designs; both analytical and simulation studies show that the non-overlapping multiple-pool design will slightly increase the security risk.
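The mean-versus-SD point is easy to see in a toy example. The sketch below assumes a deliberately simplified MST with two fixed routing paths that share half their items; examinees on the same path overlap completely while cross-path pairs overlap only 50%, so the mean overlap alone hides a large SD.

```python
import statistics
from itertools import combinations

def overlap_stats(test_forms, test_length):
    """Mean and SD of the pairwise test-overlap rate
    (shared items / test length) over all pairs of examinees."""
    rates = [len(set(a) & set(b)) / test_length
             for a, b in combinations(test_forms, 2)]
    return statistics.fmean(rates), statistics.pstdev(rates)

# Toy MST: two fixed 10-item paths sharing 5 items; 20 examinees per path
path_1 = list(range(0, 10))
path_2 = list(range(5, 15))
forms = [path_1] * 20 + [path_2] * 20
mean_rate, sd_rate = overlap_stats(forms, 10)
```

Here the mean overlap is about 0.74, but the SD is near its maximum of 0.25, reflecting exactly the grouped sharing the paper warns about.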
Influence of valproate on language functions in children with epilepsy.
Doo, Jin Woong; Kim, Soon Chul; Kim, Sun Jun
2018-01-01
The aim of the current study was to assess the influences of valproate (VPA) on the language functions in newly diagnosed pediatric patients with epilepsy. We reviewed medical records of 53 newly diagnosed patients with epilepsy, who were being treated with VPA monotherapy (n=53; 22 male patients and 31 female patients). The subjects underwent standardized language tests, at least twice, before and after the initiation of VPA. The standardized language tests used were The Test of Language Problem Solving Abilities, a Korean version of The Expressive/Receptive Language Function Test, and the Urimal Test of Articulation and Phonology. Since all the patients analyzed spoke Korean as their first language, we used Korean language tests to reduce the bias within the data. All the language parameters of the Test of Language Problem Solving Abilities slightly improved after the initiation of VPA in the 53 pediatric patients with epilepsy (mean age: 11.6±3.2 years), but only "prediction" was statistically significant (determining cause, 14.9±5.1 to 15.5±4.3; making inference, 16.1±5.8 to 16.9±5.6; prediction, 11.1±4.9 to 11.9±4.2; total score of TOPS, 42.0±14.4 to 44.2±12.5). The patients treated with VPA also exhibited a small extension in mean length of utterance in words (MLU-w) when responding, but this was not statistically significant (determining cause, 5.4±2.0 to 5.7±1.6; making inference, 5.8±2.2 to 6.0±1.8; prediction, 5.9±2.5 to 5.9±2.1; total, 5.7±2.1 to 5.9±1.7). The administration of VPA led to a slight, but not statistically significant, improvement in the receptive language function (range: 144.7±41.1 to 148.2±39.7). Finally, there were no statistically significant changes in the percentage of articulation performance after taking VPA. Therefore, our data suggested that VPA did not have a negative impact on language function, but rather slightly improved problem-solving abilities. Copyright © 2017 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bohrman, J.S.; Burg, J.R.; Elmore, E.
1988-01-01
Three laboratories participated in an interlaboratory study to evaluate the usefulness of the Chinese hamster V79 cell metabolic cooperation assay to predict the tumor-promoting activity of selected chemicals. Twenty-three chemicals of different chemical structures (phorbol esters, barbiturates, phenols, artificial sweeteners, alkanes, and peroxides) were chosen for testing based on in vivo promotion activities, as reported in the literature. Assay protocols and materials were standardized, and the chemicals were coded to facilitate unbiased evaluation. A chemical was tested only once in each laboratory, with one of the three laboratories testing only 15 out of 23 chemicals. Dunnett's test was used for statistical analysis. Chemicals were scored as positive (at least two concentration levels statistically different than control), equivocal (only one concentration statistically different), or negative. For the 15 chemicals tested in all three laboratories, there was complete agreement among the laboratories for nine chemicals. For the 23 chemicals tested in only two laboratories, there was agreement on 16 chemicals. With the exception of the peroxides and alkanes, the metabolic cooperation data were in general agreement with in vivo data. However, an overall evaluation of the V79 cell system for predicting in vivo promotion activity was difficult because of the organ specificity of certain chemicals and/or the limited number of adequately tested nonpromoting chemicals.
NASA Astrophysics Data System (ADS)
Reynolds, John; Sandstrom, Mary; Brown, Geoffrey; Warner, Kirstin; Phillips, Jason; Shelley, Timothy; Reyes, Jose; Hsu, Peter
2013-06-01
One of the first steps in establishing safe handling procedures for explosives is small-scale safety and thermal (SSST) testing. To better understand the response of improvised materials or HMEs to SSST testing, 18 HME materials were compared to 3 standard military explosives in a proficiency-type round robin study among five laboratories--2 DoD and 3 DOE--sponsored by DHS. The testing matrix was designed to address problems encountered with improvised materials--powder mixtures, liquid suspensions, partially wetted solids, immiscible liquids, and reactive materials. Over 30 issues have been identified that indicate standard test methods may require modification when applied to HMEs to derive the accurate sensitivity assessments needed for developing safe handling and storage practices. This presentation will discuss experimental difficulties encountered when testing these problematic samples, show inter-laboratory testing results, show some statistical interpretation of the results, and highlight some of the testing issues. Some of the work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. LLNL-ABS-617519 (721812).
Rank score and permutation testing alternatives for regression quantile estimates
Cade, B.S.; Richards, J.D.; Mielke, P.W.
2006-01-01
Performance of quantile rank score tests used for hypothesis testing and constructing confidence intervals for linear quantile regression estimates (0 ≤ τ ≤ 1) was evaluated by simulation for models with p = 2 and 6 predictors, moderate collinearity among predictors, homogeneous and heterogeneous errors, small to moderate samples (n = 20–300), and central to upper quantiles (0.50–0.99). Test statistics evaluated were the conventional quantile rank score T statistic distributed as a χ2 random variable with q degrees of freedom (where q parameters are constrained by H0) and an F statistic with its sampling distribution approximated by permutation. The permutation F-test maintained better Type I errors than the T-test for homogeneous error models with smaller n and more extreme quantiles τ. An F distributional approximation of the F statistic provided some improvements in Type I errors over the T-test for models with > 2 parameters, smaller n, and more extreme quantiles but not as much improvement as the permutation approximation. Both rank score tests required weighting to maintain correct Type I errors when heterogeneity under the alternative model increased to 5 standard deviations across the domain of X. A double permutation procedure was developed to provide valid Type I errors for the permutation F-test when null models were forced through the origin. Power was similar for conditions where both T- and F-tests maintained correct Type I errors but the F-test provided some power at smaller n and extreme quantiles when the T-test had no power because of excessively conservative Type I errors. When the double permutation scheme was required for the permutation F-test to maintain valid Type I errors, power was less than for the T-test with decreasing sample size and increasing quantiles. 
Confidence intervals on parameters and tolerance intervals for future predictions were constructed based on test inversion for an example application relating trout densities to stream channel width:depth.
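A permutation approach to quantile inference can be illustrated with a much simpler cousin of the paper's rank score F-test: a permutation test for a two-group difference in the τth quantile, using the drop in pinball (check) loss as the statistic. This is an illustrative analogue with made-up data, not the authors' procedure.

```python
import random

def pinball_loss(values, q, tau):
    """Check (pinball) loss of a candidate quantile q at level tau."""
    return sum((tau if v >= q else tau - 1.0) * (v - q) for v in values)

def sample_quantile(values, tau):
    s = sorted(values)
    return s[min(int(tau * len(s)), len(s) - 1)]

def permutation_quantile_test(x, y, tau=0.75, n_perm=499, seed=0):
    """Permutation p-value for a group difference in the tau-th quantile;
    the statistic is the drop in pinball loss from fitting group-specific
    quantiles instead of one pooled quantile."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    full = pinball_loss(pooled, sample_quantile(pooled, tau), tau)

    def loss_drop(a, b):
        return full - (pinball_loss(a, sample_quantile(a, tau), tau)
                       + pinball_loss(b, sample_quantile(b, tau), tau))

    observed = loss_drop(x, y)
    count = sum(1 for _ in range(n_perm)
                if (lambda p: loss_drop(p[:len(x)], p[len(x):]))(
                    rng.sample(pooled, len(pooled))) >= observed)
    return (count + 1) / (n_perm + 1)

# Shifted groups: the 0.75 quantile clearly differs between the two samples
p_shift = permutation_quantile_test([float(i) for i in range(20)],
                                    [float(i) + 10.0 for i in range(20)])
```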
Corbel, Michael J; Das, Rose Gaines; Lei, Dianliang; Xing, Dorothy K L; Horiuchi, Yoshinobu; Dobbelaer, Roland
2008-04-07
This report reflects the discussion and conclusions of a WHO group of experts from National Regulatory Authorities (NRAs), National Control Laboratories (NCLs), vaccine industries and other relevant institutions involved in standardization and control of diphtheria, tetanus and pertussis (DTP) vaccines, held on 20-21 July 2006 and 28-30 March 2007 in Geneva, Switzerland, for the revision of the WHO Manual for quality control of DTP vaccines. Taking into account recent developments and standardization in quality control methods and the revision of the WHO recommendations for D, T and P vaccines, a need to update the manual was recognized. In these two meetings, the current situation of quality control methods for potency, safety and identity testing of DTP vaccines, and the statistical analysis of data, were reviewed. Based on the WHO recommendations and recent validation of testing methods, the content of the current manual was reviewed and discussed. The group agreed that the principles to be observed in selecting methods included identifying those critical for assuring safety, efficacy and quality and those consistent with WHO recommendations/requirements. Methods that are well recognized but not yet included in the current Recommendations should also be taken into account, including in vivo and/or in vitro methods for determining potency, safety testing and identity. The statistical analysis of the data should be revised and updated. It was noted that mouse-based assays for toxoid potency were still quite widely used, and it was desirable to establish appropriate standards for these so that results could be related to the standard guinea pig assays. The working group met again to review the first drafts and to provide further suggestions and amendments to the contributions of the drafting groups. The revised manual was to be finalized and published by WHO.
Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario
2014-01-01
Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of the data, the type of variables, and the purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are reviewed using several medical examples. We present two clustering examples on ordinal variables, a more challenging variable type to analyze, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed with two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold-standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively; their specificity was comparable. Conclusion: using a clustering algorithm appropriate to the measurement scale of the variables in a study yields high performance. More generally, descriptive and inferential statistics, as well as the modeling approach, must be selected based on the scale of the variables. PMID:24672565
Conceptual and statistical problems associated with the use of diversity indices in ecology.
Barrantes, Gilbert; Sandoval, Luis
2009-09-01
Diversity indices, particularly the Shannon-Wiener index, have been used extensively to analyze patterns of diversity at different geographic and ecological scales. These indices have serious conceptual and statistical problems that make comparisons of species richness or species abundances across communities nearly impossible. No single statistical method retains all the information needed to answer even a simple question. However, multivariate analyses, such as cluster analysis or multiple regression, could be used instead of diversity indices. More complex multivariate analyses, such as Canonical Correspondence Analysis, provide valuable information on the environmental variables associated with the presence and abundance of species in a community. In addition, specific hypotheses about changes in species richness across localities, or changes in the abundance of one species or a group of species, can be tested with univariate, bivariate, and/or rarefaction statistical tests. The rarefaction method has proved robust for standardizing all samples to a common size. Even the simplest approach, reporting the number of species per taxonomic category, likely provides more information than a diversity index value.
Van Bockstaele, Femke; Janssens, Ann; Piette, Anne; Callewaert, Filip; Pede, Valerie; Offner, Fritz; Verhasselt, Bruno; Philippé, Jan
2006-07-15
ZAP-70 has been proposed as a surrogate marker for immunoglobulin heavy-chain variable region (IgV(H)) mutation status, which is known as a prognostic marker in B-cell chronic lymphocytic leukemia (CLL). The flow cytometric analysis of ZAP-70 suffers from difficulties in standardization and interpretation. We applied the Kolmogorov-Smirnov (KS) statistical test to make analysis more straightforward. We examined ZAP-70 expression by flow cytometry in 53 patients with CLL. Analysis was performed as initially described by Crespo et al. (New England J Med 2003; 348:1764-1775) and alternatively by application of the KS statistical test comparing T cells with B cells. Receiver-operating-characteristics (ROC)-curve analyses were performed to determine the optimal cut-off values for ZAP-70 measured by the two approaches. ZAP-70 protein expression was compared with ZAP-70 mRNA expression measured by a quantitative PCR (qPCR) and with the IgV(H) mutation status. Both flow cytometric analyses correlated well with the molecular technique and proved to be of equal value in predicting the IgV(H) mutation status. Applying the KS test is reproducible, simple, straightforward, and overcomes a number of difficulties encountered in the Crespo-method. The KS statistical test is an essential part of the software delivered with modern routine analytical flow cytometers and is well suited for analysis of ZAP-70 expression in CLL. (c) 2006 International Society for Analytical Cytology.
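The KS-based approach described in this abstract, comparing the ZAP-70 intensity distribution of T cells (internal positive reference) with that of B cells, can be sketched with SciPy's two-sample Kolmogorov-Smirnov test. The intensity arrays below are simulated, not real cytometry data:

```python
# Sketch of the KS approach: compare ZAP-70 fluorescence-intensity
# distributions of T cells and B cells with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
t_cells = rng.normal(loc=100.0, scale=15.0, size=5000)  # hypothetical T-cell intensities
b_cells = rng.normal(loc=60.0, scale=15.0, size=5000)   # hypothetical B-cell intensities

stat, p = ks_2samp(t_cells, b_cells)
# The KS D statistic (maximum distance between the two empirical CDFs)
# acts as the expression score: a small D means the B cells resemble the
# T cells (ZAP-70 positive); a large D means they differ.
print(f"D = {stat:.3f}, p = {p:.2e}")
```

A cut-off on D (chosen by ROC analysis against IgV(H) status, as in the study) would then classify each patient.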
Fatigue testing of metric bolts fitted with lip-type nuts
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dragoni, E.
This paper addresses the effect of the external shape of lip-type nuts on the fatigue strength of commercial M10 bolts loaded in tension. Evolving from the standard configuration, six nut geometries are compared, characterized by lips of different shape (cylindrical, tapered or both) and length. Testing and statistical treatment of the data are performed in accordance with a JSME standard involving 14 specimens for each geometry. Within the class of merely cylindrical lips, only limited advantages over the standard assembly are detected. In particular, the bolt strength remains mostly unaffected by lengthening of the lip beyond one third of the nut height. Conversely, tapering of the lip end so as to thin its wall around the entry section of the bolt results in substantial improvements. In this case, the strength increase is roughly proportional to the taper length. Adoption of a tapered lip covering two thirds of the nut height enhances the bolt strength by about one fourth with respect to the standard geometry.
Remittance spectroscopy of human skin in vivo.
Geyer, A; Vilser, W; Karte, K; Wollina, U
1996-08-01
The skin is an easily accessible organ to which non-invasive examination methods can be applied. Recently, spectroscopic methods have been introduced to characterize skin under physiological and pathological conditions. To examine the remittance spectroscopic properties of human skin in vivo and to clarify the influence of selected test conditions, a single-beam spectrometer MCS 410 (Carl Zeiss, Jena, Germany) was used. Remittance spectra were recorded in 35 volunteers over wavelengths from 362 nm to 780 nm. Individual remittance values and their standard deviations were obtained from 20 readings under standardized test conditions. The effect of pressure, rubbing, cooling, washing, greasing and degreasing on average remittance values was investigated. Statistical analysis was done with the paired Student's t-test and Fisher's test. Pressure increased remittance values over a wide range of wavelengths, peaking at 518 nm. Greasing and degreasing modified the spectral remittance at shorter wavelengths, peaking around 362 nm. Rubbing and cooling did not induce significant variations in the spectral remittance of skin. Spectral remittance provides an individual profile of human skin that may be influenced by pressure and greasing/degreasing. To establish standardized test conditions with a narrow range of intra-individual variation, these factors have to be kept constant.
NASA Astrophysics Data System (ADS)
Ghannadpour, Seyyed Saeed; Hezarkhani, Ardeshir
2016-03-01
The U-statistic method is one of the most important structural methods for separating anomaly from background. It takes the locations of samples into account, carries out the statistical analysis of the data without judging from a geochemical point of view, and tries to separate subpopulations and determine anomalous areas. In the present study, to apply the U-statistic method in three dimensions (3D), it is applied to the grades of two ideal test examples, taking the samples' Z values (elevation) into account. This is the first time the method has been applied in a 3D setting. To evaluate the performance of the 3D U-statistic method and to compare it with a non-structural method, threshold assessment based on the median and standard deviation (MSD method) is applied to the same two test examples. Results show that the samples flagged as anomalous by the U-statistic method are more regular and less dispersed than those flagged by the MSD method, so that, based on the locations of the anomalous samples, their denser areas can be delineated as promising zones. Moreover, at a threshold of U = 0, the total misclassification error of the U-statistic method is much smaller than that of the x̄ + n×s criterion. Finally, a 3D model of the two test examples, separating anomaly from background with the 3D U-statistic method, is provided. The source code of a software program, developed in the MATLAB programming language to perform the calculations of the 3D U-spatial statistic method, is additionally provided. This software is compatible with all geochemical varieties and can be used in similar exploration projects.
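The non-structural baseline used in this comparison, flagging any sample whose grade exceeds the mean plus a multiple of the standard deviation, is simple enough to sketch directly. The grades below are synthetic, not data from the paper:

```python
# Minimal sketch of MSD-style thresholding: samples whose grade exceeds
# mean + n*std are flagged as anomalous (no spatial structure is used,
# which is exactly the limitation the U-statistic method addresses).
import numpy as np

def msd_anomalies(grades: np.ndarray, n: float = 2.0) -> np.ndarray:
    """Return a boolean mask of samples above the x̄ + n·s threshold."""
    threshold = grades.mean() + n * grades.std()
    return grades > threshold

rng = np.random.default_rng(1)
background = rng.normal(50.0, 5.0, 200)   # synthetic background population
anomaly = rng.normal(90.0, 5.0, 10)       # synthetic anomalous subpopulation
grades = np.concatenate([background, anomaly])

mask = msd_anomalies(grades, n=2.0)
print(mask.sum(), "samples flagged as anomalous")
```

Because the threshold ignores where samples sit in space, flagged samples can be scattered; the abstract's point is that the spatially aware U-statistic yields more coherent anomalous zones.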
The repeatability of mean defect with size III and size V standard automated perimetry.
Wall, Michael; Doyle, Carrie K; Zamba, K D; Artes, Paul; Johnson, Chris A
2013-02-15
The mean defect (MD) of the visual field is a global statistical index used to monitor overall visual field change over time. Our goal was to investigate the relationship of MD and its variability for two clinically used strategies (Swedish Interactive Threshold Algorithm [SITA] standard size III and full threshold size V) in glaucoma patients and controls. We tested one eye, at random, for 46 glaucoma patients and 28 ocularly healthy subjects with Humphrey program 24-2 SITA standard for size III and full threshold for size V each five times over a 5-week period. The standard deviation of MD was regressed against the MD for the five repeated tests, and quantile regression was used to show the relationship of variability and MD. A Wilcoxon test was used to compare the standard deviations of the two testing methods following quantile regression. Both types of regression analysis showed increasing variability with increasing visual field damage. Quantile regression showed modestly smaller MD confidence limits. There was a 15% decrease in SD with size V in glaucoma patients (P = 0.10) and a 12% decrease in ocularly healthy subjects (P = 0.08). The repeatability of size V MD appears to be slightly better than size III SITA testing. When using MD to determine visual field progression, a change of 1.5 to 4 decibels (dB) is needed to be outside the normal 95% confidence limits, depending on the size of the stimulus and the amount of visual field damage.
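The first analysis described above, regressing the per-subject standard deviation of repeated MD measurements on mean MD to show variability rising with damage, can be sketched as follows. The study also used quantile regression; a simple least-squares fit stands in here, and all MD values are simulated rather than the study's data:

```python
# Regress SD of repeated MD measurements on mean MD (simulated data).
import numpy as np

rng = np.random.default_rng(7)
mean_md = rng.uniform(-25.0, 0.0, 74)                    # dB; worse = more negative
sd_md = 0.5 - 0.08 * mean_md + rng.normal(0.0, 0.2, 74)  # SD grows as MD worsens

slope, intercept = np.polyfit(mean_md, sd_md, 1)
# A negative slope means more damaged fields (more negative MD) show
# larger test-retest variability, matching the abstract's finding.
print(f"slope = {slope:.3f} dB of SD per dB of MD")
```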
Evaluation of the quality of generic polymethylmethacrylate intraocular lenses marketed in India.
Combe, R; Watkins, R; Brian, G
2001-04-01
To determine the quality of single-piece, all-polymethylmethacrylate (PMMA) intraocular lenses (IOLs) from eight generic manufacturers marketing their product in India. This assessment of quality was made with respect to compliance with international standards for the manufacture of IOLs, specifically those parameters most likely to affect patient postoperative visual acuity and the long-term biocompatibility of the implanted lens. Ten IOLs from each of eight manufacturers were purchased randomly from commercial retail outlets in India. Each IOL, in a masked fashion, had its physical dimensions, optical performance and cosmetic appearance assessed using the methods prescribed in ISO 11979-2 and 11979-3. Validation of manufacturing process controls was determined by statistical process control techniques. Four IOLs from each manufacturer were also tested for the presence of unpolymerized PMMA using gas chromatography. Only lenses from two IOL manufacturers complied with the optical and mechanical standards; all other manufacturers' lenses failed one or more of these tests. Intraocular lenses from only two producers met surface quality and bulk homogeneity standards. All others exhibited defects such as surface contamination and scratches, poor polishing, and chipped or rough positioning holes. Lenses from two producers exhibited high levels of methylmethacrylate monomer (MMA). Non-clinical-grade PMMA starting material may have been used in the manufacture of IOLs by some producers. Critical manufacturing defects occurred in the IOLs from five of the eight producers tested. Only one manufacturer's IOLs met all specifications and, on statistical analysis, demonstrated good manufacturing process control with respect to the properties tested. With the widespread acceptance of IOL implantation in developing countries such as India, it is essential that, in the rush to make this the norm, the quality of the implants used not be overlooked.
Orr, Richard H.; Schless, Arthur P.
1972-01-01
The standardized Document Delivery Tests (DDTs) developed earlier (Bulletin 56: 241-267, July 1968) were employed to assess the capability of ninety-two medical school libraries for meeting the document needs of biomedical researchers, and the capability of fifteen major resource libraries for filling interlibrary loan requests from biomedical libraries. The primary test data are summarized as statistics on the observed availability status of the 300-plus documents in the test samples, and as measures expressing capability as a function of the mean time that would be required for users to obtain test sample documents. A mathematical model is developed in which the virtual capability of a library, as seen by its users, equals the algebraic sum of the basic capability afforded by its holdings; the combined losses attributable to use of its collection, processing, relative inaccessibility, and housekeeping problems; and the gain realized by coupling with other resources (interlibrary borrowing). For a particular library, or group of libraries, empirical values for each of these variables can be calculated easily from the capability measures and the status statistics. Regression equations are derived that provide useful predictions of basic capability from collection size. The most important result of this work is that cost-effectiveness analyses can now be used as practical decision aids in managing a basic library service. A program of periodic surveys and further development of DDTs is recommended as appropriate for the Medical Library Association. PMID:5054305
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.
1996-01-01
As part of a continuing effort to re-engineer the wind tunnel testing process, a comprehensive data quality assurance program is being established at NASA Langley Research Center (LaRC). The ultimate goal of the program is the routine provision of tunnel-to-tunnel reproducibility with total uncertainty levels acceptable for test and evaluation of civilian transports. The operational elements for reaching such levels of reproducibility are: (1) statistical control, which provides long-term measurement uncertainty predictability and a base for continuous improvement; (2) measurement uncertainty prediction, which provides test designs that can meet data quality expectations within the system's predictable variation; and (3) national standards, which provide a means for resolving tunnel-to-tunnel differences. The paper presents the LaRC design for the program and discusses the process of implementation.
A new developmental toxicity test for pelagic fish using anchoveta (Engraulis ringens J.).
Llanos-Rivera, A; Castro, L R; Silva, J; Bay-Schmith, E
2009-07-01
A series of six 96-h static bioassays were performed to validate the use of anchoveta (Engraulis ringens) embryos as test organisms for ecotoxicological studies. The standardization protocol utilized potassium dichromate (K(2)Cr(2)O(7)) as a reference toxicant and egg mortality as the endpoint. The results indicated that the mean sensitivity of anchoveta embryos to potassium dichromate was 156.1 mg L(-1) (range: 131-185 mg L(-1)). The statistical data analysis showed high homogeneity in LC50 values among bioassays (variation coefficient = 11.02%). These results demonstrated that the protocol and handling procedures implemented for the anchoveta embryo bioassays comply with international standards for intra-laboratory precision. After secondary treatment, an effluent from a modern Kraft pulp mill was tested for E. ringens embryo toxicity, finding no significant differences from the controls.
Kindergarten Predictors of Math Learning Disability
Mazzocco, Michèle M. M.; Thompson, Richard E.
2009-01-01
The aim of the present study was to address how to effectively predict mathematics learning disability (MLD). Specifically, we addressed whether cognitive data obtained during kindergarten can effectively predict which children will have MLD in third grade, whether an abbreviated test battery could be as effective as a standard psychoeducational assessment at predicting MLD, and whether the abbreviated battery corresponded to the literature on MLD characteristics. Participants were 226 children who enrolled in a 4-year prospective longitudinal study during kindergarten. We administered measures of mathematics achievement, formal and informal mathematics ability, visual-spatial reasoning, and rapid automatized naming and examined which test scores and test items from kindergarten best predicted MLD at grades 2 and 3. Statistical models using standardized scores from the entire test battery correctly classified ~80–83 percent of the participants as having, or not having, MLD. Regression models using scores from only individual test items were less predictive than models containing the standard scores, except for models using a specific subset of test items that dealt with reading numerals, number constancy, magnitude judgments of one-digit numbers, or mental addition of one-digit numbers. These models were as accurate in predicting MLD as was the model including the entire set of standard scores from the battery of tests examined. Our findings indicate that it is possible to effectively predict which kindergartners are at risk for MLD, and thus the findings have implications for early screening of MLD. PMID:20084182
Phu, Jack; Bui, Bang V; Kalloniatis, Michael; Khuu, Sieu K
2018-03-01
The number of subjects needed to establish the normative limits for visual field (VF) testing is not known. Using bootstrap resampling, we determined whether the ground truth mean, distribution limits, and standard deviation (SD) could be approximated using different set size ( x ) levels, in order to provide guidance for the number of healthy subjects required to obtain robust VF normative data. We analyzed the 500 Humphrey Field Analyzer (HFA) SITA-Standard results of 116 healthy subjects and 100 HFA full threshold results of 100 psychophysically experienced healthy subjects. These VFs were resampled (bootstrapped) to determine mean sensitivity, distribution limits (5th and 95th percentiles), and SD for different ' x ' and numbers of resamples. We also used the VF results of 122 glaucoma patients to determine the performance of ground truth and bootstrapped results in identifying and quantifying VF defects. An x of 150 (for SITA-Standard) and 60 (for full threshold) produced bootstrapped descriptive statistics that were no longer different to the original distribution limits and SD. Removing outliers produced similar results. Differences between original and bootstrapped limits in detecting glaucomatous defects were minimized at x = 250. Ground truth statistics of VF sensitivities could be approximated using set sizes that are significantly smaller than the original cohort. Outlier removal facilitates the use of Gaussian statistics and does not significantly affect the distribution limits. We provide guidance for choosing the cohort size for different levels of error when performing normative comparisons with glaucoma patients.
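The bootstrap procedure this abstract describes, resampling subsets of size x and checking how well the resampled 5th/95th percentile limits approximate those of the full cohort, can be sketched as follows. The sensitivity values are simulated; the study used real HFA visual fields:

```python
# Bootstrap the normative distribution limits from subsets of size x
# and compare them with the full-pool ("ground truth") limits.
import numpy as np

rng = np.random.default_rng(42)
pool = rng.normal(30.0, 2.5, 500)            # simulated dB sensitivities
true_lo, true_hi = np.percentile(pool, [5, 95])

def bootstrap_limits(pool, x, n_resamples=1000):
    lows, highs = [], []
    for _ in range(n_resamples):
        sample = rng.choice(pool, size=x, replace=True)
        lo, hi = np.percentile(sample, [5, 95])
        lows.append(lo)
        highs.append(hi)
    return np.mean(lows), np.mean(highs)

for x in (30, 60, 150):
    lo, hi = bootstrap_limits(pool, x)
    print(f"x={x:4d}: 5th={lo:.2f} (true {true_lo:.2f}), "
          f"95th={hi:.2f} (true {true_hi:.2f})")
```

As the abstract reports, beyond some set size the bootstrapped limits become indistinguishable from the full-cohort limits, which is the basis for recommending a minimum cohort size.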
Exocrine Dysfunction Correlates with Endocrinal Impairment of Pancreas in Type 2 Diabetes Mellitus.
Prasanna Kumar, H R; Gowdappa, H Basavana; Hosmani, Tejashwi; Urs, Tejashri
2018-01-01
Diabetes mellitus (DM) is a chronic metabolic disorder that manifests as elevated blood sugar levels over a prolonged period. The pancreatic endocrine system is generally affected in diabetes, but abnormal exocrine function often manifests as well, owing to its proximity to the endocrine system. Fecal elastase-1 (FE-1) is considered an ideal biomarker of exocrine insufficiency of the pancreas. This study was conducted to assess exocrine dysfunction of the pancreas in patients with type 2 DM (T2DM) by measuring FE-1 levels, and to associate the level of hyperglycemia with exocrine pancreatic dysfunction. A prospective, cross-sectional comparative study was conducted on T2DM patients and healthy nondiabetic volunteers. FE-1 levels were measured using a commercial kit (Human Pancreatic Elastase ELISA BS 86-01, Bioserv Diagnostics). Data analysis was based on standard statistical measures (mean, standard deviation, standard error), the independent-samples t-test, and the Chi-square test with cross-tabulation, using SPSS for Windows version 20.0. The relationship between FE-1 deficiency and age was statistically nonsignificant (P = 0.5051), implying that age is not a contributing factor to exocrine pancreatic insufficiency among diabetic patients. A statistically significant correlation (P = 0.003) between glycated hemoglobin and FE-1 levels was noted. The associations of retinopathy (P = 0.001) and peripheral pulses (P = 0.001) with FE-1 levels were also statistically significant. This study validates the benefit of FE-1 estimation as a surrogate marker of exocrine pancreatic insufficiency, which otherwise remains unmanifest and subclinical.
[Therapy of organic brain syndrome with nicergoline given once a day].
Ladurner, G; Erhart, P; Erhart, C; Scheiber, V
1991-01-01
In a double-blind, active-controlled study 30 patients with mild to moderate multiinfarct dementia diagnosed according to DSM III definition were treated by either 20 mg nicergoline or 4.5 mg co-dergocrine mesilate once daily during eight weeks. Therapeutic effects on symptoms of the organic brain syndrome were quantitatively measured by standardized psychological and psychometric methods evaluating cognitive and thymopsychic functions. Main criteria, which were tested by inferential analysis, were SCAG total score (Sandoz Clinical Assessment Geriatric Scale), SCAG overall impression and the AD Test (alphabetischer Durchstreichtest). Other results were assessed by descriptive statistics. Both treatments resulted in a statistically significant improvement in most of the tested functions. The effects of 4.5 mg co-dergocrine mesilate s.i.d. were in accordance with published results. Although differing slightly with respect to individual results 20 mg of nicergoline once daily showed the same efficacy on the whole.
NASA Astrophysics Data System (ADS)
Abdellatef, Hisham E.
2007-04-01
Picric acid, bromocresol green, bromothymol blue, cobalt thiocyanate and molybdenum(V) thiocyanate have been tested as spectrophotometric reagents for the determination of disopyramide and irbesartan. Reaction conditions were optimized to obtain coloured complexes of higher sensitivity and longer stability. The absorbance of the ion-pair complexes formed was found to increase linearly with increasing concentrations of disopyramide and irbesartan, as corroborated by the correlation coefficient values. The developed methods were successfully applied to the determination of disopyramide and irbesartan in bulk drugs and pharmaceutical formulations. Common excipients and additives did not interfere with the determination. The results obtained by the proposed methods were compared statistically by means of Student's t-test and the variance-ratio F-test. Validity was assessed by applying the standard addition technique. The results were compared statistically with those of the official or reference methods, showing good agreement with high precision and accuracy.
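The statistical comparison mentioned above, Student's t-test for mean accuracy and the variance-ratio F-test for precision, can be sketched as follows. The percent-recovery values are invented for illustration, not the paper's data:

```python
# Compare a proposed method against a reference method: t-test on means
# (accuracy), variance-ratio F-test on spreads (precision).
import numpy as np
from scipy import stats

proposed = np.array([99.2, 100.1, 99.8, 100.4, 99.5, 100.0])   # % recovery, hypothetical
reference = np.array([99.6, 99.9, 100.2, 99.7, 100.3, 99.8])   # % recovery, hypothetical

t_stat, t_p = stats.ttest_ind(proposed, reference)

# Variance-ratio F-test: put the larger sample variance in the numerator.
f = max(proposed.var(ddof=1), reference.var(ddof=1)) / \
    min(proposed.var(ddof=1), reference.var(ddof=1))
df = len(proposed) - 1
f_p = 2 * (1 - stats.f.cdf(f, df, df))        # two-tailed p-value

print(f"t = {t_stat:.2f} (p = {t_p:.2f}); F = {f:.2f} (p = {f_p:.2f})")
```

Non-significant p-values for both tests, as here, would indicate that the proposed method agrees with the reference in both accuracy and precision.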
Establishing the traceability of a uranyl nitrate solution to a standard reference material
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, C.H.; Clark, J.P.
1978-01-01
A uranyl nitrate solution for use as a Working Calibration and Test Material (WCTM) was characterized using a statistically designed procedure to document traceability to National Bureau of Standards Standard Reference Material SRM-960. A Reference Calibration and Test Material (RCTM) was prepared from SRM-960 uranium metal to approximate the acid and uranium concentrations of the WCTM, and this solution was used in the characterization procedure. Details of preparing, handling, and packaging these solutions are covered. Two outside laboratories, each with measurement expertise in a different analytical method, were selected to measure both solutions according to the procedure for characterizing the WCTM. Two different methods were also used for the in-house characterization work. All analytical results were tested for statistical agreement before the WCTM concentration and limit-of-error values were calculated. A concentration value was determined with a relative limit of error (RLE) of approximately 0.03%, better than the target RLE of 0.08%. The use of this working material eliminates the expense of using SRMs to fulfill traceability requirements for uranium measurements on this type of material. Several years' supply of uranyl nitrate solution with NBS traceability was produced, at a cost of less than 10% of that of an equal quantity of SRM-960 uranium metal.
Schröder, Christian; Steinbrück, Arnd; Müller, Tatjana; Woiczinski, Matthias; Chevalier, Yan; Müller, Peter E.; Jansson, Volkmar
2015-01-01
Retropatellar complications after total knee arthroplasty (TKA) such as anterior knee pain and subluxations might be related to altered patellofemoral biomechanics, in particular to trochlear design and femorotibial joint positioning. A method was developed to test femorotibial and patellofemoral joint modifications separately with 3D-rapid prototyped components for in vitro tests, but material differences may further influence results. This pilot study aims at validating the use of prostheses made of photopolymerized rapid prototype material (RPM) by measuring the sliding friction with a ring-on-disc setup as well as knee kinematics and retropatellar pressure on a knee rig. Cobalt-chromium alloy (standard prosthesis material, SPM) prostheses served as validation standard. Friction coefficients between these materials and polytetrafluoroethylene (PTFE) were additionally tested as this latter material is commonly used to protect pressure sensors in experiments. No statistical differences were found between friction coefficients of both materials to PTFE. UHMWPE shows higher friction coefficient at low axial loads for RPM, a difference that disappears at higher load. No measurable statistical differences were found in knee kinematics and retropatellar pressure distribution. This suggests that using polymer prototypes may be a valid alternative to original components for in vitro TKA studies and future investigations on knee biomechanics. PMID:25879019
Farwell, Lawrence A; Richardson, Drew C; Richardson, Graham M
2013-08-01
Brain fingerprinting detects concealed information stored in the brain by measuring brainwave responses. We compared P300 and P300-MERMER event-related brain potentials for error rate/accuracy and statistical confidence in four field/real-life studies. 76 tests detected the presence or absence of information regarding (1) real-life events including felony crimes; (2) real crimes with substantial consequences (either a judicial outcome, i.e., evidence admitted in court, or a $100,000 reward for beating the test); (3) knowledge unique to FBI agents; and (4) knowledge unique to explosives (EOD/IED) experts. With both P300 and P300-MERMER, the error rate was 0%: determinations were 100% accurate, with no false negatives, false positives, or indeterminates. Countermeasures had no effect. Median statistical confidence for determinations was 99.9% with P300-MERMER and 99.6% with P300. Brain fingerprinting methods and scientific standards for laboratory and field applications are discussed, and major differences in methods that produce different results are identified. Markedly different methods in other studies have produced error rates over 10 times higher, and markedly lower statistical confidences, than those of these studies, our previous studies, and independent replications. The data support the hypothesis that accuracy, reliability, and validity depend on following the brain fingerprinting scientific standards outlined herein.
NASA Astrophysics Data System (ADS)
Mori, Kaya; Chonko, James C.; Hailey, Charles J.
2005-10-01
We have reanalyzed the 260 ks XMM-Newton observation of 1E 1207.4-5209. There are several significant improvements over previous work. First, a much broader range of physically plausible spectral models was used. Second, we have used a more rigorous statistical analysis. The standard F-distribution was not employed, but rather the exact finite statistics F-distribution was determined by Monte Carlo simulations. This approach was motivated by the recent work of Protassov and coworkers and Freeman and coworkers. They demonstrated that the standard F-distribution is not even asymptotically correct when applied to assess the significance of additional absorption features in a spectrum. With our improved analysis we do not find a third and fourth spectral feature in 1E 1207.4-5209 but only the two broad absorption features previously reported. Two additional statistical tests, one line model dependent and the other line model independent, confirmed our modified F-test analysis. For all physically plausible continuum models in which the weak residuals are strong enough to fit, the residuals occur at the instrument Au M edge. As a sanity check we confirmed that the residuals are consistent in strength and position with the instrument Au M residuals observed in 3C 273.
Mueck, F G; Michael, L; Deak, Z; Scherr, M K; Maxien, D; Geyer, L L; Reiser, M; Wirth, S
2013-07-01
To compare the image quality in dose-reduced 64-row CT of the chest at different levels of adaptive statistical iterative reconstruction (ASIR) to full-dose baseline examinations reconstructed solely with filtered back projection (FBP) in a realistic upgrade scenario. A waiver of consent was granted by the institutional review board (IRB). The noise index (NI) relates to the standard deviation of Hounsfield units in a water phantom. Baseline exams of the chest (NI = 29; LightSpeed VCT XT, GE Healthcare) were intra-individually compared to follow-up studies on a CT with ASIR after system upgrade (NI = 45; Discovery HD750, GE Healthcare), n = 46. Images were calculated in slice and volume mode with ASIR levels of 0 - 100 % in the standard and lung kernel. Three radiologists independently compared the image quality to the corresponding full-dose baseline examinations (-2: diagnostically inferior, -1: inferior, 0: equal, + 1: superior, + 2: diagnostically superior). Statistical analysis used Wilcoxon's test, Mann-Whitney U test and the intraclass correlation coefficient (ICC). The mean CTDIvol decreased by 53 % from the FBP baseline to 8.0 ± 2.3 mGy for ASIR follow-ups; p < 0.001. The ICC was 0.70. Regarding the standard kernel, the image quality in dose-reduced studies was comparable to the baseline at ASIR 70 % in volume mode (-0.07 ± 0.29, p = 0.29). Concerning the lung kernel, every ASIR level outperformed the baseline image quality (p < 0.001), with ASIR 30 % rated best (slice: 0.70 ± 0.6, volume: 0.74 ± 0.61). Vendors' recommendation of 50 % ASIR is fair. In detail, the ASIR 70 % in volume mode for the standard kernel and ASIR 30 % for the lung kernel performed best, allowing for a dose reduction of approximately 50 %. © Georg Thieme Verlag KG Stuttgart · New York.
A rule-based software test data generator
NASA Technical Reports Server (NTRS)
Deason, William H.; Brown, David B.; Chang, Kai-Hsiung; Cross, James H., II
1991-01-01
Rule-based software test data generation is proposed as an alternative to either path/predicate analysis or random data generation. A prototype rule-based test data generator for Ada programs is constructed and compared to a random test data generator. Four Ada procedures are used in the comparison. Approximately 2000 rule-based test cases and 100,000 randomly generated test cases are automatically generated and executed. The success of the two methods is compared using standard coverage metrics. Simple statistical tests are performed, showing that even the primitive rule-based test data generation prototype is significantly better than random data generation. This result demonstrates that rule-based test data generation is feasible and shows great promise in assisting test engineers, especially when the rule base is developed further.
Byun, Seung Won; Park, Yeon Joon; Hur, Soo Young
2016-04-01
The aim of this study was to compare Affirm VPIII Microbial Identification Test results for Korean women to those obtained for Gardnerella vaginalis through Nugent score, Candida albicans based on vaginal culture and Trichomonas vaginalis based on wet smear diagnostic standards. Study participants included 195 women with symptomatic or asymptomatic vulvovaginitis under hospital obstetric or gynecologic care. A definite diagnosis was made based on Nugent score for Gardnerella, vaginal culture for Candida and wet prep for Trichomonas vaginalis. Affirm VPIII Microbial Identification Test results were then compared to diagnostic standard results. Of the 195 participants, 152 were symptomatic, while 43 were asymptomatic. Final diagnosis revealed 68 (37.87%) cases of Gardnerella, 29 (14.87%) cases of Candida, one (0.51%) case of Trichomonas, and 10 (5.10%) cases of mixed infections. The detection rates achieved by each detection method (Affirm assay vs diagnostic standard) for Gardnerella and Candida were not significantly different (33.33% vs 34.8% for Gardnerella, 13.33% vs 14.87% for Candida, respectively). The sensitivity and specificity of the Affirm test for Gardnerella compared to the diagnostic standard were 75.0% and 88.98%, respectively. For Candida, the sensitivity and specificity of the Affirm test compared to the diagnostic standard were 82.76% and 98.80%, respectively. The number of Trichomonas cases was too small (1 case) to be statistically analyzed. The Affirm test is a quick tool that can help physicians diagnose and treat patients with infectious vaginitis at the point of care. © 2016 Japan Society of Obstetrics and Gynecology.
Physical properties and comparative strength of a bioactive luting cement.
Jefferies, Steven; Lööf, Jesper; Pameijer, Cornelis H; Boston, Daniel; Galbraith, Colin; Hermansson, Leif
2013-01-01
New dental cement formulations require testing to determine physical and mechanical laboratory properties. To test an experimental calcium aluminate/glass-ionomer cement, Ceramir C and B (CC and B), regarding compressive strength (CS), film thickness (FT), net setting time (ST) and Vickers hardness. An additional test to evaluate potential dimensional change/expansion properties of this cement was also conducted. CS was measured according to a slightly modified ISO 9917:2003 for the CC and B specimens. The samples were not clamped while being exposed to relative humidity of greater than 90% at 37°C for 10 minutes before being stored in phosphate-buffered saline at 37°C. For the CS, four groups were tested: Group 1, CC and B; Group 2, RelyX Luting Cement; Group 3, Fuji Plus; and Group 4, RelyX Unicem. Samples from all groups were stored for 24 hours before testing. Only CC and B was tested for ST and FT according to ISO 9917:2003. The FT was tested 2 minutes after mixing. Vickers hardness was evaluated using the CSM Microhardness Indentation Tester, with zinc phosphate cement as a comparison material. Expansion testing included evaluating potential cracks in feldspathic porcelain jacket crowns (PJCs). The means and standard deviations after 24 hours, expressed in MPa, were: Group 1 = 160 ± 27; Group 2 = 96 ± 10; Group 3 = 138 ± 15; Group 4 = 157 ± 10. A single-factor ANOVA demonstrated statistically significant differences between the groups (P < 0.001). Pair-wise statistical comparison demonstrated a statistically significant difference between Groups 1 and 2. No statistically significant differences were found between the other groups. The FT was 16.8 ± 0.9 and the ST was 4.8 ± 0.1 min.
Vickers hardness for Ceramir C and B was 68.3 ± 17.2, statistically significantly higher (P < 0.05) than Fleck's Zinc Phosphate cement at 51.4 ± 10. There was no evidence of cracks due to radial expansion in PJCs luted with the Ceramir C and B cement. All luting cements tested demonstrated compressive strengths well in excess of the ISO requirement for water-based cements of no less than 50 MPa. Ceramir C and B showed significantly higher CS than RelyX Luting Cement after 24 hours, but not significantly higher CS than either Fuji Plus or RelyX Unicem. The ST and FT values of CC and B conform to the requirements of the standard. Surface hardness was statistically significantly higher than that of zinc phosphate cement. There was no evidence of potentially clinically significant and deleterious expansion behavior by this cement. All cements tested demonstrated acceptable strength properties. Within the limits of this study, Ceramir C and B is deemed to possess physical properties suitable for a dental luting cement.
Repeatability Modeling for Wind-Tunnel Measurements: Results for Three Langley Facilities
NASA Technical Reports Server (NTRS)
Hemsch, Michael J.; Houlden, Heather P.
2014-01-01
Data from extensive check standard tests of seven measurement processes in three NASA Langley Research Center wind tunnels are statistically analyzed to test a simple model previously presented in 2000 for characterizing short-term, within-test and across-test repeatability. The analysis is intended to support process improvement and development of uncertainty models for the measurements. The analysis suggests that the repeatability can be estimated adequately as a function of only the test section dynamic pressure over a two-orders-of-magnitude dynamic pressure range. As expected for low instrument loading, short-term coefficient repeatability is determined by the resolution of the instrument alone (air off). However, as previously pointed out, for the highest dynamic pressure range the coefficient repeatability appears to be independent of dynamic pressure, thus presenting a lower floor for the standard deviation for all three time frames. The simple repeatability model is shown to be adequate for all of the cases presented and for all three time frames.
Ko, Wen-Ru; Hung, Wei-Te; Chang, Hui-Chin; Lin, Long-Yau
2014-03-01
The study was designed to investigate the frequency of misusing standard error of the mean (SEM) in place of standard deviation (SD) to describe study samples in four selected journals published in 2011. Citation counts of articles and the relationship between the misuse rate and impact factor, immediacy index, or cited half-life were also evaluated. All original articles in the four selected journals published in 2011 were searched for descriptive statistics reporting with either mean ± SD or mean ± SEM. The impact factor, immediacy index, and cited half-life of the journals were gathered from Journal Citation Reports Science edition 2011. Scopus was used to search for citations of individual articles. The difference in citation counts between the SD group and SEM group was tested by the Mann-Whitney U test. The relationship between the misuse rate and impact factor, immediacy index, or cited half-life was also evaluated. The frequency of inappropriate reporting of SEM was 13.60% for all four journals. For individual journals, the misuse rate was from 2.9% in Acta Obstetricia et Gynecologica Scandinavica to 22.68% in American Journal of Obstetrics & Gynecology. Articles using SEM were cited more frequently than those using SD (p = 0.025). An approximate positive correlation between the misuse rate and cited half-life was observed. Inappropriate reporting of SEM is common in medical journals. Authors of biomedical papers should be responsible for maintaining an integrated statistical presentation because valuable articles are in danger of being wasted through the misuse of statistics. Copyright © 2014. Published by Elsevier B.V.
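The distinction at issue is simple arithmetic: SEM = SD/√n, so reporting SEM in place of SD makes a sample look roughly √n times less variable than it actually is. A minimal sketch with made-up scores (not data from the surveyed journals):

```python
# SD describes the spread of the sample itself; SEM describes the
# precision of the sample mean and shrinks as n grows. Substituting
# one for the other therefore understates sample variability.
import math
import statistics

scores = [12, 15, 9, 14, 11, 13, 10, 16, 12, 14]
sd = statistics.stdev(scores)         # spread of the observations
sem = sd / math.sqrt(len(scores))     # standard error of the mean
print(round(sd, 2), round(sem, 2))    # SEM is ~sqrt(10) times smaller here
```

A reader seeing "mean ± 0.70" instead of "mean ± 2.22" would badly underestimate how much individual scores vary.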
Catlin, Anita; Taylor-Ford, Rebecca L
2011-05-01
To determine whether provision of Reiki therapy during outpatient chemotherapy is associated with increased comfort and well-being. Double-blind, randomized, controlled clinical trial. Outpatient chemotherapy center. 189 participants were randomized to actual Reiki, sham Reiki placebo, or standard care. Patients receiving chemotherapy were randomly placed into one of three groups. Patients received either standard care, a placebo, or an actual Reiki therapy treatment. A demographic tool and pre- and post-tests were given before and after chemotherapy infusion. Reiki therapy, sham Reiki placebo therapy, standard care, and self-reported levels of comfort and well-being pre- and postintervention. Although Reiki therapy was statistically significant in raising the comfort and well-being of patients post-therapy, the sham Reiki placebo also was statistically significant. Patients in the standard care group did not experience changes in comfort and well-being during their infusion session. The findings indicate that the presence of an RN providing one-on-one support during chemotherapy was influential in raising comfort and well-being levels, with or without an attempted healing energy field. An attempt by clinic nurses to provide more designated one-to-one presence and support for patients while receiving their chemotherapy infusions could increase patient comfort and well-being.
Re-Analysis Report: Daylighting in Schools, Additional Analysis. Tasks 2.2.1 through 2.2.5.
ERIC Educational Resources Information Center
Heschong, Lisa; Elzeyadi, Ihab; Knecht, Carey
This study expands and validates previous research that found a statistical correlation between the amount of daylight in elementary school classrooms and the performance of students on standardized math and reading tests. The researchers reanalyzed the 1997-1998 school year student performance data from the Capistrano Unified School District…
ERIC Educational Resources Information Center
Fouladi, Rachel T.
2000-01-01
Provides an overview of standard and modified normal theory and asymptotically distribution-free covariance and correlation structure analysis techniques and details Monte Carlo simulation results on Type I and Type II error control. Demonstrates through the simulation that robustness and nonrobustness of structure analysis techniques vary as a…
Limits on the Accuracy of Linking. Research Report. ETS RR-10-22
ERIC Educational Resources Information Center
Haberman, Shelby J.
2010-01-01
Sampling errors limit the accuracy with which forms can be linked. Limitations on accuracy are especially important in testing programs in which a very large number of forms are employed. Standard inequalities in mathematical statistics may be used to establish lower bounds on the achievable linking accuracy. To illustrate results, a variety of…
Creating Realistic Data Sets with Specified Properties via Simulation
ERIC Educational Resources Information Center
Goldman, Robert N.; McKenzie, John D. Jr.
2009-01-01
We explain how to simulate both univariate and bivariate raw data sets having specified values for common summary statistics. The first example illustrates how to "construct" a data set having prescribed values for the mean and the standard deviation--for a one-sample t test with a specified outcome. The second shows how to create a bivariate data…
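The "construction" the first example describes, hitting exact prescribed values for the mean and standard deviation, can be done by centering and rescaling arbitrary raw values. A sketch assuming any distributional shape is acceptable; the article's own recipe may differ:

```python
# Build a data set whose sample mean and sample SD exactly match
# specified targets: draw raw values, then shift and scale them.
import random
import statistics

def make_data(n, target_mean, target_sd, seed=0):
    rng = random.Random(seed)
    raw = [rng.gauss(0, 1) for _ in range(n)]
    m = statistics.mean(raw)
    s = statistics.stdev(raw)
    # center, normalize, then rescale to the prescribed statistics
    return [target_mean + target_sd * (x - m) / s for x in raw]

data = make_data(20, target_mean=100.0, target_sd=15.0)
print(round(statistics.mean(data), 6), round(statistics.stdev(data), 6))
```

Because a one-sample t statistic depends only on n, the mean, and the SD, fixing these also fixes the test outcome, which is the point of the article's first example.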
Hadlich, Marcelo Souza; Oliveira, Gláucia Maria Moraes; Feijóo, Raúl A; Azevedo, Clerio F; Tura, Bernardo Rangel; Ziemer, Paulo Gustavo Portela; Blanco, Pablo Javier; Pina, Gustavo; Meira, Márcio; Souza e Silva, Nelson Albuquerque de
2012-10-01
Medical images were standardized in 1993 with the DICOM (Digital Imaging and Communications in Medicine) standard. Several tests use this standard, and it is increasingly necessary to design software applications capable of handling this type of image; however, these software applications are usually neither free nor open-source, and this fact hinders their adjustment to the most diverse interests. To develop and validate a free and open-source software application capable of handling DICOM coronary computed tomography angiography images. We developed and tested the ImageLab software in the evaluation of 100 tests randomly selected from a database. We carried out 600 tests divided between two observers using ImageLab and another software application sold with Philips Brilliance computed tomography scanners in the evaluation of coronary lesions and plaques around the left main coronary artery (LMCA) and the anterior descending artery (ADA). To evaluate intraobserver, interobserver and intersoftware agreement, we used simple agreement and kappa statistics. The agreement observed between software applications was generally classified as substantial or almost perfect in most comparisons. The ImageLab software agreed with the Philips software in the evaluation of coronary computed tomography angiography tests, especially in patients without lesions, with lesions < 50% in the LMCA and < 70% in the ADA. The agreement for lesions > 70% in the ADA was lower, but this is also observed when the anatomical reference standard is used.
Karzmark, Peter; Deutsch, Gayle K
2018-01-01
This investigation was designed to determine the predictive accuracy of a comprehensive neuropsychological and brief neuropsychological test battery with regard to the capacity to perform instrumental activities of daily living (IADLs). Accuracy statistics that included measures of sensitivity, specificity, positive and negative predictive power and positive likelihood ratio were calculated for both types of batteries. The sample was drawn from a general neurological group of adults (n = 117) that included a number of older participants (age >55; n = 38). Standardized neuropsychological assessments were administered to all participants and comprised the Halstead Reitan Battery and portions of the Wechsler Adult Intelligence Scale-III. A comprehensive test battery yielded a moderate increase over base-rate in predictive accuracy that generalized to older individuals. There was only limited support for using a brief battery, for although sensitivity was high, specificity was low. We found that a comprehensive neuropsychological test battery provided good classification accuracy for predicting IADL capacity.
[The mitral valve prolapse syndrome in children and adolescents].
Malcić, I; Zavrsnik, J; Kancler, K; Kokol, P
1998-01-01
The authors studied the prevalence of mitral valve prolapse (MVP) in a group of 656 children and adolescents (329 males and 327 females), who were a representative sample (obtained with the Monte Carlo method of statistical trials) of all newborns in the city of Maribor, Republic of Slovenia, over a period of 18 years (1976-1992). The results were considered positive in children and adolescents who, in addition to a suggestive history (chest pain, palpitations, dizziness, loss of consciousness, headaches, perspiration), a probable auscultatory finding (midsystolic click and late systolic murmur), and suspected phonocardiographic and ECG findings, also had a positive M-mode echocardiographic finding. The criteria for MVP on M-mode echocardiography were taken from the literature: descent of the mitral cusp, either anterior or posterior, of at least 3 mm below the line connecting points C and D. Children and adolescents were divided into six age groups (infants, toddlers, preschool children, early school age, children in puberty, adolescents). Considering MVP as a possible cause of cardiac arrhythmias, beside standard ECG we also performed Holter ECG monitoring in 61 children and adolescents (29 with MVP, 32 without MVP). The results were tested with standard statistical tools (χ² test, Student's t-test, 2 × 2 Fisher χ² test). MVP was found in 71 patients (10.8%; 32 males and 39 females). As regards age and sex, we found a lower prevalence of MVP in male children (9.7%) compared to female children (11.9%). The highest prevalence was found at early school age, more so in females (14.2% vs 13.7%). The differences were not statistically significant (p > 0.05). In both sexes endosystolic prolapse was most frequent (males 59.3%, females 51.3%). Most commonly both cusps were involved in the prolapse (males 78.1%, females 66.7%). The most frequently measured descent of the cusps was 3-4.5 mm (males 56.2%, females 48.7%).
A negative auscultatory finding (silent MVP) was detected in 47.8% of the patients with MVP. Most patients with diagnosed MVP had no symptoms (71.8%). The prevalence of asymptomatic MVP declines with age in both sexes. The prevalence of arrhythmias, in both standard ECG and Holter ECG, was higher in patients with MVP (6.8% vs 0%, NS, and 44.6% vs 9.3%, p < 0.05). The influence of constitutional changes (dolichostenomelia, asthenic constitution, genua valga) on the appearance of MVP is reflected in a statistically significant difference in the Rohrer index between the group of patients with MVP and the healthy group (p < 0.05). The higher prevalence of headache and dizziness in the group with MVP was statistically significant (p < 0.05).
Singla, Sanjeev; Mittal, Geeta; Raghav; Mittal, Rajinder K
2014-01-01
Background: Abdominal pain and shoulder tip pain after laparoscopic cholecystectomy are distressing for the patient. Causes of this pain include peritoneal stretching and diaphragmatic irritation by the high intra-abdominal pressure of the pneumoperitoneum. We designed a study to compare post-operative pain after laparoscopic cholecystectomy at low pressure (7-8 mm Hg) and standard pressure (12-14 mm Hg). Aim: To compare the effect of low pressure and standard pressure pneumoperitoneum on post laparoscopic cholecystectomy pain, and further to study the safety of low pressure pneumoperitoneum in laparoscopic cholecystectomy. Settings and Design: A prospective randomised double blind study. Materials and Methods: A prospective randomised double blind study was done in 100 ASA grade I & II patients. They were divided into two groups of 50 each. Group A patients underwent laparoscopic cholecystectomy with low pressure pneumoperitoneum (7-8 mm Hg) while group B underwent laparoscopic cholecystectomy with standard pressure pneumoperitoneum (12-13 mm Hg). Both groups were compared for pain intensity, analgesic requirement and complications. Statistical Analysis: Demographic data and intraoperative complications were analysed using the chi-square test. Frequency of pain, intensity of pain and analgesic consumption were compared by applying the ANOVA test. Results: The post-operative pain score was significantly lower in the low pressure group than in the standard pressure group. The number of patients requiring rescue analgesic doses was higher in the standard pressure group; this was statistically significant. Total analgesic consumption was also higher in the standard pressure group. There was no difference in intraoperative complications.
Conclusion: This study demonstrates that the simple expedient of reducing the pneumoperitoneum pressure to 8 mm Hg results in a reduction in both the intensity and frequency of post-operative pain, and hence earlier recovery and a better outcome. This study also shows that the low pressure technique is safe, with a comparable rate of intraoperative complications. PMID:24701492
Mechanical properties and radiopacity of experimental glass-silica-metal hybrid composites.
Jandt, Klaus D; Al-Jasser, Abdullah M O; Al-Ateeq, Khalid; Vowles, Richard W; Allen, Geoff C
2002-09-01
Experimental glass-silica-metal hybrid composites (polycomposites) were developed and tested mechanically and radiographically in this fundamental pilot study. To determine whether mechanical properties of a glass-silica filled two-paste dental composite based on a Bis-GMA/polyglycol dimethacrylate blend could be improved through the incorporation of titanium (Ti) particles (particle size ranging from 1 to 3 microm) or silver-tin-copper (Ag-Sn-Cu) particles (particle size ranging from 1 to 50 microm) we measured the diametral tensile strength, fracture toughness and radiopacity of five composites. The five materials were: I, the original unmodified composite (control group); II, as group I but containing 5% (wt/wt) of Ti particles; III, as group II but with Ti particles treated with 4-methacryloyloxyethyl trimellitate anhydride (4-META) to promote Ti-resin bonding; IV, as group I but containing 5% (wt/wt) of Ag-Sn-Cu particles; and V, as group IV but with the metal particles treated with 4-META. Ten specimens of each group were tested in a standard diametral tensile strength test and a fracture toughness test using a single-edge notched sample design and five specimens of each group were tested using a radiopacity test. The diametral tensile strength increased statistically significantly after incorporation of Ti treated with 4-META, as tested by ANOVA (P=0.004) and Fisher's LSD test. A statistically significant increase of fracture toughness was observed between the control group and groups II, III and V as tested by ANOVA (P=0.003) and Fisher's LSD test. All other groups showed no statistically significant increase in diametral tensile strength and fracture toughness respectively when compared to their control groups. 
No statistically significant increase in radiopacity was found between the control group and the Ti filled composite, whereas a statistically significant increase in radiopacity was found between the control group and the Ag-Sn-Cu filled composite as tested by ANOVA (P=0.000) and Fisher's LSD procedure. The introduction of titanium and silver-tin-copper fillers has potential as added components in composites to provide increased mechanical strength and radiopacity, for example for use in core materials.
Abbreviated Combined MR Protocol: A New Faster Strategy for Characterizing Breast Lesions.
Moschetta, Marco; Telegrafo, Michele; Rella, Leonarda; Stabile Ianora, Amato Antonio; Angelelli, Giuseppe
2016-06-01
The use of an abbreviated magnetic resonance (MR) protocol has been recently proposed for cancer screening. The aim of our study is to evaluate the diagnostic accuracy of an abbreviated MR protocol combining short TI inversion recovery (STIR), turbo-spin-echo (TSE)-T2 sequences, a pre-contrast T1, and a single intermediate (3 minutes after contrast injection) post-contrast T1 sequence for characterizing breast lesions. A total of 470 patients underwent breast MR examination for screening, problem solving, or preoperative staging. Two experienced radiologists evaluated both standard and abbreviated protocols in consensus. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and diagnostic accuracy for both protocols were calculated (with the histological findings and 6-month ultrasound follow-up as the reference standard) and compared with the McNemar test. The post-processing and interpretation times for the MR images were compared with the paired t test. In 177 of 470 (38%) patients, the MR sequences detected 185 breast lesions. Standard and abbreviated protocols obtained sensitivity, specificity, diagnostic accuracy, PPV, and NPV values respectively of 92%, 92%, 92%, 68%, and 98% and of 89%, 91%, 91%, 64%, and 98% with no statistically significant difference (P < .0001). The mean post-processing and interpretation time were, respectively, 7 ± 1 minutes and 6 ± 3.2 minutes for the standard protocol and 1 ± 1.2 minutes and 2 ± 1.2 minutes for the abbreviated protocol, with a statistically significant difference (P < .01). An abbreviated combined MR protocol represents a time-saving tool for radiologists and patients with the same diagnostic potential as the standard protocol in patients undergoing breast MRI for screening, problem solving, or preoperative staging. Copyright © 2016 Elsevier Inc. All rights reserved.
Long-term changes (1980-2003) in total ozone time series over Northern Hemisphere midlatitudes
NASA Astrophysics Data System (ADS)
Białek, Małgorzata
2006-03-01
Long-term changes in total ozone time series for the Arosa, Belsk, Boulder and Sapporo stations are examined. For each station we analyze time series of the following statistical characteristics of the distribution of daily ozone data: seasonal mean, standard deviation, maximum and minimum of total daily ozone values for all seasons. An iterative statistical model is proposed to estimate trends and long-term changes in the statistical distribution of the daily total ozone data. The trends are calculated for the period 1980-2003. We observe a lessening of the negative trends in the seasonal means as compared to those calculated by WMO for 1980-2000. We discuss the possibility of a change in the distribution shape of daily ozone data using the Kolmogorov-Smirnov test and comparing trend values in the seasonal mean, standard deviation, maximum and minimum time series for the selected stations and seasons. A distribution shift toward lower values without a change in the distribution shape is suggested, with the following exceptions: a spreading of the distribution toward lower values for Belsk during winter, and no decisive result for Sapporo and Boulder in summer.
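The Kolmogorov-Smirnov comparison used in such distribution-shape analyses reduces to the maximum gap between two empirical CDFs. A minimal pure-Python sketch with toy samples, not the ozone series:

```python
# Two-sample Kolmogorov-Smirnov statistic: the largest vertical
# distance between the empirical CDFs of two samples.
def ks_statistic(xs, ys):
    xs, ys = sorted(xs), sorted(ys)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        t = min(xs[i], ys[j])
        # step both empirical CDFs past the current value, handling ties
        while i < len(xs) and xs[i] == t:
            i += 1
        while j < len(ys) and ys[j] == t:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d

# identical samples give 0; fully separated samples give 1
print(ks_statistic([1, 2, 3], [1, 2, 3]))
print(ks_statistic([1, 2, 3], [10, 20, 30]))
```

Comparing the statistic against a critical value (or a permutation reference) then tests whether two periods' daily-value distributions differ in shape, not just in mean.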
Preiksaitis, J.; Tong, Y.; Pang, X.; Sun, Y.; Tang, L.; Cook, L.; Pounds, S.; Fryer, J.; Caliendo, A. M.
2015-01-01
Quantitative detection of cytomegalovirus (CMV) DNA has become a standard part of care for many groups of immunocompromised patients; recent development of the first WHO international standard for human CMV DNA has raised hopes of reducing interlaboratory variability of results. Commutability of reference material has been shown to be necessary if such material is to reduce variability among laboratories. Here we evaluated the commutability of the WHO standard using 10 different real-time quantitative CMV PCR assays run by eight different laboratories. Test panels, including aliquots of 50 patient samples (40 positive samples and 10 negative samples) and lyophilized CMV standard, were run, with each testing center using its own quantitative calibrators, reagents, and nucleic acid extraction methods. Commutability was assessed both on a pairwise basis and over the entire group of assays, using linear regression and correspondence analyses. Commutability of the WHO material differed among the tests that were evaluated, and these differences appeared to vary depending on the method of statistical analysis used and the cohort of assays included in the analysis. Depending on the methodology used, the WHO material showed poor or absent commutability with up to 50% of assays. Determination of commutability may require a multifaceted approach; the lack of commutability seen when using the WHO standard with several of the assays here suggests that further work is needed to bring us toward true consensus. PMID:26269622
Structural texture similarity metrics for image analysis and retrieval.
Zujovic, Jana; Pappas, Thrasyvoulos N; Neuhoff, David L
2013-07-01
We develop new metrics for texture similarity that account for human visual perception and the stochastic nature of textures. The metrics rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are essentially identical. The proposed metrics extend the ideas of structural similarity and are guided by research in texture analysis-synthesis. They are implemented using a steerable filter decomposition and incorporate a concise set of subband statistics, computed globally or in sliding windows. We conduct systematic tests to investigate metric performance in the context of "known-item search," the retrieval of textures that are "identical" to the query texture. This eliminates the need for cumbersome subjective tests, thus enabling comparisons with human performance on a large database. Our experimental results indicate that the proposed metrics outperform peak signal-to-noise ratio (PSNR), the structural similarity metric (SSIM) and its variations, as well as state-of-the-art texture classification metrics, using standard statistical measures.
Evaluation of noise pollution level in the operating rooms of hospitals: A study in Iran.
Giv, Masoumeh Dorri; Sani, Karim Ghazikhanlou; Alizadeh, Majid; Valinejadi, Ali; Majdabadi, Hesamedin Askari
2017-06-01
Noise pollution in operating rooms is one of the remaining challenges. Both patients and physicians are exposed to different sound levels during operative cases, many of which can last for hours. This study aims to evaluate noise pollution in the operating rooms during different surgical procedures. In this cross-sectional study, the sound level in the operating rooms of Hamadan University-affiliated hospitals (10 in total) in Iran during different surgical procedures was measured using a B&K sound meter. The gathered data were compared with national and international standards. Statistical analysis was performed using descriptive statistics, one-way ANOVA, the t-test, and Pearson's correlation test. The noise pollution level for the majority of surgical procedures is higher than national and international documented standards. The highest level of noise pollution is related to orthopedic procedures, and the lowest to laparoscopic and heart surgery procedures. The highest and lowest registered sound levels during operations were 93 and 55 dB, respectively. Sound generated by equipment (69 ± 4.1 dB), trolley movement (66 ± 2.3 dB), and personnel conversations (64 ± 3.9 dB) were the main sources of noise. The noise pollution of operating rooms is higher than available standards, and corrective measures are needed to achieve proper conditions.
Statistical inference for template aging
NASA Astrophysics Data System (ADS)
Schuckers, Michael E.
2006-04-01
A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first is the use of a generalized linear model to determine if these error rates change linearly over time; this approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimating the change in error rates over time, not on its underlying cause. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
Volcano plots in analyzing differential expressions with mRNA microarrays.
Li, Wentian
2012-12-01
A volcano plot displays unstandardized signal (e.g. log-fold-change) against noise-adjusted/standardized signal (e.g. t-statistic or -log10(p-value) from the t-test). We review the basic and interactive use of the volcano plot and its crucial role in understanding the regularized t-statistic. The joint filtering gene selection criterion based on regularized statistics has a curved discriminant line in the volcano plot, as compared to the two perpendicular lines for the "double filtering" criterion. This review attempts to provide a unifying framework for discussions on alternative measures of differential expression, improved methods for estimating variance, and visual display of a microarray analysis result. We also discuss the possibility of applying volcano plots to other fields beyond microarray.
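The two volcano-plot axes and the "double filtering" selection rule can be sketched as follows. The gene names, expression values, and cutoffs are illustrative, and a plain pooled t-statistic stands in for the regularized statistics the review discusses:

```python
# Volcano-plot coordinates per gene: x = log-fold-change
# (unstandardized signal), y = t-statistic (standardized signal).
# "Double filtering" selects genes passing thresholds on both axes,
# which draws two perpendicular cutoff lines on the plot.
import math

def two_sample_t(a, b):
    """Unpaired t-statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

expr = {  # log2 expression values: (control replicates, treated replicates)
    "geneA": ([5.0, 5.1, 4.9], [7.0, 7.2, 6.9]),
    "geneB": ([5.0, 5.1, 4.9], [5.2, 5.0, 5.1]),
}
for gene, (ctrl, trt) in expr.items():
    lfc = sum(trt) / len(trt) - sum(ctrl) / len(ctrl)  # x-axis
    t = two_sample_t(trt, ctrl)                        # y-axis proxy
    selected = abs(lfc) >= 1.0 and abs(t) >= 2.0       # double filter
    print(gene, round(lfc, 2), selected)
```

A joint (regularized) criterion would replace the `and` of two thresholds with a single combined score, producing the curved discriminant line the review contrasts with these perpendicular cutoffs.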
Conservative Tests under Satisficing Models of Publication Bias.
McCrary, Justin; Christensen, Garret; Fanelli, Daniele
2016-01-01
Publication bias leads consumers of research to observe a selected sample of statistical estimates calculated by producers of research. We calculate critical values for statistical significance that could help to adjust after the fact for the distortions created by this selection effect, assuming that the only source of publication bias is file drawer bias. These adjusted critical values are easy to calculate and differ from unadjusted critical values by approximately 50%: rather than rejecting a null hypothesis when the t-ratio exceeds 2, the analysis suggests rejecting a null hypothesis when the t-ratio exceeds 3. Samples of published social science research indicate that on average, across research fields, approximately 30% of published t-statistics fall between the standard and adjusted cutoffs.
Hypothesis testing of scientific Monte Carlo calculations.
Wallerberger, Markus; Gull, Emanuel
2017-11-01
The steadily increasing size of scientific Monte Carlo simulations and the desire for robust, correct, and reproducible results necessitates rigorous testing procedures for scientific simulations in order to detect numerical problems and programming bugs. However, the testing paradigms developed for deterministic algorithms have proven to be ill suited for stochastic algorithms. In this paper we demonstrate explicitly how the technique of statistical hypothesis testing, which is in wide use in other fields of science, can be used to devise automatic and reliable tests for Monte Carlo methods, and we show that these tests are able to detect some of the common problems encountered in stochastic scientific simulations. We argue that hypothesis testing should become part of the standard testing toolkit for scientific simulations.
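A minimal sketch of the idea in this abstract, under invented parameters: estimate π by Monte Carlo rejection sampling, then form a z statistic against the exact value. A correct sampler should produce |z| consistent with N(0, 1); a systematically large |z| flags a numerical problem or bug.

```python
import math
import random

def mc_pi_ztest(n_samples=100_000, seed=0):
    """Estimate pi by rejection sampling in the unit quarter-disc and
    return the estimate together with a z statistic testing it against
    the exact value; |z| should be small if the sampler is correct."""
    rng = random.Random(seed)
    hits = sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0
               for _ in range(n_samples))
    p_hat = hits / n_samples                        # estimates pi / 4
    se = math.sqrt(p_hat * (1 - p_hat) / n_samples)
    z = (p_hat - math.pi / 4) / se
    return 4 * p_hat, z

est, z = mc_pi_ztest()
```

In an automated test suite one would reject (and re-run with a fresh seed, to distinguish bad luck from a bug) when |z| exceeds a chosen critical value.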
NASA Technical Reports Server (NTRS)
Ellis, David L.
2007-01-01
Room temperature tensile testing of Chemically Pure (CP) Titanium Grade 2 was conducted for as-received commercially produced sheet and following thermal exposure at 550 and 650 K for times up to 5,000 h. No significant changes in microstructure or failure mechanism were observed. A statistical analysis of the data was performed. Small statistical differences were found, but all properties were well above minimum values for CP Ti Grade 2 as defined by ASTM standards and likely would fall within normal variation of the material.
Standard Clock in primordial density perturbations and cosmic microwave background
NASA Astrophysics Data System (ADS)
Chen, Xingang; Namjoo, Mohammad Hossein
2014-12-01
Standard Clocks in the primordial epoch leave a special type of feature in the primordial perturbations, which can be used to directly measure the scale factor of the primordial universe as a function of time, a(t), thus discriminating between inflation and its alternatives. We have started to search for such signals in the Planck 2013 data using the key predictions of the Standard Clock. In this Letter, we summarize these key predictions and present an interesting candidate example in the Planck 2013 data. Motivated by this candidate, we construct and compute full Standard Clock models and use the more complete prediction to make a more extensive comparison with data. Although this candidate is not yet statistically significant, we use it to illustrate how Standard Clocks appear in the Cosmic Microwave Background (CMB) and how they can be further tested by future data. We also use it to motivate more detailed theoretical model building.
A Monte Carlo Simulation Study of the Reliability of Intraindividual Variability
Estabrook, Ryne; Grimm, Kevin J.; Bowles, Ryan P.
2012-01-01
Recent research has seen intraindividual variability (IIV) become a useful technique to incorporate trial-to-trial variability into many types of psychological studies. IIV as measured by individual standard deviations (ISDs) has shown unique prediction to several types of positive and negative outcomes (Ram, Rabbitt, Stollery, & Nesselroade, 2005). One unanswered question regarding measuring intraindividual variability is its reliability and the conditions under which optimal reliability is achieved. Monte Carlo simulation studies were conducted to determine the reliability of the ISD compared to the intraindividual mean. The results indicate that ISDs generally have poor reliability and are sensitive to insufficient measurement occasions, poor test reliability, and unfavorable amounts and distributions of variability in the population. Secondary analysis of psychological data shows that use of individual standard deviations in unfavorable conditions leads to a marked reduction in statistical power, although careful adherence to underlying statistical assumptions allows their use as a basic research tool. PMID:22268793
A new item response theory model to adjust data allowing examinee choice
Costa, Marcelo Azevedo; Braga Oliveira, Rivert Paulo
2018-01-01
In a typical questionnaire testing situation, examinees are not allowed to choose which items they answer because of a technical issue in obtaining satisfactory statistical estimates of examinee ability and item difficulty. This paper introduces a new item response theory (IRT) model that incorporates information from a novel representation of questionnaire data using network analysis. Three scenarios in which examinees select a subset of items were simulated. In the first scenario, the assumptions required to apply the standard Rasch model are met, thus establishing a reference for parameter accuracy. The second and third scenarios include five increasing levels of violating those assumptions. The results show substantial improvements over the standard model in item parameter recovery. Furthermore, the accuracy was closer to the reference in almost every evaluated scenario. To the best of our knowledge, this is the first proposal to obtain satisfactory IRT statistical estimates in the last two scenarios. PMID:29389996
NASA Astrophysics Data System (ADS)
Kaiser, Mary Elizabeth; Morris, Matthew; Aldoroty, Lauren; Kurucz, Robert; McCandliss, Stephan; Rauscher, Bernard; Kimble, Randy; Kruk, Jeffrey; Wright, Edward L.; Feldman, Paul; Riess, Adam; Gardner, Jonathon; Bohlin, Ralph; Deustua, Susana; Dixon, Van; Sahnow, David J.; Perlmutter, Saul
2018-01-01
Establishing improved spectrophotometric standards is important for a broad range of missions and is relevant to many astrophysical problems. Systematic errors associated with astrophysical data used to probe fundamental astrophysical questions, such as SNeIa observations used to constrain dark energy theories, now exceed the statistical errors associated with merged databases of these measurements. ACCESS, "Absolute Color Calibration Experiment for Standard Stars", is a series of rocket-borne sub-orbital missions and ground-based experiments designed to enable improvements in the precision of the astrophysical flux scale through the transfer of absolute laboratory detector standards from the National Institute of Standards and Technology (NIST) to a network of stellar standards with a calibration accuracy of 1% and a spectral resolving power of 500 across the 0.35‑1.7μm bandpass. To achieve this goal ACCESS (1) observes HST/Calspec stars (2) above the atmosphere to eliminate telluric spectral contaminants (e.g. OH) (3) using a single optical path and (HgCdTe) detector (4) that is calibrated to NIST laboratory standards and (5) monitored on the ground and in-flight using an on-board calibration monitor. The observations are (6) cross-checked and extended through the generation of stellar atmosphere models for the targets. The ACCESS telescope and spectrograph have been designed, fabricated, and integrated. Subsystems have been tested. Performance results for subsystems, operations testing, and the integrated spectrograph will be presented. NASA sounding rocket grant NNX17AC83G supports this work.
Evidence-based orthodontics. Current statistical trends in published articles in one journal.
Law, Scott V; Chudasama, Dipak N; Rinchuse, Donald J
2010-09-01
To ascertain the number, type, and overall usage of statistics in American Journal of Orthodontics and Dentofacial Orthopedics (AJODO) articles for 2008, and to compare these data with data from three previous years: 1975, 1985, and 2003. The AJODO original articles for 2008 were dichotomized into those using statistics and those not using statistics. Statistical procedures were then broadly divided into descriptive statistics (mean, standard deviation, range, percentage) and inferential statistics (t-test, analysis of variance). Descriptive statistics were used to make comparisons. In 1975, 1985, 2003, and 2008, AJODO published 72, 87, 134, and 141 original articles, respectively. The percentage of original articles using statistics was 43.1% in 1975, 75.9% in 1985, 94.0% in 2003, and 92.9% in 2008; the proportion stayed relatively stable from 2003 to 2008, with only a small 1.1% decrease. The percentage of articles using inferential statistical analyses was 23.7% in 1975, 74.2% in 1985, 92.9% in 2003, and 84.4% in 2008. Comparing AJODO publications in 2003 and 2008, there was an 8.5% increase in articles using only descriptive statistics (from 7.1% to 15.6%) and a corresponding 8.5% decrease in articles using inferential statistics (from 92.9% to 84.4%).
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES
He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.
2017-01-01
Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
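For context, the univariate building block of the multivariate Kruskal-Wallis (MKW) test discussed above is the ordinary Kruskal-Wallis H statistic. A minimal sketch (assuming no tied values, and invented data) of that rank-based statistic:

```python
def kruskal_wallis_h(groups):
    """Univariate Kruskal-Wallis H statistic (assumes no tied values):
    compares each group's mean rank to the grand mean rank (n + 1) / 2."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # 1-based ranks
    n = len(pooled)
    h = sum(len(g) * (sum(rank[v] for v in g) / len(g) - (n + 1) / 2) ** 2
            for g in groups)
    return 12.0 / (n * (n + 1)) * h

h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The multivariate extension applies this rank machinery jointly across correlated outcomes; the paper's contribution is retaining partially-observed cases rather than deleting them.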
Fish: A New Computer Program for Friendly Introductory Statistics Help
ERIC Educational Resources Information Center
Brooks, Gordon P.; Raffle, Holly
2005-01-01
All introductory statistics students must master certain basic descriptive statistics, including means, standard deviations and correlations. Students must also gain insight into such complex concepts as the central limit theorem and standard error. This article introduces and describes the Friendly Introductory Statistics Help (FISH) computer…
Detecting Multiple Model Components with the Likelihood Ratio Test
NASA Astrophysics Data System (ADS)
Protassov, R. S.; van Dyk, D. A.
2000-05-01
The likelihood ratio test (LRT) and F-test popularized in astrophysics by Bevington (Data Reduction and Error Analysis in the Physical Sciences ) and Cash (1977, ApJ 228, 939), do not (even asymptotically) adhere to their nominal χ2 and F distributions in many statistical tests commonly used in astrophysics. The many legitimate uses of the LRT (see, e.g., the examples given in Cash (1977)) notwithstanding, it can be impossible to compute the false positive rate of the LRT or related tests such as the F-test. For example, although Cash (1977) did not suggest the LRT for detecting a line profile in a spectral model, it has become common practice despite the lack of certain required mathematical regularity conditions. Contrary to common practice, the nominal distribution of the LRT statistic should not be used in these situations. In this paper, we characterize an important class of problems where the LRT fails, show the non-standard behavior of the test in this setting, and provide a Bayesian alternative to the LRT, i.e., posterior predictive p-values. We emphasize that there are many legitimate uses of the LRT in astrophysics, and even when the LRT is inappropriate, there remain several statistical alternatives (e.g., judicious use of error bars and Bayes factors). We illustrate this point in our analysis of GRB 970508 that was studied by Piro et al. in ApJ, 514:L73-L77, 1999.
NASA Astrophysics Data System (ADS)
Slaski, G.; Ohde, B.
2016-09-01
The article presents the results of a statistical dispersion analysis of the energy and power demand for tractive purposes of a battery electric vehicle. The authors compare data distributions for different values of average speed in two approaches, namely a short and a long period of observation. The short period of observation (generally around several hundred meters) results from a previously proposed macroscopic energy consumption model based on an average speed per road section. This approach yielded high values of standard deviation and coefficient of variation (the ratio between standard deviation and the mean), around 0.7-1.2. The long period of observation (about several kilometers long) is similar in length to standardized speed cycles used in testing a vehicle's energy consumption and available range. The data were analysed to determine the impact of observation length on the energy and power demand variation. The analysis was based on a simulation of electric power and energy consumption performed with speed profile data recorded in the Poznan agglomeration.
Uncertainty Analysis of Instrument Calibration and Application
NASA Technical Reports Server (NTRS)
Tripp, John S.; Tcheng, Ping
1999-01-01
Experimental aerodynamic researchers require estimated precision and bias uncertainties of measured physical quantities, typically at 95 percent confidence levels. Uncertainties of final computed aerodynamic parameters are obtained by propagation of individual measurement uncertainties through the defining functional expressions. In this paper, rigorous mathematical techniques are extended to determine precision and bias uncertainties of any instrument-sensor system. Through this analysis, instrument uncertainties determined through calibration are now expressed as functions of the corresponding measurement for linear and nonlinear univariate and multivariate processes. Treatment of correlated measurement precision error is developed. During laboratory calibration, calibration standard uncertainties are assumed to be an order of magnitude less than those of the instrument being calibrated. Often calibration standards do not satisfy this assumption. This paper applies rigorous statistical methods for inclusion of calibration standard uncertainty and covariance due to the order of their application. The effects of mathematical modeling error on calibration bias uncertainty are quantified. The effects of experimental design on uncertainty are analyzed. The importance of replication is emphasized, and techniques for estimation of both bias and precision uncertainties using replication are developed. Statistical tests for stationarity of calibration parameters over time are obtained.
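The propagation step described here can be sketched with the standard first-order formula for independent inputs, sigma_f = sqrt(sum_i (df/dx_i)^2 sigma_i^2). The dynamic-pressure example and its input uncertainties are hypothetical, not taken from the paper:

```python
import math

def propagate(f, x, sigma, h=1e-6):
    """First-order uncertainty propagation for independent inputs:
    sigma_f = sqrt(sum_i (df/dx_i)^2 * sigma_i^2), with central-difference
    numerical partial derivatives."""
    grads = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grads.append((f(xp) - f(xm)) / (2 * h))
    return math.sqrt(sum((g * s) ** 2 for g, s in zip(grads, sigma)))

# hypothetical example: dynamic pressure q = 0.5 * rho * v**2
q = lambda p: 0.5 * p[0] * p[1] ** 2
sigma_q = propagate(q, [1.2, 50.0], [0.01, 0.5])  # rho, v and their sigmas
```

Correlated inputs, the paper's harder case, require adding covariance cross-terms to the sum; the sketch above handles only the independent case.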
Whitley, Heather P; Hanson, Courtney; Parton, Jason M
2017-03-01
This prospective longitudinal study compares diabetes screenings between standard practice and systematically offered point-of-care (POC) hemoglobin A1c (HbA1c) tests in patients aged 45 years or older. Systematic screening of participants (n = 164) identified 63% (n = 104) with unknown hyperglycemia and 53% (n = 88) in prediabetes. The standard practice (n = 324) screened 22% (n = 73), most commonly by blood glucose (96%); 8% (n = 6) and 33% (n = 24) were found to have diabetes and prediabetes, respectively. The association between screening outcome and screening method was statistically significant (P = 0.005) in favor of HbA1c. HbA1c may be the most effective method to identify patients unknowingly living with hyperglycemia. Point-of-care tests further facilitate screening evaluation in a timely and feasible fashion. © 2017 Annals of Family Medicine, Inc.
Mortality investigation of workers in an electromagnetic pulse test program.
Muhm, J M
1992-03-01
A standardized mortality ratio study of 304 male employees of an electromagnetic pulse (EMP) test program was conducted. Outcomes were ascertained by two methods: the World Health Organization's underlying cause of death algorithm; and the National Center for Health Statistics' algorithm to identify multiple listed causes of death. In the 3362 person-years of follow-up, there was one underlying cause of death due to leukemia compared with 0.2 expected (standard mortality ratio [SMR] = 437, 95% confidence interval [CI] = 11-2433), and two multiple listed causes of death due to leukemia compared with 0.3 expected (SMR = 775, 95% CI = 94-2801). Although the study suggested an association between death due to leukemia and employment in the EMP test program, firm conclusions could not be drawn because of limitations of the study. The findings warrant further investigation in an independent cohort.
Arevalo, Amanda; Kolobe, Thubi H A; Arnold, Sandra; DeGrace, Beth
2014-01-01
To examine whether parenting behaviors and childrearing practices in the first 3 years of life among Mexican American (MA) families predict children's academic performance at school age. Thirty-six children were assessed using the Parent Behavior Checklist, Nursing Child Assessment Teaching Scale, Home Observation for Measurement of the Environment Inventory, and Bayley Scales of Infant Development II. Academic performance was measured with the Illinois Standards Achievement Test during third grade. Correlations between parents' developmental expectations, nurturing behaviors, discipline, and academic performance were statistically significant (P < .05). Developmental expectations and discipline strategies predicted 30% of the variance in the Illinois Standards Achievement Test of reading. The results of this study suggest that early developmental expectations that MA parents have for their children, and the nurturing and discipline behaviors they engage in, are related to how well the children perform on academic tests at school age.
Identifying fMRI Model Violations with Lagrange Multiplier Tests
Cassidy, Ben; Long, Christopher J; Rae, Caroline; Solo, Victor
2013-01-01
The standard modeling framework in Functional Magnetic Resonance Imaging (fMRI) is predicated on assumptions of linearity, time invariance and stationarity. These assumptions are rarely checked because doing so requires specialised software, although failure to do so can lead to bias and mistaken inference. Identifying model violations is an essential but largely neglected step in standard fMRI data analysis. Using Lagrange Multiplier testing methods we have developed simple and efficient procedures for detecting model violations such as non-linearity, non-stationarity and validity of the common Double Gamma specification for hemodynamic response. These procedures are computationally cheap and can easily be added to a conventional analysis. The test statistic is calculated at each voxel and displayed as a spatial anomaly map which shows regions where a model is violated. The methodology is illustrated with a large number of real data examples. PMID:22542665
Zeng, Ping; Mukherjee, Sayan; Zhou, Xiang
2017-01-01
Epistasis, commonly defined as the interaction between multiple genes, is an important genetic component underlying phenotypic variation. Many statistical methods have been developed to model and identify epistatic interactions between genetic variants. However, because of the large combinatorial search space of interactions, most epistasis mapping methods face enormous computational challenges and often suffer from low statistical power due to multiple test correction. Here, we present a novel, alternative strategy for mapping epistasis: instead of directly identifying individual pairwise or higher-order interactions, we focus on mapping variants that have non-zero marginal epistatic effects—the combined pairwise interaction effects between a given variant and all other variants. By testing marginal epistatic effects, we can identify candidate variants that are involved in epistasis without the need to identify the exact partners with which the variants interact, thus potentially alleviating much of the statistical and computational burden associated with standard epistatic mapping procedures. Our method is based on a variance component model, and relies on a recently developed variance component estimation method for efficient parameter inference and p-value computation. We refer to our method as the “MArginal ePIstasis Test”, or MAPIT. With simulations, we show how MAPIT can be used to estimate and test marginal epistatic effects, produce calibrated test statistics under the null, and facilitate the detection of pairwise epistatic interactions. We further illustrate the benefits of MAPIT in a QTL mapping study by analyzing the gene expression data of over 400 individuals from the GEUVADIS consortium. PMID:28746338
Mieth, Bettina; Kloft, Marius; Rodríguez, Juan Antonio; Sonnenburg, Sören; Vobruba, Robin; Morcillo-Suárez, Carlos; Farré, Xavier; Marigorta, Urko M.; Fehr, Ernst; Dickhaus, Thorsten; Blanchard, Gilles; Schunk, Daniel; Navarro, Arcadi; Müller, Klaus-Robert
2016-01-01
The standard approach to the analysis of genome-wide association studies (GWAS) is based on testing each position in the genome individually for statistical significance of its association with the phenotype under investigation. To improve the analysis of GWAS, we propose a combination of machine learning and statistical testing that takes correlation structures within the set of SNPs under investigation in a mathematically well-controlled manner into account. The novel two-step algorithm, COMBI, first trains a support vector machine to determine a subset of candidate SNPs and then performs hypothesis tests for these SNPs together with an adequate threshold correction. Applying COMBI to data from a WTCCC study (2007) and measuring performance as replication by independent GWAS published within the 2008–2015 period, we show that our method outperforms ordinary raw p-value thresholding as well as other state-of-the-art methods. COMBI presents higher power and precision than the examined alternatives while yielding fewer false (i.e. non-replicated) and more true (i.e. replicated) discoveries when its results are validated on later GWAS studies. More than 80% of the discoveries made by COMBI upon WTCCC data have been validated by independent studies. Implementations of the COMBI method are available as a part of the GWASpi toolbox 2.0. PMID:27892471
The Same or Not the Same: Equivalence as an Issue in Educational Research
NASA Astrophysics Data System (ADS)
Lewis, Scott E.; Lewis, Jennifer E.
2005-09-01
In educational research, particularly in the sciences, a common research design calls for the establishment of a control and experimental group to determine the effectiveness of an intervention. As part of this design, it is often desirable to illustrate that the two groups were equivalent at the start of the intervention, based on measures such as standardized cognitive tests or student grades in prior courses. In this article we use SAT and ACT scores to illustrate a more robust way of testing equivalence. The method incorporates two one-sided t tests evaluating two null hypotheses, providing a stronger claim for equivalence than the standard method, which often does not address the possible problem of low statistical power. The two null hypotheses are based on the construction of an equivalence interval particular to the data, so the article also provides a rationale for and illustration of a procedure for constructing equivalence intervals. Our consideration of equivalence using this method also underscores the need to include sample sizes, standard deviations, and group means in published quantitative studies.
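The two one-sided tests (TOST) procedure described here can be sketched directly. The snippet below assumes a normal critical value in place of the exact t quantile (a reasonable shortcut for the large samples typical of SAT/ACT comparisons) and uses invented score data; `delta` is the half-width of the equivalence interval the article says must be constructed from the data.

```python
import math
import statistics

def tost_equivalent(a, b, delta, z_crit=1.645):
    """Two one-sided tests (TOST) for mean equivalence within +/- delta.
    Both one-sided nulls must be rejected to claim equivalence:
      H0a: mean(a) - mean(b) <= -delta
      H0b: mean(a) - mean(b) >= +delta"""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    diff = statistics.fmean(a) - statistics.fmean(b)
    t_lower = (diff + delta) / se   # tests H0a
    t_upper = (diff - delta) / se   # tests H0b
    return t_lower > z_crit and t_upper < -z_crit

control = [float(i) for i in range(40)]        # invented scores
treatment = [i + 0.5 for i in range(40)]
```

Note the asymmetry with the standard approach: a non-significant ordinary t-test never establishes equivalence, since it may simply reflect low power, whereas TOST makes equivalence the alternative hypothesis.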
The best motivator priorities parents choose via analytical hierarchy process
NASA Astrophysics Data System (ADS)
Farah, R. N.; Latha, P.
2015-05-01
Motivation is probably the most important factor that educators can target in order to improve learning. Numerous cross-disciplinary theories have been postulated to explain motivation. While each of these theories has some truth, no single theory seems to adequately explain all human motivation. The fact is that human beings in general, and pupils in particular, are complex creatures with complex needs and desires. In this paper, the Analytic Hierarchy Process (AHP) is proposed as an emerging solution to large, dynamic, and complex real-world multi-criteria decision-making problems, here applied to selecting the most suitable motivator when parents choose a school for their children. Data were analyzed using SPSS 17.0 ("Statistical Package for the Social Sciences") software, with both descriptive and inferential statistics. Descriptive statistics were used to identify the demographic factors of the pupil and parent respondents. Inferential statistics were used to determine the pupils' and parents' highest motivator priorities, and AHP was used to rank the criteria chosen by parents: school principals, teachers, pupils, and parents. The moderating factor was a selection of schools in Ampang based on "Standard Kualiti Pendidikan Malaysia" (SKPM). One-way ANOVA was used to test significance, and the data were used to calculate the AHP weightings. School principals were found to be the best motivator for parents in choosing a school for their children, followed by teachers, parents, and pupils.
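The AHP weighting step can be sketched as extracting the principal eigenvector of a pairwise comparison matrix. The 1-9 scale judgments below are hypothetical (they merely reproduce the paper's reported ordering: principals, then teachers, parents, pupils), and power iteration stands in for a full eigensolver:

```python
def ahp_weights(M, iters=100):
    """AHP priority vector: principal eigenvector of the pairwise
    comparison matrix M, obtained by power iteration and normalised
    to sum to one."""
    n = len(M)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(M[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(w)
        w = [wi / total for wi in w]
    return w

# hypothetical 1-9 scale judgments for: principals, teachers, parents, pupils
M = [[1,     3,     5,     7],
     [1 / 3, 1,     3,     5],
     [1 / 5, 1 / 3, 1,     3],
     [1 / 7, 1 / 5, 1 / 3, 1]]
w = ahp_weights(M)
```

In practice one would also compute Saaty's consistency ratio before trusting the weights; that check is omitted from this sketch.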
Graffelman, Jan; Weir, Bruce S
2018-02-01
Standard statistical tests for equality of allele frequencies in males and females and tests for Hardy-Weinberg equilibrium are tightly linked by their assumptions. Tests for equality of allele frequencies assume Hardy-Weinberg equilibrium, whereas the usual chi-square or exact test for Hardy-Weinberg equilibrium assume equality of allele frequencies in the sexes. In this paper, we propose ways to break this interdependence in assumptions of the two tests by proposing an omnibus exact test that can test both hypotheses jointly, as well as a likelihood ratio approach that permits these phenomena to be tested both jointly and separately. The tests are illustrated with data from the 1000 Genomes project. © 2017 The Authors Genetic Epidemiology Published by Wiley Periodicals, Inc.
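For reference, the usual chi-square test for Hardy-Weinberg equilibrium whose hidden assumption the paper targets can be sketched as follows (biallelic locus, genotype counts invented; as the abstract notes, this standard form implicitly assumes equal allele frequencies in males and females):

```python
def hwe_chisq(n_AA, n_Aa, n_aa):
    """One-degree-of-freedom chi-square statistic for Hardy-Weinberg
    equilibrium at a biallelic locus: compare observed genotype counts
    to those expected from the estimated allele frequency."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)          # frequency of allele A
    expected = [n * p * p, 2 * n * p * (1 - p), n * (1 - p) * (1 - p)]
    observed = [n_AA, n_Aa, n_aa]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

The omnibus exact test the authors propose replaces this pooled statistic with one computed over sex-stratified genotype counts, so that neither equality of allele frequencies nor HWE has to be assumed when testing the other.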
77 FR 34044 - National Committee on Vital and Health Statistics: Meeting Standards Subcommittee
Federal Register 2010, 2011, 2012, 2013, 2014
2012-06-08
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Committee on Vital and Health Statistics: Meeting... Health Statistics (NCVHS); Subcommittee on Standards. Time and Date: June 20, 2012, 9 a.m.-5 p.m. EST..., Executive Secretary, NCVHS, National Center for Health Statistics, Centers for Disease Control and...
NASA Astrophysics Data System (ADS)
Silva, P. C. G.; Porto-Neto, S. T.; Lizarelli, R. F. Z.; Bagnato, V. S.
2008-03-01
We investigated whether a new LED system delivers enough energy to promote adequate shear and tensile bond strength under standardized tests. LEDs at 470 ± 10 nm can be used to photocure composite during bracket fixation, but advantages in tensile and shear bond strength are necessary to justify their clinical use. Forty-eight extracted human premolars and two light sources were selected: a halogen lamp and an LED system. Premolar brackets were bonded with composite resin, and the samples were submitted to standardized tests. The two sources gave similar results in the shear bond strength test; the tensile bond test, however, showed distinct results: a statistically significant difference at the 1% level between exposure times (40 and 60 seconds) and an interaction between light source and exposure time. The best result was obtained with the halogen lamp at 60 seconds, even during re-bonding; the LED system, however, could be used for bonding and re-bonding brackets if its power density were increased.
ERIC Educational Resources Information Center
Karkee, Thakur; Choi, Seung
2005-01-01
Proper maintenance of a scale established in the baseline year would assure the accurate estimation of growth in subsequent years. Scale maintenance is especially important when the state performance standards must be preserved for future administrations. To ensure proper maintenance of a scale, the selection of anchor items and evaluation of…
A statistical study of the relationship between surface quality and laser induced damage
NASA Astrophysics Data System (ADS)
Turner, Trey; Turchette, Quentin; Martin, Alex R.
2012-11-01
Laser induced damage of optical components is a concern in many applications in the commercial, scientific and military market sectors. Numerous component manufacturers supply "high laser damage threshold" (HLDT) optics to meet the needs of this market, and consumers pay a premium price for these products. While there's no question that HLDT optics are manufactured to more rigorous standards (and are therefore inherently more expensive) than conventional products, it is not clear how this added expense translates directly into better performance. This is because the standard methods for evaluating laser damage, and the underlying assumptions about the validity of traditional laser damage testing, are flawed. In particular, the surface and coating defects that generally lead to laser damage (in many laser parameter regimes of interest) are widely distributed over the component surface with large spaces in between them. As a result, laser damage testing typically doesn't include enough of these defects to achieve the sample sizes necessary to make its results statistically meaningful. The result is a poor correlation between defect characteristics and damage events. This paper establishes specifically why this is the case, and provides some indication of what might be done to remedy the problem.
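The sparse-defect sampling argument above can be made concrete with a Poisson sketch; the defect density, test-spot size, and number of sites are all invented, chosen only to show how an entire test campaign can end up sampling on the order of one defect.

```python
import math

# Back-of-the-envelope version of the sampling argument: if damage
# precursors are sparse, a standard test spot rarely contains any.
# Defect density, spot area, and site count are hypothetical.
density_per_mm2 = 0.05          # sparse defects: 1 per 20 mm^2
spot_area_mm2 = 0.2             # assumed small test-spot area
sites = 100                     # test sites per damage-test campaign

lam = density_per_mm2 * spot_area_mm2      # expected defects per spot
p_zero = math.exp(-lam)                    # Poisson P(no defect in a spot)
expected_defects = lam * sites             # defects sampled per campaign
```

With these (assumed) numbers, each spot misses the defects 99% of the time, and a 100-site campaign samples only about one defect — far too few for a statistically meaningful damage-versus-defect correlation.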
Influence of skin peeling procedure in allergic contact dermatitis.
Kim, Jung Eun; Park, Hyun Jeong; Cho, Baik Kee; Lee, Jun Young
2008-03-01
The prevalence of allergic contact dermatitis in patients who have previously undergone skin peeling has rarely been studied. We compared the frequency of positive patch test (PT) reactions in a group of patients with a history of peeling to that of a control group with no history of peeling. The Korean standard series and cosmetic series were performed on a total of 262 patients, of whom 62 had previously undergone peeling and 200 had not. The frequency of positive PT reactions on the Korean standard series was significantly higher in the peeling group than in the control group (P < 0.05, chi-square test). However, the most commonly identified allergens were mostly unrelated to cosmetics. The frequency of positive PT reactions on the cosmetic series in the peeling group was higher than that of the control group, but lacked statistical significance; likewise, the frequency of positive PT reactions on the cosmetic series was higher in the high-frequency peel group than in the low-frequency group, but lacked statistical significance. It appears that peeling may not generally affect the development of contact sensitization. Further work should focus on large-scale prospective studies performing a PT before and after peeling.
NASA Astrophysics Data System (ADS)
Mussen, Kimberly S.
This quantitative research study evaluated the effectiveness of employing pedagogy based on the theory of multiple intelligences (MI). Currently, not all students are performing at the rate mandated by the government. When schools do not meet the required state standards, the school is labeled as not achieving adequate yearly progress (AYP), which may lead to the loss of funding; any school not achieving AYP would be interested in this study. Due to low state standardized test scores in science in the district, student achievement and attitudes towards learning science were evaluated with a pretest, a posttest, an essay question, and an attitudinal survey. Statistical significance was found on one of the four research questions: using Analysis of Covariance (ANCOVA) for data analysis, student attitudes towards learning science were statistically significant in the MI (experimental) group. No statistical significance was found in student achievement on the posttest, delayed posttest, or the essay question test. Social change can result from this study because studying the effects of multiple intelligence theory incorporated into classroom instruction can have a significant effect on how children learn, allowing them to compete in a knowledge society.
Khodeir, Mona S; Hegazi, Mona A; Saleh, Marwa M
2018-03-19
The aim of this study was to standardize an Egyptian Arabic Pragmatic Language Test (EAPLT) using linguistically and socially suitable questions and pictures, in order to be able to address specific deficits in this language domain. Questions and pictures were designed for the EAPLT to assess 3 pragmatic language subsets: pragmatic skills, functions, and factors. Ten expert phoniatricians were asked to review the EAPLT and complete a questionnaire to assess the validity of the test items. The EAPLT was applied to 120 typically developing Arabic-speaking Egyptian children (64 females and 56 males), randomly selected according to inclusion and exclusion criteria, in the age range of 2 years, 1 month, 1 day to 9 years, 12 months, 31 days. The children's scores were used to calculate the means and standard deviations and the 5th and 95th percentiles to determine the age of pragmatic skills acquisition. The experts mostly agreed that the EAPLT gives a general idea of children's pragmatic language development. Test-retest reliability analysis proved the high reliability and internal consistency of the EAPLT subsets. A statistically significant correlation was found between the test subsets and age. The EAPLT is a valid and reliable Egyptian Arabic test that can be applied to detect pragmatic language delay. © 2018 S. Karger AG, Basel.
Juhel-Gaugain, M; McEvoy, J D; VanGinkel, L A
2000-12-01
The experimental design of a material certification programme is described. The matrix reference materials (RMs) comprised chlortetracycline (CTC)-containing and CTC-free lyophilised porcine liver, kidney and muscle produced under the European Commission's Standards Measurements and Testing (SMT) programme. The aim of the certification programme was to determine accurately and precisely the concentration of CTC and 4-epi-chlortetracycline (epi-CTC) contained in the RMs. A multi-laboratory approach was used to certify analyte concentrations. Participants (n = 19) were instructed to adhere strictly to previously established guidelines. Following the examination of analytical performance criteria, statistical treatment of the results submitted by 13 laboratories (6 withdrew) allowed an estimate to be made of the true value of the analyte content. The Nalimov test was used for the detection of outlying results. The Cochran and Bartlett tests were employed for testing the homogeneity of variances. The normality of the distribution of results was tested according to the Kolmogorov-Smirnov-Lilliefors test. One-way analysis of variance (ANOVA) was employed to calculate the within- and between-laboratory standard deviations, the overall mean and the confidence interval for the CTC and epi-CTC content of each of the RMs. Certified values were within or very close to the target concentration ranges specified in the SMT contract. These studies have demonstrated the successful production and certification of CTC-containing and CTC-free porcine RMs.
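The ANOVA step described above — computing within- and between-laboratory standard deviations — can be sketched as follows. The replicate results per laboratory are invented, and the variance-component formulas are the standard one-way ANOVA ones (ISO 5725-style), not taken from the certification report itself.

```python
import math

# One-way ANOVA variance components for within- and between-laboratory
# standard deviations; replicate results are made up for illustration.
labs = [
    [98.2, 101.5, 99.8],     # laboratory 1 replicates
    [105.1, 103.9, 106.2],   # laboratory 2
    [95.4, 97.0, 96.1],      # laboratory 3
    [100.8, 99.5, 101.9],    # laboratory 4
]
p = len(labs)                       # number of laboratories
n = len(labs[0])                    # replicates per laboratory
grand = sum(sum(lab) for lab in labs) / (p * n)

ss_between = n * sum((sum(lab) / n - grand) ** 2 for lab in labs)
ss_within = sum((x - sum(lab) / n) ** 2 for lab in labs for x in lab)
ms_between = ss_between / (p - 1)
ms_within = ss_within / (p * (n - 1))

s_r = math.sqrt(ms_within)                     # within-lab (repeatability) sd
s_L2 = max(0.0, (ms_between - ms_within) / n)  # between-lab variance
s_R = math.sqrt(ms_within + s_L2)              # reproducibility sd
```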
rpsftm: An R Package for Rank Preserving Structural Failure Time Models
Allison, Annabel; White, Ian R; Bond, Simon
2018-01-01
Treatment switching in a randomised controlled trial occurs when participants change from their randomised treatment to the other trial treatment during the study. Failure to account for treatment switching in the analysis (i.e. by performing a standard intention-to-treat analysis) can lead to biased estimates of treatment efficacy. The rank preserving structural failure time model (RPSFTM) is a method used to adjust for treatment switching in trials with survival outcomes. The RPSFTM is due to Robins and Tsiatis (1991) and has been developed by White et al. (1997, 1999). The method is randomisation based and uses only the randomised treatment group, observed event times, and treatment history in order to estimate a causal treatment effect. The treatment effect, ψ, is estimated by balancing counter-factual event times (that would be observed if no treatment were received) between treatment groups. G-estimation is used to find the value of ψ such that a test statistic Z(ψ) = 0. This is usually the test statistic used in the intention-to-treat analysis, for example, the log rank test statistic. We present an R package that implements the method of rpsftm. PMID:29564164
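The G-estimation loop is easy to sketch outside the package. The toy below is not rpsftm: it assumes all-or-nothing treatment (one arm always treated, one never), simulates exponential event times with no censoring, and replaces the log-rank statistic with a simple difference-in-means z-statistic, then solves Z(ψ) = 0 by bisection.

```python
import math
import random
import statistics

# Toy G-estimation sketch. psi multiplies time on treatment by exp(psi)
# to recover the counterfactual untreated time U(psi).
random.seed(1)
PSI_TRUE = -0.5   # treatment multiplies event time by exp(-PSI_TRUE)

control = [random.expovariate(1.0) for _ in range(500)]
treated = [random.expovariate(1.0) * math.exp(-PSI_TRUE) for _ in range(500)]

def z_stat(psi):
    """z-statistic comparing counterfactual times between arms."""
    u_t = [math.exp(psi) * t for t in treated]   # strip the treatment effect
    u_c = control                                # never treated: U = T
    diff = statistics.mean(u_t) - statistics.mean(u_c)
    se = math.sqrt(statistics.variance(u_t) / len(u_t)
                   + statistics.variance(u_c) / len(u_c))
    return diff / se

# G-estimation: bisect for the psi where the arms balance, Z(psi) = 0.
lo, hi = -2.0, 2.0
for _ in range(60):
    mid = (lo + hi) / 2
    if z_stat(mid) > 0:    # z is increasing in psi
        hi = mid
    else:
        lo = mid
psi_hat = (lo + hi) / 2
```

The estimate lands near the simulated ψ up to sampling error; the real method substitutes the intention-to-treat (e.g. log-rank) statistic and handles censoring and partial-time switching.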
Repeatability of Cryogenic Multilayer Insulation
NASA Technical Reports Server (NTRS)
Johnson, W. L.; Vanderlaan, M.; Wood, J. J.; Rhys, N. O.; Guo, W.; Van Sciver, S.; Chato, D. J.
2017-01-01
Due to the variety of requirements across aerospace platforms, and one off projects, the repeatability of cryogenic multilayer insulation has never been fully established. The objective of this test program is to provide a more basic understanding of the thermal performance repeatability of MLI systems that are applicable to large scale tanks. There are several different types of repeatability that can be accounted for: these include repeatability between multiple identical blankets, repeatability of installation of the same blanket, and repeatability of a test apparatus. The focus of the work in this report is on the first two types of repeatability. Statistically, repeatability can mean many different things. In simplest form, it refers to the range of performance that a population exhibits and the average of the population. However, as more and more identical components are made (i.e. the population of concern grows), the simple range morphs into a standard deviation from an average performance. Initial repeatability testing on MLI blankets has been completed at Florida State University. Repeatability of five GRC provided coupons with 25 layers was shown to be +/- 8.4% whereas repeatability of repeatedly installing a single coupon was shown to be +/- 8.0%. A second group of 10 coupons have been fabricated by Yetispace and tested by Florida State University, through the first 4 tests, the repeatability has been shown to be +/- 15-25%. Based on detailed statistical analysis, the data has been shown to be statistically significant.
Ye, Xin; Garikapati, Venu M.; You, Daehyun; ...
2017-11-08
Most multinomial choice models (e.g., the multinomial logit model) adopted in practice assume an extreme-value Gumbel distribution for the random components (error terms) of utility functions. This distributional assumption offers a closed-form likelihood expression when the utility maximization principle is applied to model choice behaviors. As a result, model coefficients can be easily estimated using the standard maximum likelihood estimation method. However, maximum likelihood estimators are consistent and efficient only if distributional assumptions on the random error terms are valid. It is therefore critical to test the validity of underlying distributional assumptions on the error terms that form the basis of parameter estimation and policy evaluation. In this paper, a practical yet statistically rigorous method is proposed to test the validity of the distributional assumption on the random components of utility functions in both the multinomial logit (MNL) model and the multiple discrete-continuous extreme value (MDCEV) model. Based on a semi-nonparametric approach, a closed-form likelihood function that nests the MNL or MDCEV model being tested is derived. The proposed method allows traditional likelihood ratio tests to be used to test violations of the standard Gumbel distribution assumption. Simulation experiments are conducted to demonstrate that the proposed test yields acceptable Type-I and Type-II error probabilities at commonly available sample sizes. The test is then applied to three real-world discrete and discrete-continuous choice models. For all three models, the proposed test rejects the validity of the standard Gumbel distribution in most utility functions, calling for the development of robust choice models that overcome the adverse effects of violations of distributional assumptions on the error terms in random utility functions.
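The likelihood-ratio machinery the proposed test relies on can be shown with a deliberately tiny nested pair. This is not the semi-nonparametric model itself: here the general model is a free Bernoulli probability p and the nested restriction is p = 0.5, with hypothetical counts; the paper applies the same 2·(ll_general − ll_restricted) ~ chi-square logic with the Gumbel model nested inside a semi-nonparametric alternative.

```python
import math

# Toy nested-model likelihood-ratio test.
def bernoulli_ll(k, n, p):
    """Log-likelihood of k successes in n Bernoulli(p) trials."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

k, n = 62, 100                  # hypothetical data
p_hat = k / n                   # MLE under the general model
lr = 2 * (bernoulli_ll(k, n, p_hat) - bernoulli_ll(k, n, 0.5))
reject = lr > 3.841             # chi-square 5% critical value, 1 df
```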
The estimation of the measurement results with using statistical methods
NASA Astrophysics Data System (ADS)
Velychko, O.; Gordiyenko, T.
2015-02-01
A number of international standards and guides describe various statistical methods that apply to the management, control, and improvement of processes, for the purpose of analyzing technical measurement results. International standards and guides on statistical methods for estimating measurement results are analyzed, with recommendations for their application in laboratories. For this analysis, cause-and-effect (Ishikawa) diagrams concerning the application of statistical methods to the estimation of measurement results are constructed.
Improved score statistics for meta-analysis in single-variant and gene-level association studies.
Yang, Jingjing; Chen, Sai; Abecasis, Gonçalo
2018-06-01
Meta-analysis is now an essential tool for genetic association studies, allowing them to combine large studies and greatly accelerating the pace of genetic discovery. Although the standard meta-analysis methods perform equivalently to the more cumbersome joint analysis under ideal settings, they result in substantial power loss under unbalanced settings with various case-control ratios. Here, we investigate the power loss problem of the standard meta-analysis methods for unbalanced studies, and further propose novel meta-analysis methods performing equivalently to the joint analysis under both balanced and unbalanced settings. We derive improved meta-score-statistics that can accurately approximate the joint-score-statistics with combined individual-level data, for both linear and logistic regression models, with and without covariates. In addition, we propose a novel approach to adjust for population stratification by correcting for known population structures through minor allele frequencies. In simulated gene-level association studies under unbalanced settings, our method recovered up to 85% of the power loss caused by the standard methods. We further showed the power gain of our methods in gene-level tests with 26 unbalanced studies of age-related macular degeneration. In addition, we took the meta-analysis of three unbalanced studies of type 2 diabetes as an example to discuss the challenges of meta-analyzing multi-ethnic samples. In summary, our improved meta-score-statistics with corrections for population stratification can be used to construct both single-variant and gene-level association studies, providing a useful framework for ensuring well-powered, convenient, cross-study analyses. © 2018 WILEY PERIODICALS, INC.
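For contrast with the improved statistics the abstract derives, the classical sample-size-weighted meta-analysis z-score (a "standard method" baseline) is a one-liner; the per-study z-scores and sample sizes below are hypothetical.

```python
import math

# Sample-size-weighted (Stouffer-style) meta-analysis z-score.
# Study z-scores and sample sizes are hypothetical.
def meta_z(z_scores, sample_sizes):
    w = [math.sqrt(n) for n in sample_sizes]
    num = sum(wi * zi for wi, zi in zip(w, z_scores))
    den = math.sqrt(sum(wi * wi for wi in w))
    return num / den

z_meta = meta_z([1.8, 2.1, 0.9], [4000, 2500, 1200])
```

Note this weighting ignores the case-control imbalance within each study, which is precisely the regime where the paper shows the standard approach loses power.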
NASA Astrophysics Data System (ADS)
Burns, Dana
Over the last two decades, online education has become a popular concept in universities as well as K-12 education. This generation of students has grown up using technology and has shown interest in incorporating technology into their learning. The idea of using technology in the classroom to enhance student learning and create higher achievement has become important for administrators, teachers, and policymakers. Although online education is a popular topic, there has been minimal research on the effectiveness of online and blended learning strategies compared to student learning in a traditional K-12 classroom setting. The purpose of this study was to investigate differences in standardized test scores on the Biology End of Course exam when at-risk students completed the course using three different educational models: an online format, blended learning, and traditional face-to-face learning. Data were collected from over 1,000 students over a five-year period. Correlational analysis of eighth-grade standardized test scores was used to define students as "at-risk" for failing high school courses; the results indicated a high correlation between eighth-grade standardized test scores and Biology End of Course exam scores. Standardized test scores were then measured for the at-risk students when those students completed Biology under the different learning models. Results indicated that significant differences existed among the learning models: students had the highest test scores when completing Biology in the traditional face-to-face model. Further evaluation of subgroup populations indicated statistical differences in learning models for African-American students, female students, and male students.
Medical ethical standards in dermatology: an analytical study of knowledge, attitudes and practices.
Mostafa, W Z; Abdel Hay, R M; El Lawindi, M I
2015-01-01
Dermatology practice has not always been ethically justified. The objective of the study was to find out dermatologists' knowledge about medical ethics, their attitudes towards regulatory measures, and their practices, and to study the different factors influencing the knowledge, attitudes, and practices of dermatologists. This is a cross-sectional comparative study conducted among 214 dermatologists from five academic universities and among participants at two conferences. A 54-item structured anonymous questionnaire was designed to describe the demographic characteristics of the study group as well as their knowledge, attitudes, and practices regarding medical ethics standards in clinical and research settings. Five scoring indices were estimated regarding knowledge, attitude, and practice. Inferential statistics were used to test differences between groups as indicated: the Student's t-test and analysis of variance were carried out for quantitative variables, and the chi-squared test was conducted for qualitative variables. Results were considered statistically significant at P < 0.05. Analysis of the possible factors having an impact on the overall scores revealed that the highest knowledge scores were among dermatologists who practice in an academic setting plus an additional place; however, this difference was statistically non-significant (P = 0.060). Female dermatologists showed a higher attitude score than males (P = 0.028). The highest significant attitude score (P = 0.019) regarding clinical practice was recorded among those practicing cosmetic dermatology. The different studied groups of dermatologists revealed a significant impact on the attitude score (P = 0.049) and the evidence-practice score (P < 0.001). Ethical practices will improve the quality and integrity of dermatology research. © 2014 European Academy of Dermatology and Venereology.
Exocrine Dysfunction Correlates with Endocrinal Impairment of Pancreas in Type 2 Diabetes Mellitus
Prasanna Kumar, H. R.; Gowdappa, H. Basavana; Hosmani, Tejashwi; Urs, Tejashri
2018-01-01
Background: Diabetes mellitus (DM) is a chronic metabolic condition that manifests as elevated blood sugar levels over a prolonged period. The pancreatic endocrine system is generally affected in diabetes, but abnormal exocrine function is often also manifested due to its proximity to the endocrine system. Fecal elastase-1 (FE-1) has been found to be an ideal biomarker of pancreatic exocrine insufficiency. Aim: This study was conducted to assess exocrine dysfunction of the pancreas in patients with type 2 DM (T2DM) by measuring FE-1 levels, and to associate the level of hyperglycemia with exocrine pancreatic dysfunction. Methodology: A prospective, cross-sectional comparative study was conducted on both T2DM patients and healthy nondiabetic volunteers. FE-1 levels were measured using a commercial kit (Human Pancreatic Elastase ELISA BS 86-01 from Bioserv Diagnostics). Data analysis was performed using standard statistical parameters such as the mean, standard deviation, and standard error, along with the independent-samples t-test and chi-square test/cross-tabulation, using SPSS for Windows version 20.0. Results: A statistically non-significant (P = 0.5051) relationship between FE-1 deficiency and age was obtained, implying that age is a noncontributing factor toward exocrine pancreatic insufficiency among diabetic patients. A statistically significant correlation (P = 0.003) between glycated hemoglobin and FE-1 levels was also noted. The associations between retinopathy (P = 0.001) and peripheral pulses (P = 0.001) and FE-1 levels were found to be statistically significant. Conclusion: This study validates the benefit of FE-1 estimation as a surrogate marker of exocrine pancreatic insufficiency, which otherwise remains unmanifest and subclinical. PMID:29535950
Gu, Hairong; Kim, Woojae; Hou, Fang; Lesmes, Luis Andres; Pitt, Mark A; Lu, Zhong-Lin; Myung, Jay I
2016-01-01
Measurement efficiency is of concern when a large number of observations are required to obtain reliable estimates for parametric models of vision. The standard entropy-based Bayesian adaptive testing procedures addressed the issue by selecting the most informative stimulus in sequential experimental trials. Noninformative, diffuse priors were commonly used in those tests. Hierarchical adaptive design optimization (HADO; Kim, Pitt, Lu, Steyvers, & Myung, 2014) further improves the efficiency of the standard Bayesian adaptive testing procedures by constructing an informative prior using data from observers who have already participated in the experiment. The present study represents an empirical validation of HADO in estimating the human contrast sensitivity function. The results show that HADO significantly improves the accuracy and precision of parameter estimates, and therefore requires many fewer observations to obtain reliable inference about contrast sensitivity, compared to the method of quick contrast sensitivity function (Lesmes, Lu, Baek, & Albright, 2010), which uses the standard Bayesian procedure. The improvement with HADO was maintained even when the prior was constructed from heterogeneous populations or a relatively small number of observers. The results of this case study support the conclusion that HADO can be used in Bayesian adaptive testing by replacing noninformative, diffuse priors with statistically justified informative priors without introducing unwanted bias.
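The "standard entropy-based Bayesian adaptive testing procedure" that HADO builds on can be sketched for a one-parameter observer model. The logistic psychometric function, threshold grid, slope (0.3), and uniform (diffuse) prior are all invented for illustration; the idea shown is only the core step of picking the stimulus that minimizes expected posterior entropy.

```python
import math

# Entropy-based adaptive stimulus selection with a grid prior over a
# single threshold parameter; all numeric choices are illustrative.
thetas = [i / 10 for i in range(-20, 21)]   # candidate thresholds/stimuli
prior = [1.0 / len(thetas)] * len(thetas)   # noninformative (uniform) prior

def p_correct(x, theta):
    """Probability of a correct response to stimulus x given threshold."""
    return 1.0 / (1.0 + math.exp(-(x - theta) / 0.3))

def entropy(dist):
    return -sum(p * math.log(p) for p in dist if p > 0)

def expected_posterior_entropy(x, prior):
    """Average posterior entropy over the two possible responses."""
    total = 0.0
    for y in (0, 1):
        lik = [p_correct(x, t) if y else 1 - p_correct(x, t) for t in thetas]
        py = sum(l * pr for l, pr in zip(lik, prior))
        post = [l * pr / py for l, pr in zip(lik, prior)]
        total += py * entropy(post)
    return total

# Most informative next stimulus under the current (uniform) prior:
best = min(thetas, key=lambda x: expected_posterior_entropy(x, prior))
```

HADO's contribution, per the abstract, is replacing the uniform `prior` above with an informative one built hierarchically from previous observers.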
High Impact = High Statistical Standards? Not Necessarily So
Tressoldi, Patrizio E.; Giofré, David; Sella, Francesco; Cumming, Geoff
2013-01-01
What are the statistical practices of articles published in journals with a high impact factor? Are there differences compared with articles published in journals with a somewhat lower impact factor that have adopted editorial policies to reduce the impact of limitations of Null Hypothesis Significance Testing? To investigate these questions, the current study analyzed all articles related to psychological, neuropsychological and medical issues, published in 2011 in four journals with high impact factors: Science, Nature, The New England Journal of Medicine and The Lancet, and three journals with relatively lower impact factors: Neuropsychology, Journal of Experimental Psychology-Applied and the American Journal of Public Health. Results show that Null Hypothesis Significance Testing without any use of confidence intervals, effect size, prospective power and model estimation, is the prevalent statistical practice used in articles published in Nature, 89%, followed by articles published in Science, 42%. By contrast, in all other journals, both with high and lower impact factors, most articles report confidence intervals and/or effect size measures. We interpreted these differences as consequences of the editorial policies adopted by the journal editors, which are probably the most effective means to improve the statistical practices in journals with high or low impact factors. PMID:23418533
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arutyunyan, R.V.; Bol'shov, L.A.; Vasil'ev, S.K.
1994-06-01
The objective of this study was to clarify a number of issues related to the spatial distribution of contaminants from the Chernobyl accident. The effects of local statistics were addressed by collecting and analyzing (for cesium-137) soil samples from a number of regions, and it was found that sample activity differed by a factor of 3-5. The effect of local non-uniformity was estimated by modeling the distribution of the average activity of a set of five samples for each of the regions, with the spread in the activities over a ±2 range being equal to 25%. The statistical characteristics of the distribution of contamination were then analyzed and found to follow a log-normal distribution, with the standard deviation being a function of test area. All data for the Bryanskaya Oblast area were analyzed statistically and were adequately described by a log-normal function.
Validating Coherence Measurements Using Aligned and Unaligned Coherence Functions
NASA Technical Reports Server (NTRS)
Miles, Jeffrey Hilton
2006-01-01
This paper describes a novel approach, based on coherence functions and statistical theory, for sensor validation in a harsh environment. By computing both aligned and unaligned coherence functions, one can test for sensor degradation, total sensor failure, or changes in the signal. The advanced diagnostic approach and the novel data processing methodology discussed provide a single number that conveys this information. This number, calculated with standard statistical procedures for comparing the means of two distributions, is compared with results obtained using Yuen's robust statistical method to create confidence intervals. Examination of experimental data from Kulite pressure transducers mounted in a Pratt & Whitney PW4098 combustor, using spectrum analysis methods on aligned and unaligned time histories, has verified the effectiveness of the proposed method. All the procedures produced good results, demonstrating the robustness of the technique.
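Yuen's robust method referenced above compares trimmed means using winsorized variances. As an illustration only (a textbook sketch, not the authors' implementation), the statistic and its Welch-style degrees of freedom can be computed as:

```python
import math

def yuen_test(x, y, trim=0.2):
    """Yuen's two-sample test on trimmed means (illustrative sketch).

    Returns the test statistic and Welch-style degrees of freedom.
    """
    def trimmed_stats(data, trim):
        s = sorted(data)
        n = len(s)
        g = int(trim * n)                  # number trimmed from each tail
        h = n - 2 * g                      # effective sample size
        tmean = sum(s[g:n - g]) / h        # trimmed mean
        # Winsorize: clamp the tails to the boundary (retained) values
        w = [min(max(v, s[g]), s[n - g - 1]) for v in s]
        wmean = sum(w) / n
        wvar = sum((v - wmean) ** 2 for v in w) / (n - 1)  # winsorized variance
        return tmean, wvar, n, h

    t1, v1, n1, h1 = trimmed_stats(x, trim)
    t2, v2, n2, h2 = trimmed_stats(y, trim)
    d1 = v1 * (n1 - 1) / (h1 * (h1 - 1))
    d2 = v2 * (n2 - 1) / (h2 * (h2 - 1))
    t_stat = (t1 - t2) / math.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    return t_stat, df
```

Trimming makes the comparison insensitive to outliers in either tail, which is why such statistics are attractive for noisy transducer data.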
Tests for qualitative treatment-by-centre interaction using a 'pushback' procedure.
Ciminera, J L; Heyse, J F; Nguyen, H H; Tukey, J W
1993-06-15
In multicentre clinical trials using a common protocol, the centres are usually regarded as being a fixed factor, thus allowing any treatment-by-centre interaction to be omitted from the error term for the effect of treatment. However, we feel it necessary to use the treatment-by-centre interaction as the error term if there is substantial evidence that the interaction with centres is qualitative instead of quantitative. To make allowance for the estimated uncertainties of the centre means, we propose choosing a reference value (for example, the median of the ordered array of centre means) and converting the individual centre results into standardized deviations from the reference value. The deviations are then reordered, and the results 'pushed back' by amounts appropriate for the corresponding order statistics in a sample from the relevant distribution. The pushed-back standardized deviations are then restored to the original scale. The appearance of opposite signs among the destandardized values for the various centres is then taken as 'substantial evidence' of qualitative interaction. Procedures are presented using, in any combination: (i) Gaussian, or Student's t-distribution; (ii) order-statistic medians or outward 90 per cent points of the corresponding order statistic distributions; (iii) pooling or grouping and pooling the internally estimated standard deviations of the centre means. The use of the least conservative combination--Student's t, outward 90 per cent points, grouping and pooling--is recommended.
Development of QC Procedures for Ocean Data Obtained by National Research Projects of Korea
NASA Astrophysics Data System (ADS)
Kim, S. D.; Park, H. M.
2017-12-01
To establish a data management system for ocean data obtained by national research projects of the Ministry of Oceans and Fisheries of Korea, KIOST conducted standardization and development of QC procedures. After reviewing and analyzing the existing international and domestic ocean-data standards and QC procedures, draft versions of the standards and QC procedures were prepared. The proposed standards and QC procedures were reviewed and revised several times by experts in the field of oceanography and by academic societies. A technical report was prepared covering the standards for 25 data items and 12 QC procedures for physical, chemical, biological and geological data items. The QC procedure for temperature and salinity data was set up by reference to the manuals published by GTSPP, ARGO and IOOS QARTOD. It consists of 16 QC tests applicable to vertical profile data and time series data obtained in real-time mode and delayed mode. Three regional range tests to inspect annual, seasonal and monthly variations were included in the procedure. Three programs were developed to calculate and provide the upper and lower limits of temperature and salinity at depths from 0 to 1550 m. TS data from the World Ocean Database, ARGO, GTSPP and in-house data of KIOST were analysed statistically to calculate regional limits for the Northwest Pacific area. Based on the statistical analysis, the programs calculate regional ranges using the mean and standard deviation on three grid systems (3° grid, 1° grid and 0.5° grid) and provide a recommendation. The QC procedures for the 12 data items were set up during the 1st phase of the national program for data management (2012-2015) and are being applied to national research projects in the 2nd phase (2016-2019). The QC procedures will be revised by reviewing the results of QC application when the 2nd phase of the data management program is completed.
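A regional range test of the kind described can be sketched as a mean ± k·SD check against reference data for a grid cell. The function below is a hypothetical illustration; it does not reproduce the operational KIOST limits, grids, or flag conventions:

```python
import statistics

def regional_range_check(value, reference_values, k=3.0):
    """Flag an observation falling outside mean ± k*SD of the reference
    data for its grid cell (sketch of a regional range test; the actual
    KIOST limits and grid lookup are not reproduced here)."""
    mean = statistics.fmean(reference_values)
    sd = statistics.stdev(reference_values)
    lower, upper = mean - k * sd, mean + k * sd
    return lower <= value <= upper, (lower, upper)
```

In practice the reference statistics would be precomputed per month and per grid cell (3°, 1° or 0.5°), and the boolean result mapped to a QC flag.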
Kim, Da-Eun; Yang, Hyeri; Jang, Won-Hee; Jung, Kyoung-Mi; Park, Miyoung; Choi, Jin Kyu; Jung, Mi-Sook; Jeon, Eun-Young; Heo, Yong; Yeo, Kyung-Wook; Jo, Ji-Hoon; Park, Jung Eun; Sohn, Soo Jung; Kim, Tae Sung; Ahn, Il Young; Jeong, Tae-Cheon; Lim, Kyung-Min; Bae, SeungJin
2016-01-01
In order for a novel test method to be applied for regulatory purposes, its reliability and relevance, i.e., reproducibility and predictive capacity, must be demonstrated. Here, we examine the predictive capacity of a novel non-radioisotopic local lymph node assay, LLNA:BrdU-FCM (5-bromo-2'-deoxyuridine-flow cytometry), with a cutoff approach and inferential statistics as a prediction model. Twenty-two reference substances in OECD TG429 were tested with a concurrent positive control, hexylcinnamaldehyde 25% (PC), and the stimulation index (SI) representing the fold increase in lymph node cells over the vehicle control was obtained. The optimal cutoff SI (2.7≤cutoff<3.5), with respect to predictive capacity, was obtained by a receiver operating characteristic curve, which produced 90.9% accuracy for the 22 substances. To address the inter-test variability in responsiveness, SI values standardized with the PC were employed to obtain the optimal percentage cutoff (42.6≤cutoff<57.3% of PC), which produced 86.4% accuracy. A test substance may be diagnosed as a sensitizer if a statistically significant increase in SI is elicited. The parametric one-sided t-test and non-parametric Wilcoxon rank-sum test produced 77.3% accuracy. Similarly, a test substance could be defined as a sensitizer if the SI means of the vehicle control and of the low, middle, and high concentrations were statistically significantly different, which was tested using ANOVA or Kruskal-Wallis with post hoc analysis (Dunnett or DSCF (Dwass-Steel-Critchlow-Fligner), respectively, depending on the equal variance test), producing 81.8% accuracy. The absolute SI-based cutoff approach produced the best predictive capacity; however, the discordant decisions between prediction models need to be examined further. Copyright © 2015 Elsevier Inc. All rights reserved.
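The search for an optimal cutoff can be illustrated by a simple accuracy-maximizing scan over candidate SI thresholds. This is a hedged sketch of the idea behind the ROC analysis, not the study's actual procedure or data:

```python
def best_cutoff(si_values, is_sensitizer):
    """Scan candidate SI cutoffs and return the (accuracy, cutoff) pair
    maximizing classification accuracy -- a simplified stand-in for the
    ROC-curve analysis described in the abstract."""
    best = (0.0, None)
    for c in sorted(set(si_values)):
        correct = sum((si >= c) == label
                      for si, label in zip(si_values, is_sensitizer))
        acc = correct / len(si_values)
        if acc > best[0]:
            best = (acc, c)
    return best
```

A full ROC analysis would additionally trade off sensitivity against specificity rather than optimizing raw accuracy alone.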
A Simple Test of Class-Level Genetic Association Can Reveal Novel Cardiometabolic Trait Loci.
Qian, Jing; Nunez, Sara; Reed, Eric; Reilly, Muredach P; Foulkes, Andrea S
2016-01-01
Characterizing the genetic determinants of complex diseases can be further augmented by incorporating knowledge of underlying structure or classifications of the genome, such as newly developed mappings of protein-coding genes, epigenetic marks, enhancer elements and non-coding RNAs. We apply a simple class-level testing framework, termed Genetic Class Association Testing (GenCAT), to identify protein-coding gene association with 14 cardiometabolic (CMD) related traits across 6 publicly available genome wide association (GWA) meta-analysis data resources. GenCAT uses SNP-level meta-analysis test statistics across all SNPs within a class of elements, as well as the size of the class and its unique correlation structure, to determine if the class is statistically meaningful. The novelty of findings is evaluated through investigation of regional signals. A subset of findings are validated using recently updated, larger meta-analysis resources. A simulation study is presented to characterize overall performance with respect to power, control of family-wise error and computational efficiency. All analysis is performed using the GenCAT package, R version 3.2.1. We demonstrate that class-level testing complements the common first stage minP approach that involves individual SNP-level testing followed by post-hoc ascribing of statistically significant SNPs to genes and loci. GenCAT suggests 54 protein-coding genes at 41 distinct loci for the 13 CMD traits investigated in the discovery analysis, that are beyond the discoveries of minP alone. An additional application to biological pathways demonstrates flexibility in defining genetic classes. We conclude that it would be prudent to include class-level testing as standard practice in GWA analysis. 
GenCAT, for example, can be used as a simple, complementary and efficient strategy for class-level testing that leverages existing data resources, requires only summary level data in the form of test statistics, and adds significant value with respect to its potential for identifying multiple novel and clinically relevant trait associations.
Jabłoński, Sławomir; Rzepkowska-Misiak, Beata; Piskorz, Łukasz; Brocki, Marian; Wcisło, Szymon; Smigielski, Jacek; Kordiak, Jacek
2012-01-01
Introduction Hyperhidrosis is excessive sweating beyond the needs of thermoregulation. It is a disease that mostly affects young people, often carrying considerable socio-economic implications. Thoracic sympathectomy is now considered the "gold standard" in the treatment of idiopathic hyperhidrosis of the hands and armpits. Aim Assessment of the early effectiveness of thoracic sympathectomy using skin resistance measurements performed before surgery and in the postoperative period. Material and methods A group of 20 patients with idiopathic excessive sweating of the hands and armpits was enrolled in the study. Patients underwent two-stage thoracic sympathectomy with resection of the Th2-Th4 ganglia. The skin resistance measurements were made at six previously designated points on the day of surgery and on the first day after the operation. Results In all operated patients we obtained complete remission of symptoms on the first day after surgery. Inhibition of sweating was confirmed using the standard starch-iodine (Minor) test. At all measurement points we obtained a statistically significant increase in skin resistance, assuming p < 0.05. To check whether there was a statistically significant difference between the results before and after surgery we used the Wilcoxon signed-rank test for paired samples. Conclusions Thoracic sympathectomy is an effective curative treatment for primary hyperhidrosis of the hands and armpits. The statistically significant increase in skin resistance in all cases makes it a good method of assessing the effectiveness of this surgery in the early postoperative period. PMID:23256019
78 FR 65317 - National Committee on Vital and Health Statistics: Meeting Standards Subcommittee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-10-31
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Committee on Vital and Health Statistics: Meeting... Health Statistics (NCVHS) Subcommittee on Standards. Time and Date: November 12, 2013 8:30 a.m.-5:30 p.m. EST. Place: Centers for Disease Control and Prevention, National Center for Health Statistics, 3311...
78 FR 54470 - National Committee on Vital and Health Statistics: Meeting Standards Subcommittee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-09-04
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Committee on Vital and Health Statistics: Meeting... Health Statistics (NCVHS) Subcommittee on Standards Time and Date: September 18, 2013 8:30 p.m.--5:00 p.m. EDT. Place: Centers for Disease Control and Prevention, National Center for Health Statistics, 3311...
78 FR 942 - National Committee on Vital and Health Statistics: Meeting Standards Subcommittee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-01-07
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Committee on Vital and Health Statistics: Meeting... Health Statistics (NCVHS) Subcommittee on Standards. Time and Date: February 27, 2013 9:30 a.m.-5:00 p.m... electronic claims attachments. The National Committee on Vital Health Statistics is the public advisory body...
78 FR 34100 - National Committee on Vital and Health Statistics: Meeting Standards Subcommittee
Federal Register 2010, 2011, 2012, 2013, 2014
2013-06-06
... DEPARTMENT OF HEALTH AND HUMAN SERVICES National Committee on Vital and Health Statistics: Meeting... Health Statistics (NCVHS) Subcommittee on Standards. Time and Date: June 17, 2013 1:00 p.m.-5:00 p.m. e.d..., National Center for Health Statistics, 3311 Toledo Road, Auditorium B & C, Hyattsville, Maryland 20782...
An asymptotic analysis of the logrank test.
Strawderman, R L
1997-01-01
Asymptotic expansions for the null distribution of the logrank statistic and its distribution under local proportional hazards alternatives are developed in the case of iid observations. The results, which are derived from the work of Gu (1992) and Taniguchi (1992), are easy to interpret, and provide some theoretical justification for many behavioral characteristics of the logrank test that have been previously observed in simulation studies. We focus primarily upon (i) the inadequacy of the usual normal approximation under treatment group imbalance; and, (ii) the effects of treatment group imbalance on power and sample size calculations. A simple transformation of the logrank statistic is also derived based on results in Konishi (1991) and is found to substantially improve the standard normal approximation to its distribution under the null hypothesis of no survival difference when there is treatment group imbalance.
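For readers unfamiliar with the statistic under discussion, the standard two-sample logrank Z (the quantity whose normal approximation the paper analyses) can be computed as follows. This is the textbook form, not the asymptotic expansions or the transformation derived in the paper:

```python
import math

def logrank_z(times1, events1, times2, events2):
    """Two-sample logrank Z statistic (textbook form).

    times*: observation times; events*: 1 if event observed, 0 if censored.
    """
    data = ([(t, e, 0) for t, e in zip(times1, events1)]
            + [(t, e, 1) for t, e in zip(times2, events2)])
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp, var = 0.0, 0.0
    for t in event_times:
        at_risk = [(ti, ei, g) for ti, ei, g in data if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for ti, ei, g in at_risk if g == 0)       # group-1 at risk
        d = sum(1 for ti, ei, g in at_risk if ti == t and ei)  # events at t
        d1 = sum(1 for ti, ei, g in at_risk if ti == t and ei and g == 0)
        obs_minus_exp += d1 - d * n1 / n   # observed minus expected (group 1)
        if n > 1:
            # hypergeometric variance contribution at this event time
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return obs_minus_exp / math.sqrt(var)
```

Under the null hypothesis of no survival difference, Z is approximately standard normal; the paper quantifies how treatment group imbalance degrades that approximation.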
Evaluation of bending rigidity behaviour of ultrasonic seaming on woven fabrics
NASA Astrophysics Data System (ADS)
Şevkan Macit, Ayşe; Tiber, Bahar
2017-10-01
In recent years, ultrasonic seaming, which is seen as an alternative to conventional seaming, has been investigated by many researchers. In our study, the bending behaviour of this alternative method is examined by varying parameters such as fabric type, seam type, roller type and seaming velocity. For this purpose, fifteen types of sewn fabrics were tested according to the bending rigidity test standard before and after washing, and the results were evaluated using the SPSS statistical analysis program. The bending length values of the ultrasonically sewn fabrics were found to be higher than those of the conventionally sewn fabrics, and the effect of seam type on bending length was statistically significant. Bending length values were also found to be related to the remaining parameters, excluding roller type.
Montei, Carolyn; McDougal, Susan; Mozola, Mark; Rice, Jennifer
2014-01-01
The Soleris Non-fermenting Total Viable Count method was previously validated for a wide variety of food products, including cocoa powder. A matrix extension study was conducted to validate the method for use with cocoa butter and cocoa liquor. Test samples included naturally contaminated cocoa liquor and cocoa butter inoculated with natural microbial flora derived from cocoa liquor. A probability of detection statistical model was used to compare Soleris results at multiple test thresholds (dilutions) with aerobic plate counts determined using the AOAC Official Method 966.23 dilution plating method. Results of the two methods were not statistically different at any dilution level in any of the three trials conducted. The Soleris method offers the advantage of results within 24 h, compared to the 48 h required by standard dilution plating methods.
Significant-Loophole-Free Test of Bell's Theorem with Entangled Photons.
Giustina, Marissa; Versteegh, Marijn A M; Wengerowsky, Sören; Handsteiner, Johannes; Hochrainer, Armin; Phelan, Kevin; Steinlechner, Fabian; Kofler, Johannes; Larsson, Jan-Åke; Abellán, Carlos; Amaya, Waldimar; Pruneri, Valerio; Mitchell, Morgan W; Beyer, Jörn; Gerrits, Thomas; Lita, Adriana E; Shalm, Lynden K; Nam, Sae Woo; Scheidl, Thomas; Ursin, Rupert; Wittmann, Bernhard; Zeilinger, Anton
2015-12-18
Local realism is the worldview in which physical properties of objects exist independently of measurement and where physical influences cannot travel faster than the speed of light. Bell's theorem states that this worldview is incompatible with the predictions of quantum mechanics, as is expressed in Bell's inequalities. Previous experiments convincingly supported the quantum predictions. Yet, every experiment requires assumptions that provide loopholes for a local realist explanation. Here, we report a Bell test that closes the most significant of these loopholes simultaneously. Using a well-optimized source of entangled photons, rapid setting generation, and highly efficient superconducting detectors, we observe a violation of a Bell inequality with high statistical significance. The purely statistical probability of our results to occur under local realism does not exceed 3.74×10^{-31}, corresponding to an 11.5 standard deviation effect.
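The correspondence between the reported probability and the quoted 11.5 standard deviations can be checked by inverting the normal tail probability. A stdlib-only sketch, using bisection on the complementary error function:

```python
import math

def p_to_sigma(p):
    """Convert a one-sided normal tail probability to the equivalent
    number of standard deviations (bisection on erfc)."""
    lo, hi = 0.0, 40.0
    for _ in range(200):
        mid = (lo + hi) / 2
        # one-sided tail probability beyond `mid` sigma
        if 0.5 * math.erfc(mid / math.sqrt(2)) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Feeding in the reported probability of 3.74×10⁻³¹ returns a value close to the 11.5σ quoted in the abstract.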
NASA Astrophysics Data System (ADS)
Slater, Stephanie
2009-05-01
The Test Of Astronomy STandards (TOAST) assessment instrument is a multiple-choice survey tightly aligned to the consensus learning goals stated by the American Astronomical Society's Chair's Conference on ASTRO 101, the American Association for the Advancement of Science's Project 2061 Benchmarks, and the National Research Council's National Science Education Standards. Researchers from the Cognition in Astronomy, Physics and Earth sciences Research (CAPER) Team at the University of Wyoming's Science and Math Teaching Center (UWYO SMTC) have been conducting a question-by-question distractor analysis procedure to determine the sensitivity and effectiveness of each item. In brief, the frequency of each possible answer choice, known as a foil or distractor on a multiple-choice test, is determined and compared to the existing literature on the teaching and learning of astronomy. In addition to having acceptable statistical difficulty and discrimination values, a well-functioning assessment item will show students selecting distractors in proportions consistent with known misconceptions and reasoning difficulties. Our distractor analysis suggests that all items are functioning as expected. These results add weight to the validity of the TOAST instrument, which is designed to help instructors and researchers measure the impact of course-length instructional strategies for undergraduate science survey courses with learning goals tightly aligned to the consensus goals of the astronomy education community.
Evaluating effectiveness of dynamic soundfield system in the classroom.
da Cruz, Aline Duarte; Alves Silvério, Kelly Cristina; Da Costa, Aline Roberta Aceituno; Moret, Adriane Lima Mortari; Lauris, José Roberto Pereira; de Souza Jacob, Regina Tangerino
2016-01-01
Research has reported on the use of soundfield amplification devices in the classroom. However, no study has used standardized tests to determine the potential advantages of the dynamic soundfield system for normally hearing students and for the teacher's voice. Our aim was to evaluate the impact of using a dynamic soundfield system on classroom noise, the teacher's voice, and students' academic performance. This was a prospective cohort study in which 20 students enrolled in the third year of basic education were divided into two groups (control and experimental); their teacher also participated. The experimental group was exposed to the dynamic soundfield system for 3 consecutive months. The groups were assessed using standardized tests to evaluate their academic performance. Further, questionnaires and statements were collected on the participants' experience of using the soundfield system. We statistically analyzed the results to compare the academic performance of the control group with that of the experimental group. In all cases, a significance level of P < .05 was adopted. Use of the dynamic soundfield system was effective in improving the students' performance on standardized reading tests, improving the teacher's speech intelligibility, and reducing the teacher's vocal strain. The dynamic soundfield system minimizes the impact of noise in the classroom, as demonstrated by measurement of the signal-to-noise ratio (SNR), pupil performance on standardized reading tests, and student and teacher ratings of amplification system effectiveness.
Statistical reporting inconsistencies in experimental philosophy
Colombo, Matteo; Duev, Georgi; Nuijten, Michèle B.; Sprenger, Jan
2018-01-01
Experimental philosophy (x-phi) is a young field of research in the intersection of philosophy and psychology. It aims to make progress on philosophical questions by using experimental methods traditionally associated with the psychological and behavioral sciences, such as null hypothesis significance testing (NHST). Motivated by recent discussions about a methodological crisis in the behavioral sciences, questions have been raised about the methodological standards of x-phi. Here, we focus on one aspect of this question, namely the rate of inconsistencies in statistical reporting. Previous research has examined the extent to which published articles in psychology and other behavioral sciences present statistical inconsistencies in reporting the results of NHST. In this study, we used the R package statcheck to detect statistical inconsistencies in x-phi, and compared rates of inconsistencies in psychology and philosophy. We found that rates of inconsistencies in x-phi are lower than in the psychological and behavioral sciences. From the point of view of statistical reporting consistency, x-phi seems to do no worse, and perhaps even better, than psychological science. PMID:29649220
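The kind of consistency check statcheck performs, recomputing a p-value from the reported test statistic and degrees of freedom, can be sketched in Python. statcheck itself is an R package; the integration-based t distribution below is a stdlib stand-in for its recomputation step:

```python
import math

def t_two_tailed_p(t, df, steps=20000):
    """Two-tailed p-value for a t statistic, via composite Simpson
    integration of the t density (stdlib-only sketch)."""
    t = abs(t)
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: c * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    area = pdf(0) + pdf(t)
    for i in range(1, steps):           # steps must be even (20000 is)
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = area * h / 3              # P(0 < T < t)
    return 1 - 2 * central              # two-tailed tail mass

def is_consistent(t, df, reported_p, tol=0.0005):
    """Flag whether a reported p-value agrees with its test statistic."""
    return abs(t_two_tailed_p(t, df) - reported_p) < tol
```

An inconsistency flagged this way is what the study tallies when comparing x-phi with the psychological literature.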
Advances in Statistical Methods for Substance Abuse Prevention Research
MacKinnon, David P.; Lockwood, Chondra M.
2010-01-01
The paper describes advances in statistical methods for prevention research with a particular focus on substance abuse prevention. Standard analysis methods are extended to the typical research designs and characteristics of the data collected in prevention research. Prevention research often includes longitudinal measurement, clustering of data in units such as schools or clinics, missing data, and categorical as well as continuous outcome variables. Statistical methods to handle these features of prevention data are outlined. Developments in mediation, moderation, and implementation analysis allow for the extraction of more detailed information from a prevention study. Advancements in the interpretation of prevention research results include more widespread calculation of effect size and statistical power, the use of confidence intervals as well as hypothesis testing, detailed causal analysis of research findings, and meta-analysis. The increased availability of statistical software has contributed greatly to the use of new methods in prevention research. It is likely that the Internet will continue to stimulate the development and application of new methods. PMID:12940467
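One of the reporting practices the paper advocates, effect sizes with confidence intervals, can be illustrated with Cohen's d and its approximate large-sample interval (textbook formulas, shown here as a sketch only):

```python
import math

def cohens_d_ci(x, y, z=1.96):
    """Cohen's d for two groups, with an approximate 95% confidence
    interval based on the large-sample standard error of d."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    sp = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))   # pooled SD
    d = (m1 - m2) / sp
    se = math.sqrt((n1 + n2) / (n1 * n2) + d * d / (2 * (n1 + n2)))
    return d, (d - z * se, d + z * se)
```

Reporting the interval alongside d conveys both the magnitude and the precision of an effect, rather than only a hypothesis-test verdict.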
On Teaching about the Coefficient of Variation in Introductory Statistics Courses
ERIC Educational Resources Information Center
Trafimow, David
2014-01-01
The standard deviation is related to the mean by virtue of the coefficient of variation. Teachers of statistics courses can make use of that fact to make the standard deviation more comprehensible for statistics students.
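The relationship the article exploits is simply CV = SD / mean, so the standard deviation can be read as a fraction of the mean:

```python
import statistics

def coefficient_of_variation(data):
    """CV = standard deviation / mean: expresses the SD as a fraction
    of the mean, the link the article suggests using in class."""
    return statistics.stdev(data) / statistics.fmean(data)
```

For example, for scores [2, 4, 4, 4, 5, 5, 7, 9] (mean 5) the CV is about 0.43: the typical deviation is roughly 43% of the mean, which gives students a concrete sense of scale for the SD.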
North American Contact Dermatitis Group patch test results: 2009 to 2010.
Warshaw, Erin M; Belsito, Donald V; Taylor, James S; Sasseville, Denis; DeKoven, Joel G; Zirwas, Matthew J; Fransway, Anthony F; Mathias, C G Toby; Zug, Kathryn A; DeLeo, Vincent A; Fowler, Joseph F; Marks, James G; Pratt, Melanie D; Storrs, Frances J; Maibach, Howard I
2013-01-01
Patch testing is an important diagnostic tool for determination of substances responsible for allergic contact dermatitis. This study reports the North American Contact Dermatitis Group (NACDG) patch testing results from January 1, 2009, to December 31, 2010. At 12 centers in North America, patients were tested in a standardized manner with a screening series of 70 allergens. Data were manually verified and entered into a central database. Descriptive frequencies were calculated, and trends were analyzed using χ2 statistics. A total of 4308 patients were tested. Of these, 2614 (60.7%) had at least 1 positive reaction, and 2284 (46.3%) were ultimately determined to have a primary diagnosis of allergic contact dermatitis. Four hundred twenty-seven (9.9%) patients had occupationally related skin disease. There were 6855 positive allergic reactions. As compared with the previous reporting period (2007-2008), the positive reaction rates statistically decreased for 20 allergens (nickel, neomycin, Myroxylon pereirae, cobalt, formaldehyde, quaternium 15, methyldibromoglutaronitrile/phenoxyethanol, methylchlorisothiazolinone/methylisothiazolinone, potassium dichromate, diazolidinyl urea, propolis, dimethylol dimethylhydantoin, 2-bromo-2-nitro-1,3-propanediol, methyl methacrylate, ethyl acrylate, glyceryl thioglycolate, dibucaine, amidoamine, clobetasol, and dimethyloldihydroxyethyleneurea; P < 0.05) and statistically increased for 4 allergens (fragrance mix II, iodopropynyl butylcarbamate, propylene glycol, and benzocaine; P < 0.05). Approximately one quarter of tested patients had at least 1 relevant allergic reaction to a non-NACDG allergen. Hypothetically, approximately one quarter of reactions detected by NACDG allergens would have been missed by TRUE TEST (SmartPractice Denmark, Hillerød, Denmark). These results affirm the value of patch testing with many allergens.
Zheng, Jie; Harris, Marcelline R; Masci, Anna Maria; Lin, Yu; Hero, Alfred; Smith, Barry; He, Yongqun
2016-09-14
Statistics play a critical role in biological and clinical research. However, most reports of scientific results in the published literature make it difficult for the reader to reproduce the statistical analyses performed in achieving those results because they provide inadequate documentation of the statistical tests and algorithms applied. The Ontology of Biological and Clinical Statistics (OBCS) is put forward here as a step towards solving this problem. The terms in OBCS, including 'data collection', 'data transformation in statistics', 'data visualization', 'statistical data analysis', and 'drawing a conclusion based on data', cover the major types of statistical processes used in basic biological research and clinical outcome studies. OBCS is aligned with the Basic Formal Ontology (BFO) and extends the Ontology of Biomedical Investigations (OBI), an OBO (Open Biological and Biomedical Ontologies) Foundry ontology supported by over 20 research communities. Currently, OBCS comprises 878 terms, representing 20 BFO classes, 403 OBI classes, 229 OBCS-specific classes, and 122 classes imported from ten other OBO ontologies. We discuss two examples illustrating how the ontology is being applied. In the first (biological) use case, we describe how OBCS was applied to represent the high-throughput microarray data analysis of immunological transcriptional profiles in human subjects vaccinated with an influenza vaccine. In the second (clinical outcomes) use case, we applied OBCS to represent the processing of electronic health care data to determine the associations between hospital staffing levels and patient mortality. Our case studies were designed to show how OBCS can be used for the consistent representation of statistical analysis pipelines under two different research paradigms. Other ongoing projects using OBCS for statistical data processing are also discussed. The OBCS source code and documentation are available at: https://github.com/obcs/obcs.
The Ontology of Biological and Clinical Statistics (OBCS) is a community-based open source ontology in the domain of biological and clinical statistics. OBCS is a timely ontology that represents statistics-related terms and their relations in a rigorous fashion, facilitates standard data analysis and integration, and supports reproducible biological and clinical research.
Estimating error statistics for Chambon-la-Forêt observatory definitive data
NASA Astrophysics Data System (ADS)
Lesur, Vincent; Heumez, Benoît; Telali, Abdelkader; Lalanne, Xavier; Soloviev, Anatoly
2017-08-01
We propose a new algorithm for calibrating definitive observatory data with the goal of providing users with estimates of the data error standard deviations (SDs). The algorithm has been implemented and tested using Chambon-la-Forêt observatory (CLF) data. The calibration process uses all available data. It is set up as a large, weakly non-linear inverse problem that ultimately provides estimates of baseline values in three orthogonal directions, together with their expected standard deviations. For this inverse problem, absolute data error statistics are estimated from two series of absolute measurements made within a day. Similarly, variometer data error statistics are derived by comparing variometer data time series between different pairs of instruments over a few years. The comparisons of these time series led us to use an autoregressive process of order 1 (AR1 process) as a prior for the baselines. Therefore the obtained baselines do not vary smoothly in time. They have relatively small SDs, well below 300 pT when absolute data are recorded twice a week, i.e. within the daily-to-weekly measurement frequency recommended by INTERMAGNET. The algorithm was tested against the process traditionally used to derive baselines at CLF observatory, suggesting that the error statistics are less favourable when the latter process is used. Finally, two sets of definitive data were calibrated using the new algorithm. Their comparison shows that the definitive data SDs are less than 400 pT and may be slightly overestimated by our process: an indication that more work is required to obtain proper estimates of absolute data error statistics. For magnetic field modelling, the results show that even at isolated sites like CLF observatory there are very localised signals over a large span of temporal frequencies, which can be as large as 1 nT. The SDs reported here encompass signals with spatial scales of a few hundred metres and periods of less than a day.
Sources of Error and the Statistical Formulation of MS:mb Seismic Event Screening Analysis
NASA Astrophysics Data System (ADS)
Anderson, D. N.; Patton, H. J.; Taylor, S. R.; Bonner, J. L.; Selby, N. D.
2014-03-01
The Comprehensive Nuclear-Test-Ban Treaty (CTBT), a global ban on nuclear explosions, is currently in a ratification phase. Under the CTBT, an International Monitoring System (IMS) of seismic, hydroacoustic, infrasonic and radionuclide sensors is operational, and the data from the IMS are analysed by the International Data Centre (IDC). The IDC provides CTBT signatories basic seismic event parameters and a screening analysis indicating whether an event exhibits explosion characteristics (for example, shallow depth). An important component of the screening analysis is a statistical test of the null hypothesis H0: explosion characteristics, using empirical measurements of seismic energy (magnitudes). The established magnitude used for event size is the body-wave magnitude (denoted mb), computed from the initial segment of a seismic waveform. IDC screening analysis is applied to events with mb greater than 3.5. The Rayleigh-wave magnitude (denoted MS) is a measure of later-arriving surface-wave energy. Magnitudes are measurements of seismic energy that include adjustments (a physical correction model) for path and distance effects between event and station. Relative to mb, earthquakes generally have a larger MS magnitude than explosions. This article proposes a hypothesis test (screening analysis) using MS and mb that expressly accounts for physical correction model inadequacy in the standard error of the test statistic. With this hypothesis test formulation, the 2009 Democratic People's Republic of Korea announced nuclear weapon test fails to reject the null hypothesis H0: explosion characteristics.
Hayat, Matthew J
2014-04-01
Statistics coursework is usually a core curriculum requirement for nursing students at all degree levels. The American Association of Colleges of Nursing (AACN) establishes curriculum standards for academic nursing programs. However, the AACN provides little guidance on statistics education and does not offer standardized competency guidelines or recommendations about course content or learning objectives. Published standards may be used in the course development process to clarify course content and learning objectives. This article includes suggestions for implementing and integrating recommendations given in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) report into statistics education for nursing students. Copyright 2014, SLACK Incorporated.
Sánchez Cuén, Jaime Alberto; Irineo Cabrales, Ana Bertha; León Sicairos, Nidia Maribel; Calderón Zamora, Loranda; Monroy Higuera, Luis; Canizalez Román, Vicente Adrián
2017-11-01
After eradication treatment for Helicobacter pylori, infection can recur due to recrudescence or re-infection. The objective of this study was to determine the recurrence of Helicobacter pylori infection and identify virulent Helicobacter pylori strains one year after eradication with standard triple therapy. A quasi-experimental study was performed that included a patient population with digestive diseases associated with Helicobacter pylori who had received standard triple therapy. Culture and polymerase chain reaction were performed on gastric biopsies for strain identification in all patients prior to eradication treatment and in those with a positive carbon-14 breath test one year after eradication treatment. Statistical analysis was performed using Student's t-test and Fisher's exact test; statistical significance was set at 0.05. 128 patients were studied: 51 (39.8%) were male and 77 (60.2%) were female, with an average age of 54.8 years (SD 13.8). There was an annual recurrence of Helicobacter pylori infection in 12 (9.3%) patients. Annual re-infection and recrudescence occurred in 9 (7%) and 3 (2.3%) patients, respectively. The recrudescence rate for cagA was 1/30 (3.3%) patients and 2/112 (1.8%) patients for vacA. The re-infection rate for cagA was 3/30 (10%) patients and 6/112 (5.3%) patients for vacA. The recurrence of infection in this study was higher than that recorded in developed countries with a low prevalence of H. pylori and lower than that recorded in developing countries with a higher prevalence of H. pylori. The cagA or vacA s2/m2 strains were isolated after re-infection and recrudescence.
Castro-Almarales, Raúl Lázaro; Álvarez-Castelló, Mirta; Ronquillo-Díaz, Mercedes; Rodríguez-Canosa, José S; González-León, Mayda; Navarro-Viltre, Bárbara I; Betancourt-Mesia, Daniel; Enríquez-Domínguez, Irene; Reyes-Zamora, Mary Carmen; Oliva-Díaz, Yunia; Mateo-Morejón, Maytee; Labrada-Rosado, Alexis
2016-01-01
Diagnostic options for immune reactions to mosquito bites are limited. In Cuba, IgE-mediated reactions are frequently related to Culex quinquefasciatus bites. The aim was to determine the sensitivity and specificity of the skin prick test with two doses of a standardized extract, in protein nitrogen units (PNU), of Culex quinquefasciatus (BIOCEN, Cuba). An analytical study was conducted on 100 children between 2 and 15 years old: fifty atopic patients with a history of allergy to mosquito bites and positive specific serum IgE to Culex quinquefasciatus, and fifty atopic patients without a history of allergy to mosquito bites and negative specific serum IgE to Culex quinquefasciatus. Skin prick tests (SPT) were performed in duplicate on the forearms of the patients. The investigated doses were 100 PNU/mL and 10 PNU/mL. SPT with the higher concentration yielded a mean wheal size of 22.09 mm2 and the lower dose one of 8.09 mm2, a statistically significant difference (p=0.001, Student's t-test). A positive skin test correlated in 100% of patients with the presence of specific IgE. Testing with both doses showed 94% specificity and 88% sensitivity. The diagnostic accuracy of SPT using either dose of the standardized extract was similar, which justifies its use for diagnosis of sensitization to Culex quinquefasciatus in patients with symptoms of allergy to mosquito bites.
Chemokine Prostate Cancer Biomarkers — EDRN Public Portal
STUDY DESIGN 1. The need for pre-validation studies. Preliminary data from our laboratory demonstrate a potential utility for CXCL5 and CXCL12 as biomarkers to distinguish between patients at high risk versus low risk for harboring prostate malignancies. However, this pilot and feasibility study utilized a very small sample size of 51 patients, which limited its ability to adequately assess certain technical aspects of the ELISA technique and statistical aspects of the resulting data. We therefore propose studies designed to assess the robustness (Specific Aim 1) and predictive value (Specific Aim 2) of these markers in a larger study population. 2. ELISA Assays. Serum, plasma, or urine chemokine levels are assessed using 50 ul of frozen specimen per sandwich ELISA, in duplicate, using the appropriate commercially available capture antibodies, detection antibodies, and standard ELISA reagents (R&D Systems), as we have described previously (15, 17, 18). Measures within each patient group are regarded as biological replicates and permit statistical comparisons between groups. For all ELISAs, a standard curve is generated with the provided standards and used to calculate the quantity of chemokine in the sample tested. These assays provide measures of protein concentration with excellent reproducibility, with replicate measures characterized by standard deviations from the mean on the order of <3%.
An experiment with spectral analysis of emotional speech affected by orthodontic appliances
NASA Astrophysics Data System (ADS)
Přibil, Jiří; Přibilová, Anna; Ďuračková, Daniela
2012-11-01
The contribution describes the effect of fixed and removable orthodontic appliances on the spectral properties of emotional speech. Spectral changes were analyzed and evaluated by spectrograms and mean Welch's periodograms. This alternative approach to the standard listening test enables an objective comparison based on statistical analysis by ANOVA and hypothesis tests. The results of the analysis, performed on short sentences of a female speaker in four emotional states (joyous, sad, angry, and neutral), show that it is first of all the removable orthodontic appliance that affects the spectrograms of the produced speech.
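A mean Welch periodogram comparison of this kind can be sketched with SciPy. The signals below are synthetic stand-ins (a tone plus noise at two noise levels), not the study's speech recordings; sampling rate and segment length are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

fs = 16000                                   # sampling rate (Hz), assumed
rng = np.random.default_rng(1)
t = np.arange(fs) / fs                       # one second of signal

# synthetic stand-ins for two recording conditions: same tone, noisier second
sig_a = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.normal(size=fs)
sig_b = np.sin(2 * np.pi * 440 * t) + 0.5 * rng.normal(size=fs)

# mean Welch periodograms: power spectral density averaged over segments
f_a, p_a = welch(sig_a, fs=fs, nperseg=1024)
f_b, p_b = welch(sig_b, fs=fs, nperseg=1024)

peak_hz = f_a[np.argmax(p_a)]                # dominant frequency, near 440 Hz
```

The periodogram vectors `p_a` and `p_b` computed per condition could then be fed into an ANOVA or hypothesis test, as the abstract describes.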
Permutation tests for goodness-of-fit testing of mathematical models to experimental data.
Fişek, M Hamit; Barlas, Zeynep
2013-03-01
This paper presents statistical procedures for improving the goodness-of-fit testing of theoretical models to data obtained from laboratory experiments. We use an experimental study in the expectation states research tradition, carried out in the "standardized experimental situation" associated with the program, to illustrate the application of our procedures. We briefly review the expectation states research program and the fundamentals of resampling statistics as we develop our procedures in the resampling context. The first procedure we develop is a modification of the chi-square test, which has been the primary statistical tool for assessing goodness of fit in the EST research program but has problems associated with its use. We discuss these problems and suggest a procedure to overcome them. The second procedure we present, the "Average Absolute Deviation" test, is a new test proposed as an alternative to the chi-square test, being simpler and more informative. The third and fourth procedures are permutation versions of Jonckheere's test for ordered alternatives and Kendall's tau-b, a rank-order correlation coefficient. The fifth procedure is a new rank-order goodness-of-fit test, which we call the "Deviation from Ideal Ranking" index, and which we believe may be more useful than other rank-order tests for assessing the goodness-of-fit of models to experimental data. The application of these procedures to the sample data is illustrated in detail. We then present another laboratory study from an experimental paradigm different from the expectation states paradigm, the "network exchange" paradigm, and describe how our procedures may be applied to this data set. Copyright © 2012 Elsevier Inc. All rights reserved.
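A minimal sketch of an "Average Absolute Deviation" style resampling test follows, under the assumption that the model predicts a binomial proportion per experimental condition; the condition counts and probabilities are hypothetical, and this is not the authors' code.

```python
import numpy as np

def aad(observed, expected):
    """Average absolute deviation between observed and expected proportions."""
    return np.mean(np.abs(observed - expected))

def resampling_aad_test(counts, probs, n_trials, n_resamples=2000, seed=0):
    """Parametric-resampling p-value for goodness of fit: simulate data
    under the model's predicted probabilities and ask how often the
    simulated AAD is at least as large as the observed AAD."""
    rng = np.random.default_rng(seed)
    observed = counts / n_trials
    stat = aad(observed, probs)
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        sim = rng.binomial(n_trials, probs) / n_trials
        null[i] = aad(sim, probs)
    return stat, np.mean(null >= stat)

# hypothetical example: model predicts a choice probability per condition
probs = np.array([0.7, 0.5, 0.3])
counts = np.array([68, 55, 27])      # observed successes out of 100 trials each
stat, p = resampling_aad_test(counts, probs, n_trials=100)
```

A large p-value here means the observed deviations are no bigger than sampling noise would produce if the model were exactly true, which is the sense in which the AAD test assesses fit.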
Truly random number generation: an example
NASA Astrophysics Data System (ADS)
Frauchiger, Daniela; Renner, Renato
2013-10-01
Randomness is crucial for a variety of applications, ranging from gambling to computer simulations, and from cryptography to statistics. However, many of the currently used methods for generating randomness do not meet the criteria that are necessary for these applications to work properly and safely. A common problem is that a sequence of numbers may look random but nevertheless not be truly random. In fact, the sequence may pass all standard statistical tests and yet be perfectly predictable. This renders it useless for many applications. For example, in cryptography, the predictability of a "randomly" chosen password is obviously undesirable. Here, we review a recently developed approach to generating true, and hence unpredictable, randomness.
Measurement of the relationship between perceived and computed color differences
NASA Astrophysics Data System (ADS)
García, Pedro A.; Huertas, Rafael; Melgosa, Manuel; Cui, Guihua
2007-07-01
Using simulated data sets, we have analyzed some mathematical properties of different statistical measurements that have been employed in previous literature to test the performance of different color-difference formulas. Specifically, the properties of the combined index PF/3 (performance factor obtained as the average of three terms), widely employed in the current literature, have been considered. A new index named standardized residual sum of squares (STRESS), employed in multidimensional scaling techniques, is recommended. The main difference between PF/3 and STRESS is that the latter is simpler and allows inferences on the statistical significance of the difference between two color-difference formulas with respect to a given set of visual data.
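The STRESS index has a simple closed form. The sketch below uses the standard definition, in which F1 is the least-squares scaling factor between computed color differences and visual differences; the sample numbers are illustrative.

```python
import numpy as np

def stress(dE, dV):
    """STRESS index (%) between computed color differences dE and visual
    differences dV. 0 means the two agree up to a scale factor; larger
    values mean worse agreement."""
    dE, dV = np.asarray(dE, float), np.asarray(dV, float)
    F1 = np.sum(dE * dV) / np.sum(dV ** 2)    # optimal scaling factor
    return 100.0 * np.sqrt(np.sum((dE - F1 * dV) ** 2)
                           / np.sum(F1 ** 2 * dV ** 2))

s = stress([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # proportional data → 0.0
```

Because STRESS only penalizes deviation after optimal rescaling, two formulas that differ by a constant factor are treated as equivalent, which is what makes significance comparisons between formulas straightforward.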
Statistical testing of the full-range leadership theory in nursing.
Kanste, Outi; Kääriäinen, Maria; Kyngäs, Helvi
2009-12-01
The aim of this study is to test statistically the structure of the full-range leadership theory in nursing. The data were gathered by postal questionnaires from nurses and nurse leaders working in healthcare organizations in Finland. A follow-up study was performed 1 year later. The sample consisted of 601 nurses and nurse leaders, and the follow-up study had 78 respondents. The theory was tested through structural equation modelling, standard regression analysis and two-way ANOVA. Rewarding transformational leadership seems to promote, and passive laissez-faire leadership to reduce, willingness to exert extra effort, perceptions of leader effectiveness and satisfaction with the leader. Active management-by-exception seems to reduce willingness to exert extra effort and perceptions of leader effectiveness. Rewarding transformational leadership remained a strong explanatory factor for all outcome variables measured 1 year later. The data supported the main structure of the full-range leadership theory, lending support to the universal nature of the theory.
Jansma, J Martijn; de Zwart, Jacco A; van Gelderen, Peter; Duyn, Jeff H; Drevets, Wayne C; Furey, Maura L
2013-01-01
Technical developments in MRI have improved signal-to-noise ratios, allowing the use of analysis methods such as finite impulse response (FIR) analysis of rapid event-related functional MRI (er-fMRI). FIR is one of the most informative analysis methods, as it determines the onset and full shape of the hemodynamic response function (HRF) without any a priori assumptions. FIR is, however, vulnerable to multicollinearity, which is directly related to the distribution of stimuli over time. Efficiency can be optimized by simplifying a design and restricting the stimulus distribution to specific sequences, while more design flexibility necessarily reduces efficiency. However, the actual effect of efficiency on fMRI results has never been tested in vivo. Thus, it is currently difficult to make an informed choice between protocol flexibility and statistical efficiency. The main goal of this study was to assign concrete fMRI signal-to-noise values to the abstract scale of FIR statistical efficiency. Ten subjects repeated a perception task with five random and m-sequence-based protocols, with varying but, according to the literature, acceptable levels of multicollinearity. Results indicated substantial differences in signal standard deviation, whose level was a function of multicollinearity. Experiment protocols varied by up to 55.4% in standard deviation. The results confirm that the quality of fMRI in an FIR analysis can significantly and substantially vary with statistical efficiency. Our in vivo measurements can be used to aid in making an informed decision between freedom in protocol design and statistical efficiency. PMID:23473798
Hall, Elizabeth A.; Prendergast, Michael L.; Roll, John M.; Warda, Umme
2010-01-01
This study assessed a 26-week voucher-based intervention to reinforce abstinence and participation in treatment-related activities among substance-abusing offenders court-referred to outpatient treatment under drug diversion legislation (California's Substance Abuse and Crime Prevention Act). Standard treatment consisted of criminal justice supervision and an evidence-based model for treating stimulant abuse. Participants were randomly assigned to four groups: standard treatment (ST) only; ST plus vouchers for testing negative; ST plus vouchers for performing treatment plan activities; and ST plus vouchers for testing negative and/or performing treatment plan activities. Results indicate that voucher-based reinforcement of negative urines and of treatment plan tasks (using a flat reinforcement schedule) showed no statistically significant effects on measures of retention or drug use relative to the standard treatment protocol. It is likely that criminal justice contingencies had a stronger impact on participants' treatment retention and drug use than the relatively low-value vouchers awarded as part of the treatment protocol. PMID:20463918
Nevada Applied Ecology Group procedures handbook for environmental transuranics
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, M.G.; Dunaway, P.B.
The activities of the Nevada Applied Ecology Group (NAEG) integrated research studies of environmental plutonium and other transuranics at the Nevada Test Site have required many standardized field and laboratory procedures. These include sampling techniques, collection and preparation, radiochemical and wet chemistry analysis, data bank storage and reporting, and statistical considerations for environmental samples of soil, vegetation, resuspended particles, animals, and other biological material. This document, printed in two volumes, includes most of the Nevada Applied Ecology Group standard procedures, with explanations as to the specific applications involved in the environmental studies. Where there is more than one document concerning a procedure, it has been included to indicate special studies or applications perhaps more complex than the routine standard sampling procedures utilized.
ERIC Educational Resources Information Center
Hines, Robert James
The study, conducted in the Buffalo, New York standard metropolitan statistical area, was undertaken to formulate and test a simple model of labor supply for a local labor market. The principal variables to be examined to determine the external supply function of labor to the establishment are variants of the rate of change of the entry wage and…
ERIC Educational Resources Information Center
Witmer, David R.
A search for better predictors of incomes of high school and college graduates is described. The accuracy of the prediction, implicit in the work of John R. Walsh of Harvard University, that the income differences in a given year are good indicators of income differences in future years, was tested by applying standard statistical procedures to…
The Black-White Difference in Youth Employment: Evidence for Demand-Side Factors.
ERIC Educational Resources Information Center
Cain, Glen G.; Finnie, Ross
The 1980 Census of the United States is used, first, to illustrate the serious lag in employment performance of young black men relative to young white men and, second, to test for the importance of demand-side causes of this lag. Aggregate data for 94 standard metropolitan statistical areas (SMSAs) include the annual hours worked in 1979…
ERIC Educational Resources Information Center
Vinson, R. B.
In this report, the author suggests changes in the treatment of overhead costs by hypothesizing that "the effectiveness of standard costing in planning and controlling overhead costs can be increased through the use of probability theory and associated statistical techniques." To test the hypothesis, the author (1) presents an overview of the…
The impact of testing accommodations on MCAT scores: descriptive results.
Julian, Ellen R; Ingersoll, Deborah J; Etienne, Patricia M; Hilger, Anthony E
2004-04-01
Medical College Admission Test (MCAT) examinees with disabilities who receive accommodations receive flagged scores indicating nonstandard administration. This report compares the performance of MCAT examinees who received accommodations with that of standard examinees. Aggregate history records of all 1994-2000 MCAT examinees were identified as flagged (2,401) or standard (297,880), then further sorted by race/ethnicity (broadly identified as underrepresented minority [URM] and non-URM, at the time of testing) and gender. Those with flagged scores were also classified by disability (LD = learning disability, ADHD = attention deficit hyperactivity disorder, LD/ADHD = both, and Other = other disability) and type of accommodation. Mean MCAT scores were calculated for all groups. A group of 866 examinees took the MCAT first as a standard administration and subsequently with accommodations; in a separate analysis, their two sets of scores were compared. Less than 1% of examinees (2,401) had accommodations; of these, 55% were LD, 17% ADHD, 5% LD/ADHD, and 23% Other. Extended time was the most frequently provided accommodation. Mean flagged scores slightly exceeded mean standard scores on all MCAT sections. Examinees who retook the MCAT with accommodations after a standard administration increased their scores by six points, quadrupling the average gain of the Standard-Standard retest cohort from another study. The small but statistically significantly higher flagged scores may reflect either appropriate compensation or overly generous accommodations. Extended time had a positive impact on the scores of those who retested with this accommodation. The validity of the flagged MCAT in predicting success in medical school is not known, and further investigation is underway.
NASA Astrophysics Data System (ADS)
Leka, K. D.; Barnes, G.
2003-10-01
We apply statistical tests based on discriminant analysis to the wide range of photospheric magnetic parameters described in a companion paper by Leka & Barnes, with the goal of identifying those properties that are important for the production of energetic events such as solar flares. The photospheric vector magnetic field data from the University of Hawai'i Imaging Vector Magnetograph are well sampled both temporally and spatially, and we include here data covering 24 flare-event and flare-quiet epochs taken from seven active regions. The mean value and rate of change of each magnetic parameter are treated as separate variables, thus evaluating both the parameter's state and its evolution, to determine which properties are associated with flaring. Considering single variables first, Hotelling's T2-tests show small statistical differences between flare-producing and flare-quiet epochs. Even pairs of variables considered simultaneously, which do show a statistical difference for a number of properties, have high error rates, implying a large degree of overlap of the samples. To better distinguish between flare-producing and flare-quiet populations, larger numbers of variables are simultaneously considered; lower error rates result, but no unique combination of variables is clearly the best discriminator. The sample size is too small to directly compare the predictive power of large numbers of variables simultaneously. Instead, we rank all possible four-variable permutations based on Hotelling's T2-test and look for the most frequently appearing variables in the best permutations, with the interpretation that they are most likely to be associated with flaring. These variables include an increasing kurtosis of the twist parameter and a larger standard deviation of the twist parameter, but a smaller standard deviation of the distribution of the horizontal shear angle and a horizontal field that has a smaller standard deviation but a larger kurtosis. 
To support the "sorting all permutations" method of selecting the most frequently occurring variables, we show that the results of a single 10-variable discriminant analysis are consistent with the ranking. We demonstrate that individually, the variables considered here have little ability to differentiate between flaring and flare-quiet populations, but with multivariable combinations, the populations may be distinguished.
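Hotelling's T²-test used to rank variable combinations above can be sketched for the two-sample case. The data here are random draws with an artificial mean shift, not the magnetogram parameters from the study.

```python
import numpy as np
from scipy import stats

def hotelling_t2(X, Y):
    """Two-sample Hotelling T^2 test for equal mean vectors.
    Returns (T2, F statistic, p-value) using the pooled covariance."""
    n1, p = X.shape
    n2, _ = Y.shape
    d = X.mean(axis=0) - Y.mean(axis=0)
    S = ((n1 - 1) * np.cov(X, rowvar=False)
         + (n2 - 1) * np.cov(Y, rowvar=False)) / (n1 + n2 - 2)
    T2 = (n1 * n2) / (n1 + n2) * d @ np.linalg.solve(S, d)
    F = T2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    pval = stats.f.sf(F, p, n1 + n2 - p - 1)
    return T2, F, pval

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, (30, 2))     # "flare-quiet" sample, 2 variables
Y = rng.normal(2.0, 1.0, (30, 2))     # "flare-event" sample, shifted mean
T2, F, pval = hotelling_t2(X, Y)
```

A small p-value indicates the two mean vectors differ; as the abstract notes, a significant difference can still coexist with heavily overlapping samples and hence high classification error rates.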
Soman, Gopalan; Yang, Xiaoyi; Jiang, Hengguang; Giardina, Steve; Vyas, Vinay; Mitra, George; Yovandich, Jason; Creekmore, Stephen P; Waldmann, Thomas A; Quiñones, Octavio; Alvord, W Gregory
2009-08-31
A colorimetric cell proliferation assay using a soluble tetrazolium salt [CellTiter 96(R) Aqueous One Solution cell proliferation reagent, containing 3-(4,5-dimethylthiazol-2-yl)-5-(3-carboxymethoxyphenyl)-2-(4-sulfophenyl)-2H-tetrazolium, inner salt, and an electron coupling reagent, phenazine ethosulfate] was optimized and qualified for quantitative determination of IL-15-dependent CTLL-2 cell proliferation activity. An in-house recombinant human (rHu)IL-15 reference lot was standardized (IU/mg) against an international reference standard. Specificity of the assay for IL-15 was documented by illustrating the ability of neutralizing anti-IL-15 antibodies to block product-specific CTLL-2 cell proliferation and the lack of a blocking effect with anti-IL-2 antibodies. Under the defined assay conditions, the linear dose-response concentration range was between 0.04 and 0.17 ng/ml of the rHuIL-15 produced in-house and 0.5-3.0 IU/ml for the international standard. Statistical analysis of the data was performed with scripts written in the R statistical language and environment utilizing a four-parameter logistic regression fit analysis procedure. The overall variation in the ED50 values for the in-house reference standard from 55 independent estimates performed over a period of 1 year was 12.3% of the average. Excellent intra-plate and within-day/inter-plate consistency was observed for all four parameter estimates in the model. Different preparations of rHuIL-15 showed excellent intra-plate consistency in the parameter estimates corresponding to the lower and upper asymptotes as well as to the 'slope' factor at the mid-point. The ED50 values showed statistically significant differences for different lots and for control versus stressed samples.
Three R scripts improve data analysis capabilities, allowing one to describe assay variations, to draw inferences between data sets from formal statistical tests, and to set up improved assay acceptance criteria based on comparability and consistency in the four parameters of the model. The assay is precise, accurate and robust and can be fully validated. Applications of the assay were established, including process development support, release of the rHuIL-15 product for pre-clinical and clinical studies, and monitoring of storage stability.
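A four-parameter logistic fit of the kind described can be sketched with SciPy's `curve_fit`. The dose-response numbers are made up for illustration, and the parameterization (bottom, top, ED50, slope) is one common convention, not necessarily the one used in the cited R scripts.

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(x, bottom, top, ed50, slope):
    """Four-parameter logistic dose-response model: response rises from
    `bottom` to `top`, reaching the midpoint at dose `ed50`."""
    return bottom + (top - bottom) / (1.0 + (x / ed50) ** slope)

# hypothetical dose-response data (dose in ng/ml); noise-free for illustration
dose = np.array([0.02, 0.04, 0.06, 0.08, 0.12, 0.17, 0.25])
resp = four_pl(dose, 0.1, 1.8, 0.09, -2.5)

params, _ = curve_fit(four_pl, dose, resp, p0=[0.0, 2.0, 0.1, -2.0])
ed50_est = params[2]        # should recover the ED50 used to generate the data
```

Comparing the fitted lower/upper asymptotes, slope, and ED50 across plates or lots, as the abstract describes, gives a concrete basis for assay acceptance criteria.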
Sánchez, Guillermo; Nova, John; Arias, Nilsa; Peña, Bibiana
2008-12-01
The Fitzpatrick phototype scale has been used to determine skin sensitivity to ultraviolet light. The reliability of this scale in estimating sensitivity permits risk evaluation of skin cancer based on phototype. Reliability and changes in intra- and inter-observer concordance were determined for the Fitzpatrick phototype scale after the assessment methods for establishing the phototype were standardized. An analytical study of intra- and inter-observer concordance was performed. The Fitzpatrick phototype scale was standardized using focus group methodology. To determine intra- and inter-observer agreement, the weighted kappa statistic was applied. The standardization effect was measured using the equal-kappa contrast hypothesis and the Wald test for dependent measurements. The phototype scale was applied to 155 patients over 15 years of age who were assessed four times by two independent observers. The sample was drawn from patients of the Centro Dermatológico Federico Lleras Acosta. During the pre-standardization phase, the baseline and six-week inter-observer weighted kappas were 0.31 and 0.40, respectively. The intra-observer kappa values for observers A and B were 0.47 and 0.51, respectively. After the standardization process, the baseline and six-week inter-observer weighted kappa values were 0.77 and 0.82, respectively. Intra-observer kappa coefficients for observers A and B were 0.78 and 0.82. Statistically significant differences were found between coefficients before and after standardization (p<0.001) in all comparisons. Following a standardization exercise, the Fitzpatrick phototype scale yielded reliable, reproducible and consistent results.
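The weighted kappa statistic used above can be computed from a confusion matrix of the two observers' ordinal ratings. The cross-table below is hypothetical, and the choice of linear weights is an assumption (the abstract does not state which weighting scheme was used).

```python
import numpy as np

def weighted_kappa(conf, weights="linear"):
    """Weighted kappa for two raters' ordinal ratings from a k x k
    confusion matrix: disagreement cells are penalized in proportion
    to their distance from the diagonal."""
    conf = np.asarray(conf, dtype=float)
    k = conf.shape[0]
    i, j = np.indices((k, k))
    w = np.abs(i - j) if weights == "linear" else (i - j) ** 2
    n = conf.sum()
    expected = np.outer(conf.sum(axis=1), conf.sum(axis=0)) / n
    return 1.0 - (w * conf).sum() / (w * expected).sum()

# hypothetical 4-phototype cross-table for observers A and B
table = np.array([[20, 3, 0, 0],
                  [4, 35, 5, 0],
                  [0, 6, 40, 2],
                  [0, 0, 3, 15]])
kappa = weighted_kappa(table)
```

Because near-diagonal disagreements are penalized lightly, weighted kappa is better suited than plain kappa for ordinal scales such as phototype, where being one category off matters less than being three off.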
A carcinogenic potency database of the standardized results of animal bioassays
Gold, Lois Swirsky; Sawyer, Charles B.; Magaw, Renae; Backman, Georganne M.; De Veciana, Margarita; Levinson, Robert; Hooper, N. Kim; Havender, William R.; Bernstein, Leslie; Peto, Richard; Pike, Malcolm C.; Ames, Bruce N.
1984-01-01
The preceding paper described our numerical index of carcinogenic potency, the TD50 and the statistical procedures adopted for estimating it from experimental data. This paper presents the Carcinogenic Potency Database, which includes results of about 3000 long-term, chronic experiments of 770 test compounds. Part II is a discussion of the sources of our data, the rationale for the inclusion of particular experiments and particular target sites, and the conventions adopted in summarizing the literature. Part III is a guide to the plot of results presented in Part IV. A number of appendices are provided to facilitate use of the database. The plot includes information about chronic cancer tests in mammals, such as dose and other aspects of experimental protocol, histopathology and tumor incidence, TD50 and its statistical significance, dose response, author's opinion and literature reference. The plot readily permits comparisons of carcinogenic potency and many other aspects of cancer tests; it also provides quantitative information about negative tests. The range of carcinogenic potency is over 10 million-fold. PMID:6525996
MATD Operational Phase: Experiences and Lessons Learned
NASA Astrophysics Data System (ADS)
Messidoro, P.; Bader, M.; Brunner, O.; Cerrato, A.; Sembenini, G.
2004-08-01
The Model And Test Effectiveness Database (MATD) initiative is ending the first year of its operational phase. MATD represents a common repository of project data, Assembly Integration and Verification (AIV) data, and on-ground and flight anomaly data from recent space projects, and offers, through the application of specific methodologies, the possibility of analyzing the collected data in order to improve test philosophies and the related standards. Basically, the following types of results can be derived from the database:
- Statistics on ground failures and flight anomalies
- Feedback from the flight anomalies to the test philosophies
- Test effectiveness evaluation at system and lower levels
- An estimate of the index of effectiveness of a specific model and test philosophy in comparison with the applicable standards
- Simulation of different test philosophies and the related balancing of risk/cost/schedule on the basis of MATD data
After a short presentation of the status of the MATD initiative, the paper summarises the most recent lessons learned resulting from the data analysis and highlights how MATD is being utilized for the actual risk/cost/schedule/test effectiveness evaluations of past programmes as well as for predictions for new space projects.
Prodinger, Birgit; Ballert, Carolina S; Brach, Mirjam; Brinkhof, Martin W G; Cieza, Alarcos; Hug, Kerstin; Jordan, Xavier; Post, Marcel W M; Scheel-Sailer, Anke; Schubert, Martin; Tennant, Alan; Stucki, Gerold
2016-02-01
Functioning is an important outcome to measure in cohort studies. Clear and operational outcomes are needed to judge the quality of a cohort study. This paper outlines guiding principles for reporting functioning in cohort studies and addresses some outstanding issues. Principles of how to standardize reporting of data from a cohort study on functioning, by deriving scores that are most useful for further statistical analysis and reporting, are outlined. The Swiss Spinal Cord Injury Cohort Study Community Survey serves as a case in point to provide a practical application of these principles. Development of reporting scores must be conceptually coherent and metrically sound. The International Classification of Functioning, Disability and Health (ICF) can serve as the frame of reference for this, with its categories serving as reference units for reporting. To derive a score for further statistical analysis and reporting, items measuring a single latent trait must be invariant across groups. The Rasch measurement model is well suited to test these assumptions. Our approach is a valuable guide for researchers and clinicians, as it fosters comparability of data, strengthens the comprehensiveness of scope, and provides invariant, interval-scaled data for further statistical analyses of functioning.
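The invariance requirement described above can be illustrated with the dichotomous Rasch model, under which the probability of endorsing an item depends only on the difference between person ability and item difficulty. The sketch below is a minimal illustration with hypothetical item difficulties, not the authors' analysis (which would use dedicated Rasch software):

```python
import math

def rasch_prob(theta, b):
    """Dichotomous Rasch model: P(X = 1 | theta, b) = exp(theta - b) / (1 + exp(theta - b)),
    where theta is person ability and b is item difficulty (both on the logit scale)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def expected_score(theta, difficulties):
    """Expected raw score over a set of items for a given ability level."""
    return sum(rasch_prob(theta, b) for b in difficulties)

# Hypothetical item difficulties for illustration only.
difficulties = [-1.0, 0.0, 0.5, 1.5]
print(round(expected_score(0.0, difficulties), 3))
```

Because the item response curves depend only on theta minus b, item difficulty estimates should be invariant across subgroups; violations of that invariance are what Rasch fit analysis is designed to detect.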
He, Fu-yuan; Deng, Kai-wen; Huang, Sheng; Liu, Wen-long; Shi, Ji-lian
2013-09-01
The paper aims to elucidate and establish a new mathematical model, the total quantum statistical moment standard similarity (TQSMSS), on the basis of the original total quantum statistical moment (TQSM) model, and to illustrate its application to medical theoretical research. The model was established by combining the statistical moment principle with the properties of the normal distribution probability density function, then validated and illustrated with the pharmacokinetics of three ingredients in Buyanghuanwu decoction (and three data-analytical methods for them), and with the analysis of chromatographic fingerprints of extracts obtained by dissolving the Buyanghuanwu-decoction extract in solvents of different solubility parameters. The established model consists of five main parameters: (1) the total quantum statistical moment similarity ST, the overlapped area of the two normal distribution probability density curves obtained by conversion of the two TQSM parameter sets; (2) the total variability DT, a confidence limit of the standard normal cumulative probability equal to the absolute difference between the two normal cumulative probabilities integrated to the intersection of their curves; (3) the total variable probability 1-ST, the standard normal distribution probability within the interval DT; (4) the total variable probability (1-beta)alpha; and (5) the stable confidence probability beta(1-alpha), the correct probabilities for making positive and negative conclusions under confidence coefficient alpha.
With the model, we analyzed the TQSM similarities (ST) of the pharmacokinetics of three ingredients in Buyanghuanwu decoction, and of three data-analytical methods for them; the similarities ranged from 0.3852 to 0.9875, illustrating the different pharmacokinetic behaviors. The ST values of the chromatographic fingerprints for extracts obtained with solvents of different solubility parameters from the Buyanghuanwu-decoction extract ranged from 0.6842 to 0.9992, showing that different solvents extract different constituents. The TQSMSS can characterize sample similarity, by which we can quantify, with a test of power, the correct probability of making positive and negative conclusions as to whether samples come from the same population under confidence coefficient alpha; it thereby enables analysis at both macroscopic and microscopic levels and serves as an important similarity-analysis method for medical theoretical research.
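The central parameter ST is defined in the abstract as the overlapped area of two normal probability density curves. That quantity is easy to compute numerically; the sketch below is a minimal illustration with made-up means and standard deviations, not values from the study:

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal probability density function."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def overlap_similarity(mu1, s1, mu2, s2, n=20000):
    """Numerically integrate min(f1, f2): the overlapped area of two normal
    probability density curves. Ranges from 0 (no overlap) to 1 (identical)."""
    lo = min(mu1 - 6 * s1, mu2 - 6 * s2)
    hi = max(mu1 + 6 * s1, mu2 + 6 * s2)
    dx = (hi - lo) / n
    return sum(min(normal_pdf(lo + (i + 0.5) * dx, mu1, s1),
                   normal_pdf(lo + (i + 0.5) * dx, mu2, s2))
               for i in range(n)) * dx

# Identical distributions overlap completely; well-separated ones barely overlap.
print(round(overlap_similarity(0, 1, 0, 1), 3))
print(round(overlap_similarity(0, 1, 5, 1), 3))
```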
Thaung, Jörgen; Olseke, Kjell; Ahl, Johan; Sjöstrand, Johan
2014-09-01
The purpose of our study was to establish a practical, quick test for assessing reading performance and to statistically analyse the interchart and test-retest reliability of a new standardized Swedish reading chart system consisting of three charts constructed according to the principles available in the literature. Twenty-four subjects with healthy eyes, mean age 65 ± 10 years, were tested binocularly, and reading performance was evaluated as reading acuity, critical print size and maximum reading speed. The test charts each consist of 12 short text sentences with a print size ranging from 0.9 to -0.2 logMAR in approximate steps of 0.1 logMAR. Two testing sessions, in two different groups (C1 and C2), were conducted under strictly controlled luminance and lighting conditions. Reading performance tests with charts T1, T2 and T3 were used to evaluate interchart reliability, and test data from a second session 1 month or more later were used for the test-retest analysis. Testing reading performance in adult observers with short sentences of continuous text was quick and practical. Agreement between the tests obtained with the three different charts was high both within the same test session and at retest. This new Swedish variant of a standardized reading system, based on short sentences and logarithmic progression of print size, provides reliable measurements of reading performance and preliminary norms in an age group around 65 years. The reading test with three independent reading charts can be useful for clinical studies of reading ability before and after treatment. © 2013 Acta Ophthalmologica Scandinavica Foundation. Published by John Wiley & Sons Ltd.
NASA Occupant Protection Standards Development
NASA Technical Reports Server (NTRS)
Somers, Jeffrey T.; Gernhardt, Michael A.; Lawrence, Charles
2011-01-01
Current National Aeronautics and Space Administration (NASA) occupant protection standards and requirements are based on extrapolations of biodynamic models, which were in turn based on human tests performed under pre-Space Shuttle human flight programs in which the occupants were in different suit and seat configurations than are expected for the Multi Purpose Crew Vehicle (MPCV) and Commercial Crew programs. As a result, the occupant protection standards have limited statistical validity. Furthermore, the current standards and requirements have not been validated in relevant spaceflight suit and seat configurations or loading conditions. The objectives of this study were to develop new standards and requirements for occupant protection and to rigorously validate these new standards with sub-injurious human testing. To accomplish these objectives we began by determining which critical injuries NASA would like to protect against. We then defined the anthropomorphic test device (ATD) and the associated injury metrics of interest. Finally, we conducted a literature review of available data for the Test Device for Human Occupant Restraint New Technology (THOR-NT) ATD to determine injury assessment reference values (IARVs) to serve as a baseline for further development. To better understand NASA's environment, we propose conducting sub-injurious human testing in spaceflight seat and suit configurations with spaceflight dynamic loads, with a number of subjects sufficiently high to validate no injury during nominal landing loads. In addition to validating nominal loads, the THOR-NT ATD will be tested in the same conditions as the human volunteers, allowing correlation between human and ATD responses covering the Orion nominal landing environment and the expected nominal environments of commercial vehicles. All testing will be conducted both without and with the suit to ascertain the contribution of the suit to human and ATD responses.
In addition to the proposed testing campaign, further data analysis is proposed to mine existing human injury and response data from other sources, including military volunteer testing, the automotive Crash Injury Research Engineering Network (CIREN), and IndyCar impact and injury data. These data sources can support better extrapolation of ATD responses to off-nominal conditions above the nominal range that can safely be tested. These elements will be used to develop injury risk functions for each of the injury metrics measured from the ATD; these risk functions would serve as the basis for the NASA standards. Finally, we propose defining a standard test methodology for evaluating future spacecraft designs against the IARVs, including developing a star-rating system to allow crew safety comparisons between vehicles.
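Injury risk functions of the kind described above are commonly modeled as logistic curves relating an ATD-measured metric to the probability of injury. The sketch below is a generic illustration with hypothetical coefficients and a hypothetical metric, not NASA's actual risk functions:

```python
import math

def injury_risk(metric, beta0, beta1):
    """Logistic injury risk function: probability of injury as a function of an
    ATD-measured metric (e.g., a deflection or acceleration value). The
    coefficients beta0 and beta1 would be fit to human/ATD test data."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * metric)))

# Hypothetical coefficients for illustration only.
b0, b1 = -8.0, 0.2
for m in (10, 30, 50):
    print(m, round(injury_risk(m, b0, b1), 3))
```

An IARV could then be read off such a curve as the metric value corresponding to an acceptable injury probability.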
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-01-01
Aims: A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R2), with R2 as the primary metric of assay agreement. However, R2 alone does not adequately quantify the constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman analysis and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. Methods: We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing (NGS) assays. NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Results: Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative NGS data. Deming linear regression was used for assay comparison studies, while simple linear regression was used to analyse serial dilution data. The Bland-Altman approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. Conclusions: The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of the performance characteristics of quantitative molecular assays prior to implementation in the clinical molecular laboratory. PMID:28747393
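The core Bland-Altman quantities, the bias (mean paired difference) and the 95% limits of agreement (bias ± 1.96 SD of the differences), can be computed directly. The sketch below uses hypothetical paired assay values, not data from the study:

```python
import statistics

def bland_altman(x, y):
    """Bland-Altman agreement statistics for paired measurements:
    returns (bias, lower limit of agreement, upper limit of agreement)."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)          # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical variant-allele-fraction estimates from two NGS assays.
assay_a = [0.12, 0.25, 0.40, 0.51, 0.33]
assay_b = [0.10, 0.24, 0.43, 0.50, 0.30]
bias, lo, hi = bland_altman(assay_a, assay_b)
print(round(bias, 3), round(lo, 3), round(hi, 3))
```

A nonzero bias indicates constant error; a trend of the differences against the means (not computed here) indicates proportional error.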
NASA Astrophysics Data System (ADS)
Soros, P.; Ponkham, K.; Ekkapim, S.
2018-01-01
This research aimed to: 1) compare critical-thinking and problem-solving skills before and after learning with a STEM Education plan, 2) compare student achievement before and after learning about force and the laws of motion with the STEM Education plan, and 3) assess satisfaction with learning by STEM Education. The sample comprised 37 grade-10 students at Borabu School, Borabu District, Mahasarakham Province, in semester 2 of academic year 2016. The tools used in this study consisted of: 1) a STEM Education plan on force and the laws of motion for grade-10 students, one scheme totalling 14 hours; 2) a 30-item test of critical-thinking and problem-solving skills with five-option and two-option multiple-choice items; 3) a 30-item achievement test on force and the laws of motion with four-option multiple-choice items; 4) a 20-item satisfaction questionnaire with a 5-point rating scale. The statistics used in the data analysis were percentage, mean, standard deviation, and the dependent-samples t-test. The results showed that: 1) students taught with the STEM Education plan scored higher on critical-thinking and problem-solving skills on the post-test than on the pre-test, statistically significant at the .01 level; 2) students taught with the STEM Education plan had higher achievement scores on the post-test than on the pre-test, statistically significant at the .01 level; 3) the students' satisfaction with learning by the STEM Education plan was at a high level (x̄ = 4.51, S.D. = 0.56).
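The dependent-samples t-test used in this design can be sketched directly from its definition: the mean of the paired pre/post differences divided by its standard error. The scores below are hypothetical, not the study's data:

```python
import math
import statistics

def paired_t(pre, post):
    """Dependent-samples t statistic: t = mean(d) / (sd(d) / sqrt(n)),
    where d = post - pre. Returns (t, degrees of freedom)."""
    d = [b - a for a, b in zip(pre, post)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical pre-test/post-test achievement scores for illustration.
pre = [12, 15, 14, 10, 18, 11]
post = [20, 22, 19, 17, 25, 18]
t, df = paired_t(pre, post)
print(round(t, 2), df)
```

The resulting t is compared against the t distribution with n - 1 degrees of freedom at the chosen significance level (.01 in the study).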
Neves, Frederico S; Vasconcelos, Taruska V; Campos, Paulo S F; Haiter-Neto, Francisco; Freitas, Deborah Q
2014-02-01
The aim of this study was to evaluate the effect of cone beam computed tomography (CBCT) scan mode on preoperative dental implant measurements. Completely edentulous mandibles with entirely resorbed alveolar processes were selected for this study. Five regions were selected (incisor, canine, premolar, first molar, and second molar). The mandibles were scanned with a Next Generation i-CAT CBCT unit (Imaging Sciences International, Inc, Hatfield, PA, USA) in half (180°) and full (360°) scan modes. Two oral radiologists performed vertical measurements in all selected regions; the measurements of half of the sample were repeated after an interval of 30 days. The mandibles were sectioned with an electrical saw in all evaluated regions to obtain the gold standard. The intraclass correlation coefficient was calculated for intra- and interobserver agreement. Descriptive statistics were calculated as mean, median, and standard deviation. The Wilcoxon signed rank test was used to compare the measurements obtained in each scan mode with the gold standard. The significance level was 5%. The values of intra- and interobserver reproducibility indicated strong agreement. Except for the bone height of the second molar region in full scan mode (P = 0.02), the Wilcoxon signed rank test showed no statistically significant difference from the gold standard (P > 0.05). Both modes provided true measurements, as needed for implant planning; however, half scan mode uses a smaller dose, following the principle of effectiveness. We believe this mode should be preferred because it offers the best dose-effect relationship and less risk to the patient. © 2012 John Wiley & Sons A/S.
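The Wilcoxon signed rank statistic for such paired comparisons can be sketched as follows: rank the absolute paired differences (dropping zeros, averaging tied ranks) and take the smaller of the positive-rank and negative-rank sums. The measurement values below are hypothetical, not the study's data:

```python
def wilcoxon_w(x, y):
    """Wilcoxon signed rank statistic W for paired measurements."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # drop zero differences
    ranked = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(ranked):                            # assign average ranks to ties
        j = i
        while j + 1 < len(ranked) and abs(diffs[ranked[j + 1]]) == abs(diffs[ranked[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[ranked[k]] = avg
        i = j + 1
    w_pos = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return min(w_pos, w_neg)

# Hypothetical CBCT vs gold-standard bone-height measurements (mm).
cbct = [12.0, 10.5, 12.5, 9.5, 13.5]
gold = [11.5, 11.0, 12.0, 10.0, 13.0]
print(wilcoxon_w(cbct, gold))
```

In practice the statistic is then referred to the Wilcoxon null distribution (or a normal approximation) to obtain the P value.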
Dynamic Modeling and Testing of MSRR-1 for Use in Microgravity Environments Analysis
NASA Technical Reports Server (NTRS)
Gattis, Christy; LaVerde, Bruce; Howell, Mike; Phelps, Lisa H. (Technical Monitor)
2001-01-01
Delicate microgravity science is unlikely to succeed on the International Space Station if vibratory and transient disturbance sources corrupt the environment. An analytical approach is presented for computing the on-orbit acceleration environment, at science experiment locations within a standard payload rack, that results from these disturbance sources. This approach has been grounded by correlation and comparison with test-verified transfer functions. The method combines the results of finite element and statistical energy analysis, using tested damping and modal characteristics, to provide a reasonable approximation of the total root-mean-square (RMS) acceleration spectrum at the interface to microgravity science experiment hardware.
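A total RMS acceleration of the kind the method reports follows from integrating the acceleration power spectral density (PSD) over the frequency band of interest: RMS = sqrt(∫ PSD df). A minimal sketch with hypothetical values:

```python
import math

def rms_from_psd(freqs, psd):
    """Root-mean-square acceleration from a one-sided acceleration PSD
    (units g^2/Hz) via trapezoidal integration over frequency."""
    area = 0.0
    for i in range(len(freqs) - 1):
        area += 0.5 * (psd[i] + psd[i + 1]) * (freqs[i + 1] - freqs[i])
    return math.sqrt(area)

# Hypothetical flat PSD of 1e-6 g^2/Hz from 20 to 120 Hz.
freqs = [20.0, 70.0, 120.0]
psd = [1e-6, 1e-6, 1e-6]
print(round(rms_from_psd(freqs, psd), 5))
```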
Halogen and LED light curing of composite: temperature increase and Knoop hardness.
Schneider, L F; Consani, S; Correr-Sobrinho, L; Correr, A B; Sinhoreti, M A
2006-03-01
This study assessed the Knoop hardness and temperature increase produced by three light curing units when using (1) the manufacturers' recommended photo-activation times and (2) a standardized total energy density. One halogen unit--XL2500 (3M/ESPE)--and two light-emitting diode (LED) curing units--Freelight (3M/ESPE) and Ultrablue IS (DMC)--were used. A type-K thermocouple registered the temperature change produced by composite photo-activation in a mold. Twenty-four hours after the photo-activation procedures, the composite specimens were submitted to a hardness test. Both the temperature-increase and hardness data were submitted to ANOVA and Tukey's test (5% significance level). Under the first set of photo-activation conditions, the halogen unit produced a statistically higher temperature increase than both LED units, and the Freelight LED resulted in lower hardness than the other curing units. Under the second set of photo-activation conditions, the two LED units produced a statistically greater temperature increase than the halogen unit, whereas there were no statistical differences in hardness among the curing units.
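The one-way ANOVA underlying such multi-unit comparisons reduces to a ratio of mean squares: between-group over within-group. A minimal sketch with hypothetical hardness values (Tukey's post-hoc comparisons, used in the study, would follow the ANOVA in practice):

```python
import statistics

def one_way_anova_f(groups):
    """One-way ANOVA F statistic: between-group mean square divided by
    within-group mean square, for a list of sample lists."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical Knoop hardness values for three curing units (illustration only).
halogen = [52.0, 54.0, 53.0]
led_a = [48.0, 47.0, 49.0]
led_b = [51.0, 52.0, 50.0]
print(round(one_way_anova_f([halogen, led_a, led_b]), 2))
```

A significant F indicates that at least one group mean differs; Tukey's test then identifies which pairs differ while controlling the family-wise error rate.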
Inference of median difference based on the Box-Cox model in randomized clinical trials.
Maruo, K; Isogawa, N; Gosho, M
2015-05-10
In randomized clinical trials, many medical and biological measurements are not normally distributed and are often skewed. The Box-Cox transformation is a powerful procedure for comparing two treatment groups on skewed continuous variables by means of a statistical test. However, it is difficult to directly estimate and interpret the location difference between the two groups on the original scale of the measurement. We propose a helpful method that infers the difference in treatment effect on the original scale in a more easily interpretable form. We also provide statistical analysis packages that consistently include an estimate of the treatment effect, covariance adjustments, standard errors, and statistical hypothesis tests. A simulation study focusing on randomized parallel-group clinical trials with two treatment groups indicates that the performance of the proposed method is equivalent to or better than that of existing non-parametric approaches in terms of type-I error rate and power. We illustrate our method with cluster of differentiation 4 (CD4) data from an acquired immune deficiency syndrome clinical trial. Copyright © 2015 John Wiley & Sons, Ltd.
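The Box-Cox transformation itself is simple to state: (x^lambda - 1)/lambda for lambda ≠ 0, and log(x) in the limit lambda = 0. The sketch below illustrates the transform on a hypothetical skewed sample; in practice lambda is estimated by maximum likelihood (e.g., scipy.stats.boxcox does this):

```python
import math

def box_cox(x, lam):
    """Box-Cox transform of a positive observation x:
    (x**lam - 1) / lam for lam != 0, log(x) for lam == 0."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam

# A right-skewed sample becomes more symmetric under the log (lambda = 0) case.
data = [1.0, 2.0, 4.0, 8.0, 64.0]
transformed = [box_cox(x, 0.0) for x in data]
print([round(v, 3) for v in transformed])
```

The difficulty the paper addresses is the reverse step: after testing on the transformed scale, re-expressing the treatment-effect estimate on the original measurement scale.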