Sample records for validation test cases

  1. A Human Proximity Operations System test case validation approach

    NASA Astrophysics Data System (ADS)

    Huber, Justin; Straub, Jeremy

    A Human Proximity Operations System (HPOS) poses numerous risks in a real world environment. These risks range from mundane tasks such as avoiding walls and fixed obstacles to the critical need to keep people and processes safe in the context of the HPOS's situation-specific decision making. Validating the performance of an HPOS, which must operate in a real-world environment, is an ill posed problem due to the complexity that is introduced by erratic (non-computer) actors. In order to prove the HPOS's usefulness, test cases must be generated to simulate possible actions of these actors, so the HPOS can be shown to be able perform safely in environments where it will be operated. The HPOS must demonstrate its ability to be as safe as a human, across a wide range of foreseeable circumstances. This paper evaluates the use of test cases to validate HPOS performance and utility. It considers an HPOS's safe performance in the context of a common human activity, moving through a crowded corridor, and extrapolates (based on this) to the suitability of using test cases for AI validation in other areas of prospective application.

  2. Validating an artificial intelligence human proximity operations system with test cases

    NASA Astrophysics Data System (ADS)

    Huber, Justin; Straub, Jeremy

    2013-05-01

    An artificial intelligence-controlled robot (AICR) operating in close proximity to humans poses risk to these humans. Validating the performance of an AICR is an ill posed problem, due to the complexity introduced by the erratic (noncomputer) actors. In order to prove the AICR's usefulness, test cases must be generated to simulate the actions of these actors. This paper discusses AICR's performance validation in the context of a common human activity, moving through a crowded corridor, using test cases created by an AI use case producer. This test is a two-dimensional simplification relevant to autonomous UAV navigation in the national airspace.

  3. Predictive validity of the Biomedical Admissions Test: an evaluation and case study.

    PubMed

    McManus, I C; Ferguson, Eamonn; Wakeford, Richard; Powis, David; James, David

    2011-01-01

    There has been an increase in the use of pre-admission selection tests for medicine. Such tests need to show good psychometric properties. Here, we use a paper by Emery and Bell [2009. The predictive validity of the Biomedical Admissions Test for pre-clinical examination performance. Med Educ 43:557-564] as a case study to evaluate and comment on the reporting of psychometric data in the field of medical student selection (and the comments apply to many papers in the field). We highlight pitfalls when reliability data are not presented, how simple zero-order associations can lead to inaccurate conclusions about the predictive validity of a test, and how biases need to be explored and reported. We show with BMAT that it is the knowledge part of the test which does all the predictive work. We show that without evidence of incremental validity it is difficult to assess the value of any selection tests for medicine.

  4. Effort, symptom validity testing, performance validity testing and traumatic brain injury.

    PubMed

    Bigler, Erin D

    2014-01-01

    To understand the neurocognitive effects of brain injury, valid neuropsychological test findings are paramount. This review examines the research on what has been referred to a symptom validity testing (SVT). Above a designated cut-score signifies a 'passing' SVT performance which is likely the best indicator of valid neuropsychological test findings. Likewise, substantially below cut-point performance that nears chance or is at chance signifies invalid test performance. Significantly below chance is the sine qua non neuropsychological indicator for malingering. However, the interpretative problems with SVT performance below the cut-point yet far above chance are substantial, as pointed out in this review. This intermediate, border-zone performance on SVT measures is where substantial interpretative challenges exist. Case studies are used to highlight the many areas where additional research is needed. Historical perspectives are reviewed along with the neurobiology of effort. Reasons why performance validity testing (PVT) may be better than the SVT term are reviewed. Advances in neuroimaging techniques may be key in better understanding the meaning of border zone SVT failure. The review demonstrates the problems with rigidity in interpretation with established cut-scores. A better understanding of how certain types of neurological, neuropsychiatric and/or even test conditions may affect SVT performance is needed.

  5. Five Data Validation Cases

    ERIC Educational Resources Information Center

    Simkin, Mark G.

    2008-01-01

    Data-validation routines enable computer applications to test data to ensure their accuracy, completeness, and conformance to industry or proprietary standards. This paper presents five programming cases that require students to validate five different types of data: (1) simple user data entries, (2) UPC codes, (3) passwords, (4) ISBN numbers, and…

  6. Alternative Vocabularies in the Test Validity Literature

    ERIC Educational Resources Information Center

    Markus, Keith A.

    2016-01-01

    Justification of testing practice involves moving from one state of knowledge about the test to another. Theories of test validity can (a) focus on the beginning of the process, (b) focus on the end, or (c) encompass the entire process. Analyses of four case studies test and illustrate three claims: (a) restrictions on validity entail a supplement…

  7. A case of misdiagnosis of mild cognitive impairment: The utility of symptom validity testing in an outpatient memory clinic.

    PubMed

    Roor, Jeroen J; Dandachi-FitzGerald, Brechje; Ponds, Rudolf W H M

    2016-01-01

    Noncredible symptom reports hinder the diagnostic process. This fact is especially the case for medical conditions that rely on subjective report of symptoms instead of objective measures. Mild cognitive impairment (MCI) primarily relies on subjective report, which makes it potentially susceptible to erroneous diagnosis. In this case report, we describe a 59-year-old female patient diagnosed with MCI 10 years previously. The patient was referred to the neurology department for reexamination by her general practitioner because of cognitive complaints and persistent fatigue. This case study used information from the medical file, a new magnetic resonance imaging brain scan, and neuropsychological assessment. Current neuropsychological assessment, including symptom validity tests, clearly indicated noncredible test performance, thereby invalidating the obtained neuropsychological test data. We conclude that a blind spot for noncredible symptom reports existed in the previous diagnostic assessments. This case highlights the usefulness of formal symptom validity testing in the diagnostic assessment of MCI.

  8. On Validity Theory and Test Validation

    ERIC Educational Resources Information Center

    Sireci, Stephen G.

    2007-01-01

    Lissitz and Samuelsen (2007) propose a new framework for conceptualizing test validity that separates analysis of test properties from analysis of the construct measured. In response, the author of this article reviews fundamental characteristics of test validity, drawing largely from seminal writings as well as from the accepted standards. He…

  9. Broadband Fan Noise Prediction System for Turbofan Engines. Volume 3; Validation and Test Cases

    NASA Technical Reports Server (NTRS)

    Morin, Bruce L.

    2010-01-01

    Pratt & Whitney has developed a Broadband Fan Noise Prediction System (BFaNS) for turbofan engines. This system computes the noise generated by turbulence impinging on the leading edges of the fan and fan exit guide vane, and noise generated by boundary-layer turbulence passing over the fan trailing edge. BFaNS has been validated on three fan rigs that were tested during the NASA Advanced Subsonic Technology Program (AST). The predicted noise spectra agreed well with measured data. The predicted effects of fan speed, vane count, and vane sweep also agreed well with measurements. The noise prediction system consists of two computer programs: Setup_BFaNS and BFaNS. Setup_BFaNS converts user-specified geometry and flow-field information into a BFaNS input file. From this input file, BFaNS computes the inlet and aft broadband sound power spectra generated by the fan and FEGV. The output file from BFaNS contains the inlet, aft and total sound power spectra from each noise source. This report is the third volume of a three-volume set documenting the Broadband Fan Noise Prediction System: Volume 1: Setup_BFaNS User s Manual and Developer s Guide; Volume 2: BFaNS User s Manual and Developer s Guide; and Volume 3: Validation and Test Cases. The present volume begins with an overview of the Broadband Fan Noise Prediction System, followed by validation studies that were done on three fan rigs. It concludes with recommended improvements and additional studies for BFaNS.

  10. Validating a UAV artificial intelligence control system using an autonomous test case generator

    NASA Astrophysics Data System (ADS)

    Straub, Jeremy; Huber, Justin

    2013-05-01

    The validation of safety-critical applications, such as autonomous UAV operations in an environment which may include human actors, is an ill posed problem. To confidence in the autonomous control technology, numerous scenarios must be considered. This paper expands upon previous work, related to autonomous testing of robotic control algorithms in a two dimensional plane, to evaluate the suitability of similar techniques for validating artificial intelligence control in three dimensions, where a minimum level of airspeed must be maintained. The results of human-conducted testing are compared to this automated testing, in terms of error detection, speed and testing cost.

  11. Directed Design of Experiments for Validating Probability of Detection Capability of a Testing System

    NASA Technical Reports Server (NTRS)

    Generazio, Edward R. (Inventor)

    2012-01-01

    A method of validating a probability of detection (POD) testing system using directed design of experiments (DOE) includes recording an input data set of observed hit and miss or analog data for sample components as a function of size of a flaw in the components. The method also includes processing the input data set to generate an output data set having an optimal class width, assigning a case number to the output data set, and generating validation instructions based on the assigned case number. An apparatus includes a host machine for receiving the input data set from the testing system and an algorithm for executing DOE to validate the test system. The algorithm applies DOE to the input data set to determine a data set having an optimal class width, assigns a case number to that data set, and generates validation instructions based on the case number.

  12. Safety validation test equipment operation

    NASA Astrophysics Data System (ADS)

    Kurosaki, Tadaaki; Watanabe, Takashi

    1992-08-01

    An overview of the activities conducted on safety validation test equipment operation for materials used for NASA manned missions is presented. Safety validation tests, such as flammability, odor, offgassing, and so forth were conducted in accordance with NASA-NHB-8060.1C using test subjects common with those used by NASA, and the equipment used were qualified for their functions and performances in accordance with NASDA-CR-99124 'Safety Validation Test Qualification Procedures.' Test procedure systems were established by preparing 'Common Procedures for Safety Validation Test' as well as test procedures for flammability, offgassing, and odor tests. The test operation organization chaired by the General Manager of the Parts and Material Laboratory of NASDA (National Space Development Agency of Japan) was established, and the test leaders and operators in the organization were qualified in accordance with the specified procedures. One-hundred-one tests had been conducted so far by the Parts and Material Laboratory according to the request submitted by the manufacturers through the Space Station Group and the Safety and Product Assurance for Manned Systems Office.

  13. Validation Test Report For The CRWMS Analysis and Logistics Visually Interactive Model Calvin Version 3.0, 10074-Vtr-3.0-00

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    S. Gillespie

    2000-07-27

    This report describes the tests performed to validate the CRWMS ''Analysis and Logistics Visually Interactive'' Model (CALVIN) Version 3.0 (V3.0) computer code (STN: 10074-3.0-00). To validate the code, a series of test cases was developed in the CALVIN V3.0 Validation Test Plan (CRWMS M&O 1999a) that exercises the principal calculation models and options of CALVIN V3.0. Twenty-five test cases were developed: 18 logistics test cases and 7 cost test cases. These cases test the features of CALVIN in a sequential manner, so that the validation of each test case is used to demonstrate the accuracy of the input to subsequentmore » calculations. Where necessary, the test cases utilize reduced-size data tables to make the hand calculations used to verify the results more tractable, while still adequately testing the code's capabilities. Acceptance criteria, were established for the logistics and cost test cases in the Validation Test Plan (CRWMS M&O 1999a). The Logistics test cases were developed to test the following CALVIN calculation models: Spent nuclear fuel (SNF) and reactivity calculations; Options for altering reactor life; Adjustment of commercial SNF (CSNF) acceptance rates for fiscal year calculations and mid-year acceptance start; Fuel selection, transportation cask loading, and shipping to the Monitored Geologic Repository (MGR); Transportation cask shipping to and storage at an Interim Storage Facility (ISF); Reactor pool allocation options; and Disposal options at the MGR. Two types of cost test cases were developed: cases to validate the detailed transportation costs, and cases to validate the costs associated with the Civilian Radioactive Waste Management System (CRWMS) Management and Operating Contractor (M&O) and Regional Servicing Contractors (RSCs). For each test case, values calculated using Microsoft Excel 97 worksheets were compared to CALVIN V3.0 scenarios with the same input data and assumptions. All of the test case results compare with the

  14. Summary of EASM Turbulence Models in CFL3D With Validation Test Cases

    NASA Technical Reports Server (NTRS)

    Rumsey, Christopher L.; Gatski, Thomas B.

    2003-01-01

    This paper summarizes the Explicit Algebraic Stress Model in k-omega form (EASM-ko) and in k-epsilon form (EASM-ke) in the Reynolds-averaged Navier-Stokes code CFL3D. These models have been actively used over the last several years in CFL3D, and have undergone some minor modifications during that time. Details of the equations and method for coding the latest versions of the models are given, and numerous validation cases are presented. This paper serves as a validation archive for these models.

  15. A Framework for Testing Scientific Software: A Case Study of Testing Amsterdam Discrete Dipole Approximation Software

    NASA Astrophysics Data System (ADS)

    Shao, Hongbing

    Software testing with scientific software systems often suffers from test oracle problem, i.e., lack of test oracles. Amsterdam discrete dipole approximation code (ADDA) is a scientific software system that can be used to simulate light scattering of scatterers of various types. Testing of ADDA suffers from "test oracle problem". In this thesis work, I established a testing framework to test scientific software systems and evaluated this framework using ADDA as a case study. To test ADDA, I first used CMMIE code as the pseudo oracle to test ADDA in simulating light scattering of a homogeneous sphere scatterer. Comparable results were obtained between ADDA and CMMIE code. This validated ADDA for use with homogeneous sphere scatterers. Then I used experimental result obtained for light scattering of a homogeneous sphere to validate use of ADDA with sphere scatterers. ADDA produced light scattering simulation comparable to the experimentally measured result. This further validated the use of ADDA for simulating light scattering of sphere scatterers. Then I used metamorphic testing to generate test cases covering scatterers of various geometries, orientations, homogeneity or non-homogeneity. ADDA was tested under each of these test cases and all tests passed. The use of statistical analysis together with metamorphic testing is discussed as a future direction. In short, using ADDA as a case study, I established a testing framework, including use of pseudo oracles, experimental results and the metamorphic testing techniques to test scientific software systems that suffer from test oracle problems. Each of these techniques is necessary and contributes to the testing of the software under test.

  16. On the Validity of Useless Tests

    ERIC Educational Resources Information Center

    Sireci, Stephen G.

    2016-01-01

    A misconception exists that validity may refer only to the "interpretation" of test scores and not to the "uses" of those scores. The development and evolution of validity theory illustrate test score interpretation was a primary focus in the earliest days of modern testing, and that validating interpretations derived from test…

  17. Validity evidence based on test content.

    PubMed

    Sireci, Stephen; Faulkner-Bond, Molly

    2014-01-01

    Validity evidence based on test content is one of the five forms of validity evidence stipulated in the Standards for Educational and Psychological Testing developed by the American Educational Research Association, American Psychological Association, and National Council on Measurement in Education. In this paper, we describe the logic and theory underlying such evidence and describe traditional and modern methods for gathering and analyzing content validity data. A comprehensive review of the literature and of the aforementioned Standards is presented. For educational tests and other assessments targeting knowledge and skill possessed by examinees, validity evidence based on test content is necessary for building a validity argument to support the use of a test for a particular purpose. By following the methods described in this article, practitioners have a wide arsenal of tools available for determining how well the content of an assessment is congruent with and appropriate for the specific testing purposes.

  18. Design of an Axisymmetric Afterbody Test Case for CFD Validation

    NASA Technical Reports Server (NTRS)

    Disotell, Kevin J.; Rumsey, Christopher L.

    2017-01-01

    As identified in the CFD Vision 2030 Study commissioned by NASA, validation of advanced RANS models and scale-resolving methods for computing turbulent flow fields must be supported by continuous improvements in fundamental, high-fidelity experiments designed specifically for CFD implementation. In accordance with this effort, the underpinnings of a new test platform referred to herein as the NASA Axisymmetric Afterbody are presented. The devised body-of-revolution is a modular platform consisting of a forebody section and afterbody section, allowing for a range of flow behaviors to be studied on interchangeable afterbody geometries. A body-of-revolution offers advantages in shape definition and fabrication, in avoiding direct contact with wind tunnel sidewalls, and in tail-sting integration to facilitate access to higher Reynolds number tunnels. The current work is focused on validation of smooth-body turbulent flow separation, for which a six-parameter body has been developed. A priori RANS computations are reported for a risk-reduction test configuration in order to demonstrate critical variation among turbulence model results for a given afterbody, ranging from barely-attached to mild separated flow. RANS studies of the effects of forebody nose (with/without) and wind tunnel boundary (slip/no-slip) on the selected afterbody are presented. Representative modeling issues that can be explored with this configuration are the effect of higher Reynolds number on separation behavior, flow physics of the progression from attached to increasingly-separated afterbody flows, and the effect of embedded longitudinal vortices on turbulence structure.

  19. Test Cases for Modeling and Validation of Structures with Piezoelectric Actuators

    NASA Technical Reports Server (NTRS)

    Reaves, Mercedes C.; Horta, Lucas G.

    2001-01-01

    A set of benchmark test articles were developed to validate techniques for modeling structures containing piezoelectric actuators using commercially available finite element analysis packages. The paper presents the development, modeling, and testing of two structures: an aluminum plate with surface mounted patch actuators and a composite box beam with surface mounted actuators. Three approaches for modeling structures containing piezoelectric actuators using the commercially available packages: MSC/NASTRAN and ANSYS are presented. The approaches, applications, and limitations are discussed. Data for both test articles are compared in terms of frequency response functions from deflection and strain data to input voltage to the actuator. Frequency response function results using the three different analysis approaches provided comparable test/analysis results. It is shown that global versus local behavior of the analytical model and test article must be considered when comparing different approaches. Also, improper bonding of actuators greatly reduces the electrical to mechanical effectiveness of the actuators producing anti-resonance errors.

  20. Evaluation of surveillance case definition in the diagnosis of leptospirosis, using the Microscopic Agglutination Test: a validation study.

    PubMed

    Dassanayake, Dinesh L B; Wimalaratna, Harith; Agampodi, Suneth B; Liyanapathirana, Veranja C; Piyarathna, Thibbotumunuwe A C L; Goonapienuwala, Bimba L

    2009-04-22

    Leptospirosis is endemic in both urban and rural areas of Sri Lanka and there had been many out breaks in the recent past. This study was aimed at validating the leptospirosis surveillance case definition, using the Microscopic Agglutination Test (MAT). The study population consisted of patients with undiagnosed acute febrile illness who were admitted to the medical wards of the Teaching Hospital Kandy, from 1st July 2007 to 31st July 2008. The subjects were screened to diagnose leptospirosis according to the leptospirosis case definition. MAT was performed on blood samples taken from each patient on the 7th day of fever. Leptospirosis case definition was evaluated in regard to sensitivity, specificity and predictive values, using a MAT titre >or= 1:800 for confirming leptospirosis. A total of 123 patients were initially recruited of which 73 had clinical features compatible with the surveillance case definition. Out of the 73 only 57 had a positive MAT result (true positives) leaving 16 as false positives. Out of the 50 who didn't have clinical features compatible with the case definition 45 had a negative MAT as well (true negatives), therefore 5 were false negatives. Total number of MAT positives was 62 out of 123. According to these results the test sensitivity was 91.94%, specificity 73.77%, positive predictive value and negative predictive values were 78.08% and 90% respectively. Diagnostic accuracy of the test was 82.93%. This study confirms that the surveillance case definition has a very high sensitivity and negative predictive value with an average specificity in diagnosing leptospirosis, based on a MAT titre of >or= 1: 800.

  1. Validation of the Hwalek-Sengstock Elder Abuse Screening Test.

    ERIC Educational Resources Information Center

    Neale, Anne Victoria; And Others

    Elder abuse is recognized as an under-detected and under-reported social problem. Difficulties in detecting elder abuse are compounded by the lack of a standardized, psychometrically valid instrument for case finding. The development of the Hwalek-Sengstock Elder Abuse Screening Test (H-S/EAST) followed a larger effort to identify indicators and…

  2. Validation of alternative methods for toxicity testing.

    PubMed Central

    Bruner, L H; Carr, G J; Curren, R D; Chamberlain, M

    1998-01-01

    Before nonanimal toxicity tests may be officially accepted by regulatory agencies, it is generally agreed that the validity of the new methods must be demonstrated in an independent, scientifically sound validation program. Validation has been defined as the demonstration of the reliability and relevance of a test method for a particular purpose. This paper provides a brief review of the development of the theoretical aspects of the validation process and updates current thinking about objectively testing the performance of an alternative method in a validation study. Validation of alternative methods for eye irritation testing is a specific example illustrating important concepts. Although discussion focuses on the validation of alternative methods intended to replace current in vivo toxicity tests, the procedures can be used to assess the performance of alternative methods intended for other uses. Images Figure 1 PMID:9599695

  3. Validation of a Videoconferenced Speaking Test

    ERIC Educational Resources Information Center

    Kim, Jungtae; Craig, Daniel A.

    2012-01-01

    Videoconferencing offers new opportunities for language testers to assess speaking ability in low-stakes diagnostic tests. To be considered a trusted testing tool in language testing, a test should be examined employing appropriate validation processes [Chapelle, C.A., Jamieson, J., & Hegelheimer, V. (2003). "Validation of a web-based ESL…

  4. 10 CFR 26.131 - Cutoff levels for validity screening and initial validity tests.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 1 2010-01-01 2010-01-01 false Cutoff levels for validity screening and initial validity tests. 26.131 Section 26.131 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.131 Cutoff levels for validity screening and initial validity tests. (a) Each...

  5. 10 CFR 26.131 - Cutoff levels for validity screening and initial validity tests.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 1 2011-01-01 2011-01-01 false Cutoff levels for validity screening and initial validity tests. 26.131 Section 26.131 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.131 Cutoff levels for validity screening and initial validity tests. (a) Each...

  6. Validity and Reliability of Baseline Testing in a Standardized Environment.

    PubMed

    Higgins, Kathryn L; Caze, Todd; Maerlender, Arthur

    2017-08-11

    The Immediate Postconcussion Assessment and Cognitive Testing (ImPACT) is a computerized neuropsychological test battery commonly used to determine cognitive recovery from concussion based on comparing post-injury scores to baseline scores. This model is based on the premise that ImPACT baseline test scores are a valid and reliable measure of optimal cognitive function at baseline. Growing evidence suggests that this premise may not be accurate and a large contributor to invalid and unreliable baseline test scores may be the protocol and environment in which baseline tests are administered. This study examined the effects of a standardized environment and administration protocol on the reliability and performance validity of athletes' baseline test scores on ImPACT by comparing scores obtained in two different group-testing settings. Three hundred-sixty one Division 1 cohort-matched collegiate athletes' baseline data were assessed using a variety of indicators of potential performance invalidity; internal reliability was also examined. Thirty-one to thirty-nine percent of the baseline cases had at least one indicator of low performance validity, but there were no significant differences in validity indicators based on environment in which the testing was conducted. Internal consistency reliability scores were in the acceptable to good range, with no significant differences between administration conditions. These results suggest that athletes may be reliably performing at levels lower than their best effort would produce. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  7. Testing and validating environmental models

    USGS Publications Warehouse

    Kirchner, J.W.; Hooper, R.P.; Kendall, C.; Neal, C.; Leavesley, G.

    1996-01-01

    Generally accepted standards for testing and validating ecosystem models would benefit both modellers and model users. Universally applicable test procedures are difficult to prescribe, given the diversity of modelling approaches and the many uses for models. However, the generally accepted scientific principles of documentation and disclosure provide a useful framework for devising general standards for model evaluation. Adequately documenting model tests requires explicit performance criteria, and explicit benchmarks against which model performance is compared. A model's validity, reliability, and accuracy can be most meaningfully judged by explicit comparison against the available alternatives. In contrast, current practice is often characterized by vague, subjective claims that model predictions show 'acceptable' agreement with data; such claims provide little basis for choosing among alternative models. Strict model tests (those that invalid models are unlikely to pass) are the only ones capable of convincing rational skeptics that a model is probably valid. However, 'false positive' rates as low as 10% can substantially erode the power of validation tests, making them insufficiently strict to convince rational skeptics. Validation tests are often undermined by excessive parameter calibration and overuse of ad hoc model features. Tests are often also divorced from the conditions under which a model will be used, particularly when it is designed to forecast beyond the range of historical experience. In such situations, data from laboratory and field manipulation experiments can provide particularly effective tests, because one can create experimental conditions quite different from historical data, and because experimental data can provide a more precisely defined 'target' for the model to hit. We present a simple demonstration showing that the two most common methods for comparing model predictions to environmental time series (plotting model time series

  8. ExEP yield modeling tool and validation test results

    NASA Astrophysics Data System (ADS)

    Morgan, Rhonda; Turmon, Michael; Delacroix, Christian; Savransky, Dmitry; Garrett, Daniel; Lowrance, Patrick; Liu, Xiang Cate; Nunez, Paul

    2017-09-01

    EXOSIMS is an open-source simulation tool for parametric modeling of the detection yield and characterization of exoplanets. EXOSIMS has been adopted by the Exoplanet Exploration Programs Standards Definition and Evaluation Team (ExSDET) as a common mechanism for comparison of exoplanet mission concept studies. To ensure trustworthiness of the tool, we developed a validation test plan that leverages the Python-language unit-test framework, utilizes integration tests for selected module interactions, and performs end-to-end crossvalidation with other yield tools. This paper presents the test methods and results, with the physics-based tests such as photometry and integration time calculation treated in detail and the functional tests treated summarily. The test case utilized a 4m unobscured telescope with an idealized coronagraph and an exoplanet population from the IPAC radial velocity (RV) exoplanet catalog. The known RV planets were set at quadrature to allow deterministic validation of the calculation of physical parameters, such as working angle, photon counts and integration time. The observing keepout region was tested by generating plots and movies of the targets and the keepout zone over a year. Although the keepout integration test required the interpretation of a user, the test revealed problems in the L2 halo orbit and the parameterization of keepout applied to some solar system bodies, which the development team was able to address. The validation testing of EXOSIMS was performed iteratively with the developers of EXOSIMS and resulted in a more robust, stable, and trustworthy tool that the exoplanet community can use to simulate exoplanet direct-detection missions from probe class, to WFIRST, up to large mission concepts such as HabEx and LUVOIR.

  9. The validity of upper-limb neurodynamic tests for detecting peripheral neuropathic pain.

    PubMed

    Nee, Robert J; Jull, Gwendolen A; Vicenzino, Bill; Coppieters, Michel W

    2012-05-01

    The validity of upper-limb neurodynamic tests (ULNTs) for detecting peripheral neuropathic pain (PNP) was assessed by reviewing the evidence on plausibility, the definition of a positive test, reliability, and concurrent validity. Evidence was identified by a structured search for peer-reviewed articles published in English before May 2011. The quality of concurrent validity studies was assessed with the Quality Assessment of Diagnostic Accuracy Studies tool, where appropriate. Biomechanical and experimental pain data support the plausibility of ULNTs. Evidence suggests that a positive ULNT should at least partially reproduce the patient's symptoms and that structural differentiation should change these symptoms. Data indicate that this definition of a positive ULNT is reliable when used clinically. Limited evidence suggests that the median nerve test, but not the radial nerve test, helps determine whether a patient has cervical radiculopathy. The median nerve test does not help diagnose carpal tunnel syndrome. These findings should be interpreted cautiously, because diagnostic accuracy might have been distorted by the investigators' definitions of a positive ULNT. Furthermore, patients with PNP who presented with increased nerve mechanosensitivity rather than conduction loss might have been incorrectly classified by electrophysiological reference standards as not having PNP. The only evidence for concurrent validity of the ulnar nerve test was a case study on cubital tunnel syndrome. We recommend that researchers develop more comprehensive reference standards for PNP to accurately assess the concurrent validity of ULNTs and continue investigating the predictive validity of ULNTs for prognosis or treatment response.

  10. Content validity and reliability of test of gross motor development in Chilean children

    PubMed Central

    Cano-Cappellacci, Marcelo; Leyton, Fernanda Aleitte; Carreño, Joshua Durán

    2016-01-01

    ABSTRACT OBJECTIVE To validate a Spanish version of the Test of Gross Motor Development (TGMD-2) for the Chilean population. METHODS Descriptive, transversal, non-experimental validity and reliability study. Four translators, three experts and 92 Chilean children, from five to 10 years, students from a primary school in Santiago, Chile, have participated. The Committee of Experts has carried out translation, back-translation and revision processes to determine the translinguistic equivalence and content validity of the test, using the content validity index in 2013. In addition, a pilot implementation was achieved to determine test reliability in Spanish, by using the intraclass correlation coefficient and Bland-Altman method. We evaluated whether the results presented significant differences by replacing the bat with a racket, using T-test. RESULTS We obtained a content validity index higher than 0.80 for language clarity and relevance of the TGMD-2 for children. There were significant differences in the object control subtest when comparing the results with bat and racket. The intraclass correlation coefficient for reliability inter-rater, intra-rater and test-retest reliability was greater than 0.80 in all cases. CONCLUSIONS The TGMD-2 has appropriate content validity to be applied in the Chilean population. The reliability of this test is within the appropriate parameters and its use could be recommended in this population after the establishment of normative data, setting a further precedent for the validation in other Latin American countries. PMID:26815160

  11. Clinical Functional Capacity Testing in Patients With Facioscapulohumeral Muscular Dystrophy: Construct Validity and Interrater Reliability of Antigravity Tests.

    PubMed

    Rijken, Noortje H; van Engelen, Baziel G; Weerdesteyn, Vivian; Geurts, Alexander C

    2015-12-01

    To evaluate the construct validity and interrater reliability of 4 simple antigravity tests in a small group of patients with facioscapulohumeral muscular dystrophy (FSHD). Case-control study. University medical center. Patients with various severity levels of FSHD (n=9) and healthy control subjects (n=10) were included (N=19). Not applicable. A 4-point ordinal scale was designed to grade performance on the following 4 antigravity tests: sit to stance, stance to sit, step up, and step down. In addition, the 6-minute walk test, 10-m walking test, Berg Balance Scale, and timed Up and Go test were administered as conventional tests. Construct validity was determined by linear regression analysis using the Clinical Severity Score (CSS) as the dependent variable. Interrater agreement was tested using a κ analysis. Patients with FSHD performed worse on all 4 antigravity tests compared with the controls. Stronger correlations were found within than between test categories (antigravity vs conventional). The antigravity tests revealed the highest explained variance with regard to the CSS (R(2)=.86, P=.014). Interrater agreement was generally good. The results of this exploratory study support the construct validity and interrater reliability of the proposed antigravity tests for the assessment of functional capacity in patients with FSHD taking into account the use of compensatory strategies. Future research should further validate these results in a larger sample of patients with FSHD. Copyright © 2015 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  12. Acoustic-Structure Interaction in Rocket Engines: Validation Testing

    NASA Technical Reports Server (NTRS)

    Davis, R. Benjamin; Joji, Scott S.; Parks, Russel A.; Brown, Andrew M.

    2009-01-01

    While analyzing a rocket engine component, it is often necessary to account for any effects that adjacent fluids (e.g., liquid fuels or oxidizers) might have on the structural dynamics of the component. To better characterize the fully coupled fluid-structure system responses, an analytical approach that models the system as a coupled expansion of rigid wall acoustic modes and in vacuo structural modes has been proposed. The present work seeks to experimentally validate this approach. To experimentally observe well-coupled system modes, the test article and fluid cavities are designed such that the uncoupled structural frequencies are comparable to the uncoupled acoustic frequencies. The test measures the natural frequencies, mode shapes, and forced response of cylindrical test articles in contact with fluid-filled cylindrical and/or annular cavities. The test article is excited with a stinger and the fluid-loaded response is acquired using a laser-doppler vibrometer. The experimentally determined fluid-loaded natural frequencies are compared directly to the results of the analytical model. Due to the geometric configuration of the test article, the analytical model is found to be valid for natural modes with circumferential wave numbers greater than four. In the case of these modes, the natural frequencies predicted by the analytical model demonstrate excellent agreement with the experimentally determined natural frequencies.

  13. Initial Teacher Licensure Testing in Tennessee: Test Validation.

    ERIC Educational Resources Information Center

    Bowman, Harry L.; Petry, John R.

    In 1988 a study was conducted to determine the validity of candidate teacher licensure examinations for use in Tennessee under the 1984 Comprehensive Education Reform Act. The Department of Education conducted a study to determine the validity of 11 previously unvalidated or extensively revised tests for certification and to make recommendations…

  14. Evaluating Test Validity: Reprise and Progress

    ERIC Educational Resources Information Center

    Shepard, Lorrie A.

    2016-01-01

    The AERA, APA, NCME Standards define validity as "the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests". A century of disagreement about validity does not mean that there has not been substantial progress. This consensus definition brings together interpretations and use so that it…

  15. Construct Validity of Neuropsychological Tests in Schizophrenia.

    ERIC Educational Resources Information Center

    Allen, Daniel N.; Aldarondo, Felito; Goldstein, Gerald; Huegel, Stephen G.; Gilbertson, Mark; van Kammen, Daniel P.

    1998-01-01

    The construct validity of neuropsychological tests in patients with schizophrenia was studied with 39 patients who were evaluated with a battery of six tests assessing attention, memory, and abstract reasoning abilities. Results support the construct validity of the neuropsychological tests in patients with schizophrenia. (SLD)

  16. Valid methods: the quality assurance of test method development, validation, approval, and transfer for veterinary testing laboratories.

    PubMed

    Wiegers, Ann L

    2003-07-01

    Third-party accreditation is a valuable tool to demonstrate a laboratory's competence to conduct testing. Accreditation, internationally and in the United States, has been discussed previously. However, accreditation is only I part of establishing data credibility. A validated test method is the first component of a valid measurement system. Validation is defined as confirmation by examination and the provision of objective evidence that the particular requirements for a specific intended use are fulfilled. The international and national standard ISO/IEC 17025 recognizes the importance of validated methods and requires that laboratory-developed methods or methods adopted by the laboratory be appropriate for the intended use. Validated methods are therefore required and their use agreed to by the client (i.e., end users of the test results such as veterinarians, animal health programs, and owners). ISO/IEC 17025 also requires that the introduction of methods developed by the laboratory for its own use be a planned activity conducted by qualified personnel with adequate resources. This article discusses considerations and recommendations for the conduct of veterinary diagnostic test method development, validation, evaluation, approval, and transfer to the user laboratory in the ISO/IEC 17025 environment. These recommendations are based on those of nationally and internationally accepted standards and guidelines, as well as those of reputable and experienced technical bodies. They are also based on the author's experience in the evaluation of method development and transfer projects, validation data, and the implementation of quality management systems in the area of method development.

  17. Urine specimen validity test for drug abuse testing in workplace and court settings.

    PubMed

    Lin, Shin-Yu; Lee, Hei-Hwa; Lee, Jong-Feng; Chen, Bai-Hsiun

    2018-01-01

    In recent decades, urine drug testing in the workplace has become common in many countries in the world. There have been several studies concerning the use of the urine specimen validity test (SVT) for drug abuse testing administered in the workplace. However, very little data exists concerning the urine SVT on drug abuse tests from court specimens, including dilute, substituted, adulterated, and invalid tests. We investigated 21,696 submitted urine drug test samples for SVT from workplace and court settings in southern Taiwan over 5 years. All immunoassay screen-positive urine specimen drug tests were confirmed by gas chromatography/mass spectrometry. We found that the mean 5-year prevalence of tampering (dilute, substituted, or invalid tests) in urine specimens from the workplace and court settings were 1.09% and 3.81%, respectively. The mean 5-year percentage of dilute, substituted, and invalid urine specimens from the workplace were 89.2%, 6.8%, and 4.1%, respectively. The mean 5-year percentage of dilute, substituted, and invalid urine specimens from the court were 94.8%, 1.4%, and 3.8%, respectively. No adulterated cases were found among the workplace or court samples. The most common drug identified from the workplace specimens was amphetamine, followed by opiates. The most common drug identified from the court specimens was ketamine, followed by amphetamine. We suggest that all urine specimens taken for drug testing from both the workplace and court settings need to be tested for validity. Copyright © 2017. Published by Elsevier B.V.

  18. Alphabus Mechanical Validation Plan and Test Campaign

    NASA Astrophysics Data System (ADS)

    Calvisi, G.; Bonnet, D.; Belliol, P.; Lodereau, P.; Redoundo, R.

    2012-07-01

    A joint team of the two leading European satellite companies (Astrium and Thales Alenia Space) worked with the support of ESA and CNES to define a product line able to efficiently address the upper segment of communications satellites : Alphabus Starting in 2009 and up to 2011 the mechanical validation of the Alphabus platform has been obtained thanks to static tests performed on dedicated static model and to environmental test performed on the first satellite based on Alphabus: Alphasat I-XL. The mechanical validation of the Alphabus platform presented an excellent opportunity to improve the validation and qualification process, with respect to static, sine vibrations, acoustic and L/V shock environment, minimizing recurrent cost of manufacturing, integration and testing. A main driver on mechanical testing is that mechanical acceptance testing at satellite level will be performed with empty tanks due to technical constraints (limitation of existing vibration devices) and programmatic advantages (test risk reduction, test schedule minimization). In this paper the impacts that such testing logic have on validation plan are briefly recalled and its actual application for Alphasat PFM mechanical test campaign is detailed.

  19. A Note on Economic Content and Test Validity.

    ERIC Educational Resources Information Center

    Soper, John C.; Brenneke, Judith Staley

    1987-01-01

    Offers practical tips on how teachers can determine whether classroom tests are actually measuring what they are designed to measure. Discusses criterion-related validity, construct validity, and content validity. Demonstrates how to determine the degree of content validity a particular test may have for a particular course or unit. (Author/DH)

  20. Validation of the Lollipop Test: A Diagnostic Screening Test of School Readiness.

    ERIC Educational Resources Information Center

    Chew, Alex L.; Morris, John D.

    1984-01-01

    The validity of the Lollipop Test: A Diagnostic Screening Test of School Readiness was examined using the Metropolitan Readiness Test (MRT), Level I, Form Q, as the criterion. Appreciable concurrent validity was found across test batteries. Implications for school readiness screening are discussed. (Author/BS)

  1. Performance validity testing in neuropsychology: a clinical guide, critical review, and update on a rapidly evolving literature.

    PubMed

    Lippa, Sara M

    2018-04-01

    Over the past two decades, there has been much research on measures of response bias and myriad measures have been validated in a variety of clinical and research samples. This critical review aims to guide clinicians through the use of performance validity tests (PVTs) from test selection and administration through test interpretation and feedback. Recommended cutoffs and relevant test operating characteristics are presented. Other important issues to consider during test selection, administration, interpretation, and feedback are discussed including order effects, coaching, impact on test data, and methods to combine measures and improve predictive power. When interpreting performance validity measures, neuropsychologists must use particular caution in cases of dementia, low intelligence, English as a second language/minority cultures, or low education. PVTs provide valuable information regarding response bias and, under the right circumstances, can provide excellent evidence of response bias. Only after consideration of the entire clinical picture, including validity test performance, can concrete determinations regarding the validity of test data be made.

  2. Development of an Axisymmetric Afterbody Test Case for Turbulent Flow Separation Validation

    NASA Technical Reports Server (NTRS)

    Disotell, Kevin J.; Rumsey, Christopher L.

    2017-01-01

    As identified in the CFD Vision 2030 Study commissioned by NASA, validation of advanced RANS models and scale-resolving methods for computing turbulent flows must be supported by improvements in high-quality experiments designed specifically for CFD implementation. A new test platform referred to as the Axisymmetric Afterbody allows for a range of flow behaviors to be studied on interchangeable afterbodies while facilitating access to higher Reynolds number facilities. A priori RANS computations are reported for a risk-reduction configuration to demonstrate critical variation among turbulence model results for a given afterbody, ranging from barely-attached to mild separated flow. The effects of body nose geometry and tunnel-wall boundary condition on the computed afterbody flow are explored to inform the design of an experimental test program.

  3. 14 CFR 91.1041 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 14 Aeronautics and Space 2 2014-01-01 2014-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

  4. 14 CFR 91.1041 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 2 2012-01-01 2012-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

  5. 14 CFR 91.1041 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 2 2013-01-01 2013-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

  6. 14 CFR 91.1041 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 14 Aeronautics and Space 2 2011-01-01 2011-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

  7. 14 CFR 91.1041 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 14 Aeronautics and Space 2 2010-01-01 2010-01-01 false Aircraft proving and validation tests. 91... Ownership Operations Program Management § 91.1041 Aircraft proving and validation tests. (a) No program... tests. However, pilot flight training may be conducted during the proving tests. (d) Validation testing...

  8. Radiant Energy Measurements from a Scaled Jet Engine Axisymmetric Exhaust Nozzle for a Baseline Code Validation Case

    NASA Technical Reports Server (NTRS)

    Baumeister, Joseph F.

    1994-01-01

    A non-flowing, electrically heated test rig was developed to verify computer codes that calculate radiant energy propagation from nozzle geometries that represent aircraft propulsion nozzle systems. Since there are a variety of analysis tools used to evaluate thermal radiation propagation from partially enclosed nozzle surfaces, an experimental benchmark test case was developed for code comparison. This paper briefly describes the nozzle test rig and the developed analytical nozzle geometry used to compare the experimental and predicted thermal radiation results. A major objective of this effort was to make available the experimental results and the analytical model in a format to facilitate conversion to existing computer code formats. For code validation purposes this nozzle geometry represents one validation case for one set of analysis conditions. Since each computer code has advantages and disadvantages based on scope, requirements, and desired accuracy, the usefulness of this single nozzle baseline validation case can be limited for some code comparisons.

  9. Evidence of Construct Validity in Published Achievement Tests.

    ERIC Educational Resources Information Center

    Nolet, Victor; Tindal, Gerald

    Valid interpretation of test scores is the shared responsibility of the test designer and the test user. Test publishers must provide evidence of the validity of the decisions their tests are intended to support, while test users are responsible for analyzing this evidence and subsequently using the test in the manner indicated by the publisher.…

  10. The influence of validity criteria on Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) test-retest reliability among high school athletes.

    PubMed

    Brett, Benjamin L; Solomon, Gary S

    2017-04-01

    Research findings to date on the stability of Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) Composite scores have been inconsistent, requiring further investigation. The use of test validity criteria across these studies also has been inconsistent. Using multiple measures of stability, we examined test-retest reliability of repeated ImPACT baseline assessments in high school athletes across various validity criteria reported in previous studies. A total of 1146 high school athletes completed baseline cognitive testing using the online ImPACT test battery at two time periods of approximately two-year intervals. No participant sustained a concussion between assessments. Five forms of validity criteria used in previous test-retest studies were applied to the data, and differences in reliability were compared. Intraclass correlation coefficients (ICCs) ranged in composite scores from .47 (95% confidence interval, CI [.38, .54]) to .83 (95% CI [.81, .85]) and showed little change across a two-year interval for all five sets of validity criteria. Regression based methods (RBMs) examining the test-retest stability demonstrated a lack of significant change in composite scores across the two-year interval for all forms of validity criteria, with no cases falling outside the expected range of 90% confidence intervals. The application of more stringent validity criteria does not alter test-retest reliability, nor does it account for some of the variation observed across previously performed studies. As such, use of the ImPACT manual validity criteria should be utilized in the determination of test validity and in the individualized approach to concussion management. Potential future efforts to improve test-retest reliability are discussed.

  11. Validity and reliability of the NAB Naming Test.

    PubMed

    Sachs, Bonnie C; Rush, Beth K; Pedraza, Otto

    2016-05-01

    Confrontation naming is commonly assessed in neuropsychological practice, but few standardized measures of naming exist and those that do are susceptible to the effects of education and culture. The Neuropsychological Assessment Battery (NAB) Naming Test is a 31-item measure used to assess confrontation naming. Despite adequate psychometric information provided by the test publisher, there has been limited independent validation of the test. In this study, we investigated the convergent and discriminant validity, internal consistency, and alternate forms reliability of the NAB Naming Test in a sample of adults (Form 1: n = 247, Form 2: n = 151) clinically referred for neuropsychological evaluation. Results indicate adequate-to-good internal consistency and alternate forms reliability. We also found strong convergent validity as demonstrated by relationships with other neurocognitive measures. We found preliminary evidence that the NAB Naming Test demonstrates a more pronounced ceiling effect than other commonly used measures of naming. To our knowledge, this represents the largest published independent validation study of the NAB Naming Test in a clinical sample. Our findings suggest that the NAB Naming Test demonstrates adequate validity and reliability and merits consideration in the test arsenal of clinical neuropsychologists.

  12. College Text Test Validity.

    ERIC Educational Resources Information Center

    McAfee, Donald C.

    1979-01-01

    A team of faculty members and graduate students identified major concepts and developed validated test questions for two widely used textbooks in personal hygiene classes in order to standardize norms for classes and supplement inadequate instructor's manuals. (JMF)

  13. Evaluating the Content Validity of Multistage-Adaptive Tests

    ERIC Educational Resources Information Center

    Crotts, Katrina; Sireci, Stephen G.; Zenisky, April

    2012-01-01

    Validity evidence based on test content is important for educational tests to demonstrate the degree to which they fulfill their purposes. Most content validity studies involve subject matter experts (SMEs) who rate items that comprise a test form. In computerized-adaptive testing, examinees take different sets of items and test "forms"…

  14. Testing and Validating Machine Learning Classifiers by Metamorphic Testing☆

    PubMed Central

    Xie, Xiaoyuan; Ho, Joshua W. K.; Murphy, Christian; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2011-01-01

    Machine Learning algorithms have provided core functionality to many application domains - such as bioinformatics, computational linguistics, etc. However, it is difficult to detect faults in such applications because often there is no “test oracle” to verify the correctness of the computed outputs. To help address the software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique “metamorphic testing”, which has been shown to be effective to alleviate the oracle problem. Also presented include a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has high effectiveness in killing mutants, and that observing expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program. PMID:21532969

  15. Reliability and Validity in Hospital Case-Mix Measurement

    PubMed Central

    Pettengill, Julian; Vertrees, James

    1982-01-01

    There is widespread interest in the development of a measure of hospital output. This paper describes the problem of measuring the expected cost of the mix of inpatient cases treated in a hospital (hospital case-mix) and a general approach to its solution. The solution is based on a set of homogenous groups of patients, defined by a patient classification system, and a set of estimated relative cost weights corresponding to the patient categories. This approach is applied to develop a summary measure of the expected relative costliness of the mix of Medicare patients treated in 5,576 participating hospitals. The Medicare case-mix index is evaluated by estimating a hospital average cost function. This provides a direct test of the hypothesis that the relationship between Medicare case-mix and Medicare cost per case is proportional. The cost function analysis also provides a means of simulating the effects of classification error on our estimate of this relationship. Our results indicate that this general approach to measuring hospital case-mix provides a valid and robust measure of the expected cost of a hospital's case-mix. PMID:10309909

  16. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 15 Commerce and Foreign Trade 3 2013-01-01 2013-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  17. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 15 Commerce and Foreign Trade 3 2014-01-01 2014-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  18. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 15 Commerce and Foreign Trade 3 2012-01-01 2012-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  19. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 15 Commerce and Foreign Trade 3 2011-01-01 2011-01-01 false Format validation software testing... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying, as far as reasonable and practicable, that CEVAD's data testing software performs the checks, as...

  20. 15 CFR 995.27 - Format validation software testing.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 15 Commerce and Foreign Trade 3 2010-01-01 2010-01-01 false Format validation software testing... CERTIFICATION REQUIREMENTS FOR NOAA HYDROGRAPHIC PRODUCTS AND SERVICES CERTIFICATION REQUIREMENTS FOR... of NOAA ENC Products § 995.27 Format validation software testing. Tests shall be performed verifying...

  1. 14 CFR 135.145 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 14 Aeronautics and Space 3 2011-01-01 2011-01-01 false Aircraft proving and validation tests. 135... Aircraft and Equipment § 135.145 Aircraft proving and validation tests. (a) No certificate holder may...) Validation testing is required to determine that a certificate holder is capable of conducting operations...

  2. 14 CFR 135.145 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 14 Aeronautics and Space 3 2013-01-01 2013-01-01 false Aircraft proving and validation tests. 135... Aircraft and Equipment § 135.145 Aircraft proving and validation tests. (a) No certificate holder may...) Validation testing is required to determine that a certificate holder is capable of conducting operations...

  3. 14 CFR 135.145 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 14 Aeronautics and Space 3 2010-01-01 2010-01-01 false Aircraft proving and validation tests. 135... Aircraft and Equipment § 135.145 Aircraft proving and validation tests. (a) No certificate holder may...) Validation testing is required to determine that a certificate holder is capable of conducting operations...

  4. 14 CFR 135.145 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 14 Aeronautics and Space 3 2014-01-01 2014-01-01 false Aircraft proving and validation tests. 135... Aircraft and Equipment § 135.145 Aircraft proving and validation tests. (a) No certificate holder may...) Validation testing is required to determine that a certificate holder is capable of conducting operations...

  5. 14 CFR 135.145 - Aircraft proving and validation tests.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 14 Aeronautics and Space 3 2012-01-01 2012-01-01 false Aircraft proving and validation tests. 135... Aircraft and Equipment § 135.145 Aircraft proving and validation tests. (a) No certificate holder may...) Validation testing is required to determine that a certificate holder is capable of conducting operations...

  6. Minimizing false positive error with multiple performance validity tests: response to Bilder, Sugar, and Hellemann (2014 this issue).

    PubMed

    Larrabee, Glenn J

    2014-01-01

    Bilder, Sugar, and Hellemann (2014 this issue) contend that empirical support is lacking for use of multiple performance validity tests (PVTs) in evaluation of the individual case, differing from the conclusions of Davis and Millis (2014), and Larrabee (2014), who found no substantial increase in false positive rates using a criterion of failure of ≥ 2 PVTs and/or Symptom Validity Tests (SVTs) out of multiple tests administered. Reconsideration of data presented in Larrabee (2014) supports a criterion of ≥ 2 out of up to 7 PVTs/SVTs, as keeping false positive rates close to and in most cases below 10% in cases with bona fide neurologic, psychiatric, and developmental disorders. Strategies to minimize risk of false positive error are discussed, including (1) adjusting individual PVT cutoffs or criterion for number of PVTs failed, for examinees who have clinical histories placing them at risk for false positive identification (e.g., severe TBI, schizophrenia), (2) using the history of the individual case to rule out conditions known to result in false positive errors, (3) using normal performance in domains mimicked by PVTs to show that sufficient native ability exists for valid performance on the PVT(s) that have been failed, and (4) recognizing that as the number of PVTs/SVTs failed increases, the likelihood of valid clinical presentation decreases, with a corresponding increase in the likelihood of invalid test performance and symptom report.

  7. Valid statistical inference methods for a case-control study with missing data.

    PubMed

    Tian, Guo-Liang; Zhang, Chi; Jiang, Xuejun

    2018-04-01

    The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case-control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case-control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case-control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion by the Wald test for testing independency under the two existing sampling distributions could be completely different (even contradictory) from the Wald test for testing the equality of the success probabilities in control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.

  8. Construct Validity of the Nepalese School Leaving English Reading Test

    ERIC Educational Resources Information Center

    Dawadi, Saraswati; Shrestha, Prithvi N.

    2018-01-01

    There has been a steady interest in investigating the validity of language tests in the last decades. Despite numerous studies on construct validity in language testing, there are not many studies examining the construct validity of a reading test. This paper reports on a study that explored the construct validity of the English reading test in…

  9. Test Takers and the Validity of Score Interpretations

    ERIC Educational Resources Information Center

    Kopriva, Rebecca J.; Thurlow, Martha L.; Perie, Marianne; Lazarus, Sheryl S.; Clark, Amy

    2016-01-01

    This article argues that test takers are as integral to determining validity of test scores as defining target content and conditioning inferences on test use. A principled sustained attention to how students interact with assessment opportunities is essential, as is a principled sustained evaluation of evidence confirming the validity or calling…

  10. Mars Ascent Vehicle Test Requirements and Terrestrial Validation

    NASA Technical Reports Server (NTRS)

    Dankanich, John W.; Cathey, Henry M.; Smith, David A.

    2011-01-01

    The Mars robotic sample return mission has been a potential flagship mission for NASA s science mission directorate for decades. The Mars Exploration Program and the planetary science decadal survey have highlighted both the science return of the Mars Sample Return mission, but also the need for risk reduction through technology development. One of the critical elements of the MSR mission is the Mars Ascent Vehicle, which must launch the sample from the surface of Mars and place it into low Mars orbit. The MAV has significant challenges to overcome due to the Martian environments and the Entry Descent and Landing system constraints. Launch vehicles typically have a relatively low success probability for early flights, and a thorough system level validation is warranted. The MAV flight environments are challenging and in some cases impossible to replicate terrestrially. The expected MAV environments have been evaluated and a first look of potential system test options has been explored. The terrestrial flight requirements and potential validation options are presented herein.

  11. Automated smartphone audiometry: Validation of a word recognition test app.

    PubMed

    Dewyer, Nicholas A; Jiradejvong, Patpong; Henderson Sabes, Jennifer; Limb, Charles J

    2018-03-01

    Develop and validate an automated smartphone word recognition test. Cross-sectional case-control diagnostic test comparison. An automated word recognition test was developed as an app for a smartphone with earphones. English-speaking adults with recent audiograms and various levels of hearing loss were recruited from an audiology clinic and were administered the smartphone word recognition test. Word recognition scores determined by the smartphone app and the gold standard speech audiometry test performed by an audiologist were compared. Test scores for 37 ears were analyzed. Word recognition scores determined by the smartphone app and audiologist testing were in agreement, with 86% of the data points within a clinically acceptable margin of error and a linear correlation value between test scores of 0.89. The WordRec automated smartphone app accurately determines word recognition scores. 3b. Laryngoscope, 128:707-712, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.

  12. Predictive Validity Study of the APS Writing and Reading Tests [and] Validating Placement Rules for the APS Writing Test.

    ERIC Educational Resources Information Center

    College of the Canyons, Valencia, CA. Office of Institutional Development.

    California's College of the Canyons has used the College Board Assessment and Placement Services (APS) test to assess students' abilities in basic and college English since spring 1993. These two reports summarize data from a May 1994 study of the predictive validity of the APS writing and reading tests and a June 1994 effort to validate the cut…

  13. Validating use of a critical thinking test for the dental admission test.

    PubMed

    Tsai, Tsung-Hsun

    2014-04-01

    The purpose of this study was to validate the use of a test to assess dental school applicants' critical thinking abilities. The intent was to include this test on the Dental Admission Test (DAT) if it was shown to enhance the DAT's validity. Correlation and regression analyses of undergraduate and dental school performance with scores on each of the tests on the DAT battery and the California Critical Thinking Skills Test (CCTST) were performed. Data were collected from 439 third- and fourth-year dental students who consented to participate and were enrolled at one of the ten accredited dental schools included in the study. These ten dental schools were from most regions of the United States. This study concluded that including the CCTST on the DAT did not significantly enhance the DAT's validity.

  14. Herbalife hepatotoxicity: Evaluation of cases with positive reexposure tests.

    PubMed

    Teschke, Rolf; Frenzel, Christian; Schulze, Johannes; Schwarzenboeck, Alexander; Eickhoff, Axel

    2013-07-27

    To analyze the validity of applied test criteria and causality assessment methods in assumed Herbalife hepatotoxicity with positive reexposure tests. We searched the Medline database for suspected cases of Herbalife hepatotoxicity and retrieved 53 cases including eight cases with a positive unintentional reexposure and a high causality level for Herbalife. First, analysis of these eight cases focused on the data quality of the positive reexposure cases, requiring a baseline value of alanine aminotransferase (ALT) < 5 upper limit of normal (N) before reexposure, with N as the upper limit of normal, and a doubling of the ALT value at reexposure as compared to the ALT value at baseline prior to reexposure. Second, reported methods to assess causality in the eight cases were evaluated, and then the liver specific Council for International Organizations of Medical Sciences (CIOMS) scale validated for hepatotoxicity cases was used for quantitative causality reevaluation. This scale consists of various specific elements with scores provided through the respective case data, and the sum of the scores yields a causality grading for each individual case of initially suspected hepatotoxicity. Details of positive reexposure test conditions and their individual results were scattered in virtually all cases, since reexposures were unintentional and allowed only retrospective rather than prospective assessments. In 1/8 cases, criteria for a positive reexposure were fulfilled, whereas in the remaining cases the reexposure test was classified as negative (n = 1), or the data were considered as uninterpretable due to missing information to comply adequately with the criteria (n = 6). In virtually all assessed cases, liver unspecific causality assessment methods were applied rather than a liver specific method such as the CIOMS scale. Using this scale, causality gradings for Herbalife in these eight cases were probable (n = 1), unlikely (n = 4), and excluded (n = 3). Confounding

  15. Herbalife hepatotoxicity: Evaluation of cases with positive reexposure tests

    PubMed Central

    Teschke, Rolf; Frenzel, Christian; Schulze, Johannes; Schwarzenboeck, Alexander; Eickhoff, Axel

    2013-01-01

    AIM: To analyze the validity of applied test criteria and causality assessment methods in assumed Herbalife hepatotoxicity with positive reexposure tests. METHODS: We searched the Medline database for suspected cases of Herbalife hepatotoxicity and retrieved 53 cases including eight cases with a positive unintentional reexposure and a high causality level for Herbalife. First, analysis of these eight cases focused on the data quality of the positive reexposure cases, requiring a baseline value of alanine aminotransferase (ALT) < 5 upper limit of normal (N) before reexposure, with N as the upper limit of normal, and a doubling of the ALT value at reexposure as compared to the ALT value at baseline prior to reexposure. Second, reported methods to assess causality in the eight cases were evaluated, and then the liver specific Council for International Organizations of Medical Sciences (CIOMS) scale validated for hepatotoxicity cases was used for quantitative causality reevaluation. This scale consists of various specific elements with scores provided through the respective case data, and the sum of the scores yields a causality grading for each individual case of initially suspected hepatotoxicity. RESULTS: Details of positive reexposure test conditions and their individual results were scattered in virtually all cases, since reexposures were unintentional and allowed only retrospective rather than prospective assessments. In 1/8 cases, criteria for a positive reexposure were fulfilled, whereas in the remaining cases the reexposure test was classified as negative (n = 1), or the data were considered as uninterpretable due to missing information to comply adequately with the criteria (n = 6). In virtually all assessed cases, liver unspecific causality assessment methods were applied rather than a liver specific method such as the CIOMS scale. Using this scale, causality gradings for Herbalife in these eight cases were probable (n = 1), unlikely (n = 4), and excluded (n

  16. The Concurrent Validity of Four Tests of Metalinguistic Awareness.

    ERIC Educational Resources Information Center

    Day, Kaaren C.; Day, H. D.

    1991-01-01

    Examines the concurrent validity of four metalinguistic awareness tests (Written Language Awareness Test, Test of Early Reading Ability, Linguistic Awareness in Reading Readiness Test, and the Concepts about Print Test). Finds rather low concurrent validity coefficients which suggests that further work is needed to clarify the operations required…

  17. Cross-cultural adaptation and validation of the sino-nasal outcome test (SNOT-22) for Spanish-speaking patients.

    PubMed

    de los Santos, Gonzalo; Reyes, Pablo; del Castillo, Raúl; Fragola, Claudio; Royuela, Ana

    2015-11-01

    Our objective was to perform translation, cross-cultural adaptation and validation of the sino-nasal outcome test 22 (SNOT-22) to Spanish language. SNOT-22 was translated, back translated, and a pretest trial was performed. The study included 119 individuals divided into 60 cases, who met diagnostic criteria for chronic rhinosinusitis according to the European Position Paper on Rhinosinusitis 2012; and 59 controls, who reported no sino-nasal disease. Internal consistency was evaluated with Cronbach's alpha test, reproducibility with Kappa coefficient, reliability with intraclass correlation coefficient (ICC), validity with Mann-Whitney U test and responsiveness with Wilcoxon test. In cases, Cronbach's alpha was 0.91 both before and after treatment, as for controls, it was 0.90 at their first test assessment and 0.88 at 3 weeks. Kappa coefficient was calculated for each item, with an average score of 0.69. ICC was also performed for each item, with a score of 0.87 in the overall score and an average among all items of 0.71. Median score for cases was 47, and 2 for controls, finding the difference to be highly significant (Mann-Whitney U test, p < 0.001). Clinical changes were observed among treated patients, with a median score of 47 and 13.5 before and after treatment, respectively (Wilcoxon test, p < 0.001). The effect size resulted in 0.14 in treated patients whose status at 3 weeks was unvarying; 1.03 in those who were better and 1.89 for much better group. All controls were unvarying with an effect size of 0.05. The Spanish version of the SNOT-22 has the internal consistency, reliability, reproducibility, validity and responsiveness necessary to be a valid instrument to be used in clinical practice.

  18. Validation studies and proficiency testing.

    PubMed

    Ankilam, Elke; Heinze, Petra; Kay, Simon; Van den Eede, Guy; Popping, Bert

    2002-01-01

    Genetically modified organisms (GMOs) entered the European food market in 1996. Current legislation demands the labeling of food products if they contain <1% GMO, as assessed for each ingredient of the product. To create confidence in the testing methods and to complement enforcement requirements, there is an urgent need for internationally validated methods, which could serve as reference methods. To date, several methods have been submitted to validation trials at an international level; approaches now exist that can be used in different circumstances and for different food matrixes. Moreover, the requirement for the formal validation of methods is clearly accepted; several national and international bodies are active in organizing studies. Further validation studies, especially on the quantitative polymerase chain reaction methods, need to be performed to cover the rising demand for new extraction methods and other background matrixes, as well as for novel GMO constructs.

  19. 40 CFR 86.1341-90 - Test cycle validation criteria.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 19 2011-07-01 2011-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...

  20. 40 CFR 86.1341-90 - Test cycle validation criteria.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 20 2013-07-01 2013-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...

  1. 40 CFR 86.1341-90 - Test cycle validation criteria.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 20 2012-07-01 2012-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag... brake horsepower-hour. (c) Regression line analysis to calculate validation statistics. (1) Linear...

  2. [Validation of three screening tests used for early detection of cervical cancer].

    PubMed

    Rodriguez-Reyes, Esperanza Rosalba; Cerda-Flores, Ricardo M; Quiñones-Pérez, Juan M; Cortés-Gutiérrez, Elva I

    2008-01-01

    to evaluate the validity (sensitivity, specificity, and accuracy) of three screening methods used in the early detection of the cervical carcinoma versus the histopathology diagnosis. a selected sample of 107 women attended in the Opportune Detection of Cervicouterine Cancer Program in the Hospital de Zona 46, Instituto Mexicano del Seguro Social in Durango, during the 2003 was included. The application of Papa-nicolaou, acetic acid test, and molecular detection of human papillomavirus, and histopatholgy diagnosis were performed in all the patients at the time of the gynecological exam. The detection and tipification of the human papillomavirus was performed by polymerase chain reaction (PCR) and analysis of polymorphisms of length of restriction fragments (RFLP). Histopathology diagnosis was considered the gold standard. The evaluation of the validity was carried out by the Bayesian method for diagnosis test. the positive cases for acetic acid test, Papanicolaou, and PCR were 47, 22, and 19. The accuracy values were 0.70, 0.80 and 0.99, respectively. since the molecular method showed a greater validity in the early detection of the cervical carcinoma we considered of vital importance its implementation in suitable programs of Opportune Detection of Cervicouterino Cancer Program in Mexico. However, in order to validate this conclusion, cross-sectional studies in different region of country must be carried out.

  3. 40 CFR 86.1341-98 - Test cycle validation criteria.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 20 2012-07-01 2012-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-98 Test cycle validation criteria. Section 86.1341-98 includes text that specifies...-90 (d)(4), shall be excluded from both cycle validation and the integrated work used for emissions...

  4. 40 CFR 86.1341-98 - Test cycle validation criteria.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 20 2013-07-01 2013-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-98 Test cycle validation criteria. Section 86.1341-98 includes text that specifies...-90 (d)(4), shall be excluded from both cycle validation and the integrated work used for emissions...

  5. 40 CFR 86.1341-98 - Test cycle validation criteria.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 19 2011-07-01 2011-07-01 false Test cycle validation criteria. 86... Procedures § 86.1341-98 Test cycle validation criteria. Section 86.1341-98 includes text that specifies...-90 (d)(4), shall be excluded from both cycle validation and the integrated work used for emissions...

  6. Students' Initial Knowledge State and Test Design: Towards a Valid and Reliable Test Instrument

    ERIC Educational Resources Information Center

    CoPo, Antonio Roland I.

    2015-01-01

    Designing a good test instrument involves specifications, test construction, validation, try-out, analysis and revision. The initial knowledge state of forty (40) tertiary students enrolled in Business Statistics course was determined and the same test instrument undergoes validation. The designed test instrument did not only reveal the baseline…

  7. Validation of a Case Definition for Pediatric Brain Injury Using Administrative Data.

    PubMed

    McChesney-Corbeil, Jane; Barlow, Karen; Quan, Hude; Chen, Guanmin; Wiebe, Samuel; Jette, Nathalie

    2017-03-01

    Health administrative data are a common population-based data source for traumatic brain injury (TBI) surveillance and research; however, before using these data for surveillance, it is important to develop a validated case definition. The objective of this study was to identify the optimal International Classification of Disease , edition 10 (ICD-10), case definition to ascertain children with TBI in emergency room (ER) or hospital administrative data. We tested multiple case definitions. Children who visited the ER were identified from the Regional Emergency Department Information System at Alberta Children's Hospital. Secondary data were collected for children with trauma, musculoskeletal, or central nervous system complaints who visited the ER between October 5, 2005, and June 6, 2007. TBI status was determined based on chart review. Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for each case definition. Of 6639 patients, 1343 had a TBI. The best case definition was, "1 hospital or 1 ER encounter coded with an ICD-10 code for TBI in 1 year" (sensitivity 69.8% [95% confidence interval (CI), 67.3-72.2], specificity 96.7% [95% CI, 96.2-97.2], PPV 84.2% [95% CI 82.0-86.3], NPV 92.7% [95% CI, 92.0-93.3]). The nonspecific code S09.9 identified >80% of TBI cases in our study. The optimal ICD-10-based case definition for pediatric TBI in this study is valid and should be considered for future pediatric TBI surveillance studies. However, external validation is recommended before use in other jurisdictions, particularly because it is plausible that a larger proportion of patients in our cohort had milder injuries.

  8. Predictive value of autoantibody testing for validating self-reported diagnoses of rheumatoid arthritis in the Women's Health Initiative.

    PubMed

    Walitt, Brian; Mackey, Rachel; Kuller, Lewis; Deane, Kevin D; Robinson, William; Holers, V Michael; Chang, Yue-Fang; Moreland, Larry

    2013-05-01

    Rheumatoid arthritis (RA) research using large databases is limited by insufficient case validity. Of 161,808 postmenopausal women in the Women's Health Initiative, 15,691 (10.2%) reported having RA, far higher than the expected 1% population prevalence. Since chart review for confirmation of an RA diagnosis is impractical in large cohort studies, the current study (2009-2011) tested the ability of baseline serum measurements of rheumatoid factor and anti-cyclic citrullinated peptide antibodies, second-generation assay (anti-CCP2), to identify physician-validated RA among the chart-review study participants with self-reported RA (n = 286). Anti-CCP2 positivity had the highest positive predictive value (PPV) (80.0%), and rheumatoid factor positivity the lowest (44.6%). Together, use of disease-modifying antirheumatic drugs and anti-CCP2 positivity increased PPV to 100% but excluded all seronegative cases (approximately 15% of all RA cases). Case definitions inclusive of seronegative cases had PPVs between 59.6% and 63.6%. False-negative results were minimized in these test definitions, as evidenced by negative predictive values of approximately 90%. Serological measurements, particularly measurement of anti-CCP2, improved the test characteristics of RA case definitions in the Women's Health Initiative.

  9. Validity and Acceptance of Color Vision Testing on Smartphones.

    PubMed

    Ozgur, Omar K; Emborgo, Trisha S; Vieyra, Mark B; Huselid, Rebecca F; Banik, Rudrani

    2018-03-01

    Ishihara color plates (ICP) are the most commonly used color vision test (CVT) worldwide. With the advent of new technologies, attempts have been made to streamline the process of CVT. As hardware and software evolve, smartphone-based testing modalities may aid ophthalmologists in performing more efficient ophthalmic examinations. We assess the validity of smartphone color vision testing (CVT) by comparing results using the Eye Handbook (EHB) CVT application with standard Ishihara color plates (ICP). Prospective case-control study of subjects 18 years and older with visual acuity of 20/100 or better at 14 inches. The study group included patients with any ocular pathology. The color vision deficient (CVD) group was patients who failed more than 2 plates. The control group had no known ocular pathology. CVT was performed with both ICP and EHB under standardized background illuminance. Eleven plates were tested with each modality. Validity of EHB CVT and acceptance of EHB CVT were analyzed. Statistical analyses were performed using Bland-Altman plot with limits of agreement (LOA) at the 95th percentile of differences in score, independent samples t tests with 95% confidence interval (CI), and Pearson χ tests. The Bland-Altman plot showed agreement between correct number of plates in EHB and ICP for the study subjects (bias, -0.25; LOA, -1.92 to 1.42). Agreement was also observed between the correct number of plates in EHB and ICP for the controls (bias, -0.01; LOA, -0.61 to 0.59) and CVD (bias, -0.50; LOA, -4.64 to 3.64) subjects. The sensitivity of EHB was 0.92 (95% CI 0.76-1.07) and the specificity of EHB was 1.00 (95% CI 1.00-1.00). Fifty-nine percent preferred EHB, 12% preferred ICP, and 29% had no preference. In healthy controls and patients with ocular pathology, there was an agreement of CVT results comparing EHB with ICP. Overall, the majority preferred EHB to ICP. These findings demonstrate that further testing is required to understand and improve the

  10. The Teenage Nonviolence Test: Concurrent and Discriminant Validity.

    ERIC Educational Resources Information Center

    Konen, Kristopher; Mayton, Daniel M., II; Delva, Zenita; Sonnen, Melinda; Dahl, William; Montgomery, Richard

    This study was designed to document the validity of the Teenage Nonviolence Test (TNT). In this study the concurrent validity of the TNT in various ways, the validity of the TNT using known groups, and the discriminant validity of the TNT by evaluating its relationships with other psychological constructs were assessed. The results showed that the…

  11. Dynamic testing in schizophrenia: does training change the construct validity of a test?

    PubMed

    Wiedl, Karl H; Schöttke, Henning; Green, Michael F; Nuechterlein, Keith H

    2004-01-01

    Dynamic testing typically involves specific interventions for a test to assess the extent to which test performance can be modified, beyond level of baseline (static) performance. This study used a dynamic version of the Wisconsin Card Sorting Test (WCST) that is based on cognitive remediation techniques within a test-training-test procedure. From results of previous studies with schizophrenia patients, we concluded that the dynamic and static versions of the WCST should have different construct validity. This hypothesis was tested by examining the patterns of correlations with measures of executive functioning, secondary verbal memory, and verbal intelligence. Results demonstrated a specific construct validity of WCST dynamic (i.e., posttest) scores as an index of problem solving (Tower of Hanoi) and secondary verbal memory and learning (Auditory Verbal Learning Test), whereas the impact of general verbal capacity and selective attention (Verbal IQ, Stroop Test) was reduced. It is concluded that the construct validity of the test changes with dynamic administration and that this difference helps to explain why the dynamic version of the WCST predicts functional outcome better than the static version.

  12. Embedded performance validity testing in neuropsychological assessment: Potential clinical tools.

    PubMed

    Rickards, Tyler A; Cranston, Christopher C; Touradji, Pegah; Bechtold, Kathleen T

    2018-01-01

    The article aims to suggest clinically-useful tools in neuropsychological assessment for efficient use of embedded measures of performance validity. To accomplish this, we integrated available validity-related and statistical research from the literature, consensus statements, and survey-based data from practicing neuropsychologists. We provide recommendations for use of 1) Cutoffs for embedded performance validity tests including Reliable Digit Span, California Verbal Learning Test (Second Edition) Forced Choice Recognition, Rey-Osterrieth Complex Figure Test Combination Score, Wisconsin Card Sorting Test Failure to Maintain Set, and the Finger Tapping Test; 2) Selecting number of performance validity measures to administer in an assessment; and 3) Hypothetical clinical decision-making models for use of performance validity testing in a neuropsychological assessment collectively considering behavior, patient reporting, and data indicating invalid or noncredible performance. Performance validity testing helps inform the clinician about an individual's general approach to tasks: response to failure, task engagement and persistence, compliance with task demands. Data-driven clinical suggestions provide a resource to clinicians and to instigate conversation within the field to make more uniform, testable decisions to further the discussion, and guide future research in this area.

  13. 40 CFR 86.1341-98 - Test cycle validation criteria.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 19 2010-07-01 2010-07-01 false Test cycle validation criteria. 86...) Emission Regulations for New Otto-Cycle and Diesel Heavy-Duty Engines; Gaseous and Particulate Exhaust Test Procedures § 86.1341-98 Test cycle validation criteria. Section 86.1341-98 includes text that specifies...

  14. Methodology for testing and validating knowledge bases

    NASA Technical Reports Server (NTRS)

    Krishnamurthy, C.; Padalkar, S.; Sztipanovits, J.; Purves, B. R.

    1987-01-01

    A test and validation toolset developed for artificial intelligence programs is described. The basic premises of this method are: (1) knowledge bases have a strongly declarative character and represent mostly structural information about different domains, (2) the conditions for integrity, consistency, and correctness can be transformed into structural properties of knowledge bases, and (3) structural information and structural properties can be uniformly represented by graphs and checked by graph algorithms. The interactive test and validation environment have been implemented on a SUN workstation.

  15. 40 CFR 86.1341-90 - Test cycle validation criteria.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 19 2010-07-01 2010-07-01 false Test cycle validation criteria. 86...) Emission Regulations for New Otto-Cycle and Diesel Heavy-Duty Engines; Gaseous and Particulate Exhaust Test Procedures § 86.1341-90 Test cycle validation criteria. (a) To minimize the biasing effect of the time lag...

  16. Validating a Spanish Developmental Spelling Test.

    ERIC Educational Resources Information Center

    Ferroli, Lou; Krajenta, Marilyn

    The creation and validation of a Spanish version of an English developmental spelling test (DST) is described. An introductory section reviews related literature on the rationale for and construction of DSTs, spelling development in the early grades, and Spanish-English bilingual education. Differences between the English and Spanish test versions…

  17. Validation of a diabetes numeracy test in Arabic.

    PubMed

    Alghodaier, Hussah; Jradi, Hoda; Mohammad, Najwa Samantha; Bawazir, Amen

    2017-01-01

    The prevalence of diabetes Mellitus in Saudi Arabia is 24%, ranking it among the top ten Worldwide. Diabetes education focuses on self-management and relies on numeracy skills. Poor numeracy may go unrecognized and it is important to have an assessment tool in Arabic to measure such a skill in diabetes care. To validate a 15-item Diabetes Numeracy Test (DNT-15) in the Arabic Language as a tool to assess the numeracy skills of patients with diabetes and to test its properties among Saudi patients with diabetes. A 15-question Arabic-language test to assess diabetes numeracy among patients with diabetes on the basis of the diabetes numeracy test (DNT-15) was validated among a sample Arabic speaking Saudi patients with diabetes. Data collection included patients' demographics, long-term glycemic control, diabetes type, duration, co-morbidities, and diabetes related knowledge questions. Internal reliability was assessed using Kuder-Richardson Formula 20 (KR-20). The average score of Arabic DNT-15 was 53.3% and took an average of 30 minutes to complete. The scores significantly correlated with education, income, HbA1c, and diabetes knowledge (p<0.05). Content Validity Ratio (CVR) of 0.75 and Content Validity Index (CVI) of 0.89 supported good content validity. The Arabic DNT-15 also had good internal reliability (KR20 = 0.90). Patients with diabetes need numeracy skills to manage their disease. Level of education does not reflect level of numeracy, and low numeracy skills might be unnoticed by health care providers. The Arabic DNT-15 is a valid and reliable scale to identify Arabic speaking patients with difficulties in certain diabetes-related numeracy skills.

  18. Validation of the Sport Competition Anxiety Test.

    ERIC Educational Resources Information Center

    Cheatham, T.; Rosentswieg, J.

    1982-01-01

    Fifteen female varsity softball coaches were administered the Sport Competition Anxiety Test prior to competition. Their heart rates, continuously monitored by tilemetry, did not relate significantly to the anxiety test data. The test does not appear to be a valid measure of trait anxiety for women softball coaches. (Author/PN)

  19. Educational testing validity and reliability in pharmacy and medical education literature.

    PubMed

    Hoover, Matthew J; Jung, Rose; Jacobs, David M; Peeters, Michael J

    2013-12-16

    To evaluate and compare the reliability and validity of educational testing reported in pharmacy education journals to medical education literature. Descriptions of validity evidence sources (content, construct, criterion, and reliability) were extracted from articles that reported educational testing of learners' knowledge, skills, and/or abilities. Using educational testing, the findings of 108 pharmacy education articles were compared to the findings of 198 medical education articles. For pharmacy educational testing, 14 articles (13%) reported more than 1 validity evidence source while 83 articles (77%) reported 1 validity evidence source and 11 articles (10%) did not have evidence. Among validity evidence sources, content validity was reported most frequently. Compared with pharmacy education literature, more medical education articles reported both validity and reliability (59%; p<0.001). While there were more scholarship of teaching and learning (SoTL) articles in pharmacy education compared to medical education, validity, and reliability reporting were limited in the pharmacy education literature.

  20. Construct Validation of the Fairy Tale Test--Standardization Data.

    ERIC Educational Resources Information Center

    Coulacoglou, Carina

    2002-01-01

    Studied the construct validity of the Fairy Tale Test (C. Coulacoglu, 1993), a personality projective test for children, in a sample of 800 Greek children aged 8, 10, and 12. Factor analysis led to identification of eight primary factors, and correlations with other measures provide construct validity evidence. (SLD)

  1. Validating Test Score Meaning and Defending Test Score Use: Different Aims, Different Methods

    ERIC Educational Resources Information Center

    Cizek, Gregory J.

    2016-01-01

    Advances in validity theory and alacrity in validation practice have suffered because the term "validity" has been used to refer to two incompatible concerns: (1) the degree of support for specified interpretations of test scores (i.e. intended score meaning) and (2) the degree of support for specified applications (i.e. intended test…

  2. WEC-SIM Validation Testing Plan FY14 Q4.

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruehl, Kelley Michelle

    2016-02-01

    The WEC-Sim project is currently on track, having met both the SNL and NREL FY14 Milestones, as shown in Table 1 and Table 2. This is also reflected in the Gantt chart uploaded to the WEC-Sim SharePoint site in the FY14 Q4 Deliverables folder. The work completed in FY14 includes code verification through code-to-code comparison (FY14 Q1 and Q2), preliminary code validation through comparison to experimental data (FY14 Q2 and Q3), presentation and publication of the WEC-Sim project at OMAE 2014 [1], [2], [3] and GMREC/METS 2014 [4] (FY14 Q3), WEC-Sim code development and public open-source release (FY14 Q3), andmore » development of a preliminary WEC-Sim validation test plan (FY14 Q4). This report presents the preliminary Validation Testing Plan developed in FY14 Q4. The validation test effort started in FY14 Q4 and will go on through FY15. Thus far the team has developed a device selection method, selected a device, and placed a contract with the testing facility, established several collaborations including industry contacts, and have working ideas on the testing details such as scaling, device design, and test conditions.« less

  3. Extended version of the "Sniffin' Sticks" identification test: test-retest reliability and validity.

    PubMed

    Sorokowska, A; Albrecht, E; Haehner, A; Hummel, T

    2015-03-30

    The extended, 32-item version of the Sniffin' Sticks identification test was developed in order to create a precise tool enabling repeated, longitudinal testing of individual olfactory subfunctions. Odors of the previous test version had to be changed for technical reasons, and the odor identification test needed re-investigation in terms of reliability, validity, and normative values. In our study we investigated olfactory abilities of a group of 100 patients with olfactory dysfunction and 100 controls. We reconfirmed the high test-retest reliability of the extended version of the Sniffin' Sticks identification test and high correlations between the new and the original part of this tool. In addition, we confirmed the validity of the test as it discriminated clearly between controls and patients with olfactory loss. The additional set of 16 odor identification sticks can be either included in the current olfactory test, thus creating a more detailed diagnosis tool, or it can be used separately, enabling to follow olfactory function over time. Additionally, the normative values presented in our paper might provide useful guidelines for interpretation of the extended identification test results. The revised version of the Sniffin' Sticks 32-item odor identification test is a reliable and valid tool for the assessment of olfactory function. Copyright © 2015 Elsevier B.V. All rights reserved.

  4. Experience with Aero- and Fluid-Dynamic Testing for Engineering and CFD Validation

    NASA Technical Reports Server (NTRS)

    Ross, James C.

    2016-01-01

    Ever since computations have been used to simulate aerodynamics the need to ensure that the computations adequately represent real life has followed. Many experiments have been performed specifically for validation and as computational methods have improved, so have the validation experiments. Validation is also a moving target because computational methods improve requiring validation for the new aspect of flow physics that the computations aim to capture. Concurrently, new measurement techniques are being developed that can help capture more detailed flow features pressure sensitive paint (PSP) and particle image velocimetry (PIV) come to mind. This paper will present various wind-tunnel tests the author has been involved with and how they were used for validation of various kinds of CFD. A particular focus is the application of advanced measurement techniques to flow fields (and geometries) that had proven to be difficult to predict computationally. Many of these difficult flow problems arose from engineering and development problems that needed to be solved for a particular vehicle or research program. In some cases the experiments required to solve the engineering problems were refined to provide valuable CFD validation data in addition to the primary engineering data. All of these experiments have provided physical insight and validation data for a wide range of aerodynamic and acoustic phenomena for vehicles ranging from tractor-trailers to crewed spacecraft.

  5. Coverage of the Test of Memory Malingering, Victoria Symptom Validity Test, and Word Memory Test on the Internet: is test security threatened?

    PubMed

    Bauer, Lyndsey; McCaffrey, Robert J

    2006-01-01

    In forensic neuropsychological settings, maintaining test security has become critically important, especially in regard to symptom validity tests (SVTs). Coaching, which can entail providing patients or litigants with information about the cognitive sequelae of head injury, or teaching them test-taking strategies to avoid detection of symptom dissimulation has been examined experimentally in many research studies. Emerging evidence supports that coaching strategies affect psychological and neuropsychological test performance to differing degrees depending on the coaching paradigm and the tests administered. The present study sought to examine Internet coverage of SVTs because it is potentially another source of coaching, or information that is readily available. Google searches were performed on the Test of Memory Malingering, the Victoria Symptom Validity Test, and the Word Memory Test. Results indicated that there is a variable amount of information available about each test that could threaten test security and validity should inappropriately interested parties find it. Steps that could be taken to improve this situation and limitations to this exploration are discussed.

  6. Development and Validation of Diagnostic Economics Test for Secondary Schools

    ERIC Educational Resources Information Center

    Eleje, Lydia I.; Esomonu, Nkechi P. M.; Agu, Ngozi N.; Okoye, Romy O.; Obasi, Emma; Onah, Frederick E.

    2016-01-01

    A diagnostic test in economics to aid the teachers determine student's specific weak content areas was developed and validated. Five research questions guided the study. Preliminary validation was done by two experienced teachers in the content area of secondary economics and two experts in test construction. The pilot testing was conducted for…

  7. 10 CFR 26.139 - Reporting initial validity and drug test results.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 10 Energy 1 2014-01-01 2014-01-01 false Reporting initial validity and drug test results. 26.139... § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall... permitted under § 26.75(h), positive test results from initial drug tests at the licensee testing facility...

  8. 10 CFR 26.139 - Reporting initial validity and drug test results.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 10 Energy 1 2012-01-01 2012-01-01 false Reporting initial validity and drug test results. 26.139... § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall... permitted under § 26.75(h), positive test results from initial drug tests at the licensee testing facility...

  9. Validation of Clinical Testing for Warfarin Sensitivity

    PubMed Central

    Langley, Michael R.; Booker, Jessica K.; Evans, James P.; McLeod, Howard L.; Weck, Karen E.

    2009-01-01

    Responses to warfarin (Coumadin) anticoagulation therapy are affected by genetic variability in both the CYP2C9 and VKORC1 genes. Validation of pharmacogenetic testing for warfarin responses includes demonstration of analytical validity of testing platforms and of the clinical validity of testing. We compared four platforms for determining the relevant single nucleotide polymorphisms (SNPs) in both CYP2C9 and VKORC1 that are associated with warfarin sensitivity (Third Wave Invader Plus, ParagonDx/Cepheid Smart Cycler, Idaho Technology LightCycler, and AutoGenomics Infiniti). Each method was examined for accuracy, cost, and turnaround time. All genotyping methods demonstrated greater than 95% accuracy for identifying the relevant SNPs (CYP2C9 *2 and *3; VKORC1 −1639 or 1173). The ParagonDx and Idaho Technology assays had the shortest turnaround and hands-on times. The Third Wave assay was readily scalable to higher test volumes but had the longest hands-on time. The AutoGenomics assay interrogated the largest number of SNPs but had the longest turnaround time. Four published warfarin-dosing algorithms (Washington University, UCSF, Louisville, and Newcastle) were compared for accuracy for predicting warfarin dose in a retrospective analysis of a local patient population on long-term, stable warfarin therapy. The predicted doses from both the Washington University and UCSF algorithms demonstrated the best correlation with actual warfarin doses. PMID:19324988

  10. Design and Testing of Braided Composite Fan Case Materials and Components

    NASA Technical Reports Server (NTRS)

    Roberts, Gary D.; Pereira, J. Michael; Braley, Michael S.; Arnold, William a.; Dorer, James D.; Watson, William R/.

    2009-01-01

    Triaxial braid composite materials are beginning to be used in fan cases for commercial gas turbine engines. The primary benefit for the use of composite materials is reduced weight and the associated reduction in fuel consumption. However, there are also cost benefits in some applications. This paper presents a description of the braided composite materials and discusses aspects of the braiding process that can be utilized for efficient fabrication of composite cases. The paper also presents an approach that was developed for evaluating the braided composite materials and composite fan cases in a ballistic impact laboratory. Impact of composite panels with a soft projectile is used for materials evaluation. Impact of composite fan cases with fan blades or blade-like projectiles is used to evaluate containment capability. A post-impact structural load test is used to evaluate the capability of the impacted fan case to survive dynamic loads during engine spool down. Validation of these new test methods is demonstrated by comparison with results of engine blade-out tests.

  11. Symptom validity testing in memory clinics: Hippocampal-memory associations and relevance for diagnosing mild cognitive impairment.

    PubMed

    Rienstra, Anne; Groot, Paul F C; Spaan, Pauline E J; Majoie, Charles B L M; Nederveen, Aart J; Walstra, Gerard J M; de Jonghe, Jos F M; van Gool, Willem A; Olabarriaga, Silvia D; Korkhov, Vladimir V; Schmand, Ben

    2013-01-01

    Patients with mild cognitive impairment (MCI) do not always convert to dementia. In such cases, abnormal neuropsychological test results may not validly reflect cognitive symptoms due to brain disease, and the usual brain-behavior relationships may be absent. This study examined symptom validity in a memory clinic sample and its effect on the associations between hippocampal volume and memory performance. Eleven of 170 consecutive patients (6.5%; 13% of patients younger than 65 years) referred to memory clinics showed noncredible performance on symptom validity tests (SVTs, viz. Word Memory Test and Test of Memory Malingering). They were compared to a demographically matched group (n = 57) selected from the remaining patients. Hippocampal volume, measured by an automated volumetric method (Freesurfer), was correlated with scores on six verbal memory tests. The median correlation was r = .49 in the matched group. However, the relation was absent (median r = -.11) in patients who failed SVTs. Memory clinic samples may include patients who show noncredible performance, which invalidates their MCI diagnosis. This underscores the importance of applying SVTs in evaluating patients with cognitive complaints that may signify a predementia stage, especially when these patients are relatively young.

  12. The Need, Development, and Validation of the Innovation Test Instrument

    ERIC Educational Resources Information Center

    Wheadon, Jacob; Wright, Geoff A.; West, Richard E.; Skaggs, Paul

    2017-01-01

    This study discusses the need, development, and validation of the Innovation Test Instrument (ITI). This article outlines how the researchers identified the content domain of the assessment and created test items. Then, it describes initial validation testing of the instrument. The findings suggest that the ITI is a good first step in creating an…

  13. Validation of the Simple Shoulder Test in a Portuguese-Brazilian population. Is the latent variable structure and validation of the Simple Shoulder Test Stable across cultures?

    PubMed

    Neto, Jose Osni Bruggemann; Gesser, Rafael Lehmkuhl; Steglich, Valdir; Bonilauri Ferreira, Ana Paula; Gandhi, Mihir; Vissoci, João Ricardo Nickenig; Pietrobon, Ricardo

    2013-01-01

    The validation of widely used scales facilitates the comparison across international patient samples. The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. The Simple Shoulder Test was translated from English into Brazilian Portuguese, translated back into English, and evaluated for accuracy by an expert committee. It was then administered to 100 patients with shoulder conditions. Psychometric properties were analyzed including factor analysis, internal reliability, test-retest reliability at seven days, and construct validity in relation to the Short Form 36 health survey (SF-36). Factor analysis demonstrated a three factor solution. Cronbach's alpha was 0.82. Test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.84. Associations were observed in the hypothesized direction with all subscales of SF-36 questionnaire. The Simple Shoulder Test translation and cultural adaptation to Brazilian-Portuguese demonstrated adequate factor structure, internal reliability, and validity, ultimately allowing for its use in the comparison with international patient samples.

  14. Validation of the Simple Shoulder Test in a Portuguese-Brazilian Population. Is the Latent Variable Structure and Validation of the Simple Shoulder Test Stable across Cultures?

    PubMed Central

    Neto, Jose Osni Bruggemann; Gesser, Rafael Lehmkuhl; Steglich, Valdir; Bonilauri Ferreira, Ana Paula; Gandhi, Mihir; Vissoci, João Ricardo Nickenig; Pietrobon, Ricardo

    2013-01-01

    Background The validation of widely used scales facilitates the comparison across international patient samples. The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Objective The objective of this study was to translate, culturally adapt and validate the Simple Shoulder Test into Brazilian Portuguese. Also we test the stability of factor analysis across different cultures. Methods The Simple Shoulder Test was translated from English into Brazilian Portuguese, translated back into English, and evaluated for accuracy by an expert committee. It was then administered to 100 patients with shoulder conditions. Psychometric properties were analyzed including factor analysis, internal reliability, test-retest reliability at seven days, and construct validity in relation to the Short Form 36 health survey (SF-36). Results Factor analysis demonstrated a three factor solution. Cronbach’s alpha was 0.82. Test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.84. Associations were observed in the hypothesized direction with all subscales of SF-36 questionnaire. Conclusion The Simple Shoulder Test translation and cultural adaptation to Brazilian-Portuguese demonstrated adequate factor structure, internal reliability, and validity, ultimately allowing for its use in the comparison with international patient samples. PMID:23675436

  15. Construction of Valid and Reliable Test for Assessment of Students

    ERIC Educational Resources Information Center

    Osadebe, P. U.

    2015-01-01

    The study was carried out to construct a valid and reliable test in Economics for secondary school students. Two research questions were drawn to guide the establishment of validity and reliability for the Economics Achievement Test (EAT). It is a multiple choice objective test of five options with 100 items. A sample of 1000 students was randomly…

  16. Validity and Reliability of the Arabic Token Test for Children

    ERIC Educational Resources Information Center

    Alkhamra, Rana A.; Al-Jazi, Aya B.

    2016-01-01

    Background: The Token Test for Children (2nd edition) (TTFC) is a measure for assessing receptive language. In this study we describe the translation process, validity and reliability of the Arabic Token Test for Children (A-TTFC). Aims: The aim of this study is to translate, validate and establish the reliability of the Arabic Token Test for…

  17. Conceptualizing Essay Tests' Reliability and Validity: From Research to Theory

    ERIC Educational Resources Information Center

    Badjadi, Nour El Imane

    2013-01-01

    The current paper on writing assessment surveys the literature on the reliability and validity of essay tests. The paper aims to examine the two concepts in relationship with essay testing as well as to provide a snapshot of the current understandings of the reliability and validity of essay tests as drawn in recent research studies. Bearing in…

  18. 10 CFR 26.139 - Reporting initial validity and drug test results.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 10 Energy 1 2011-01-01 2011-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...

  19. 10 CFR 26.139 - Reporting initial validity and drug test results.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 10 Energy 1 2010-01-01 2010-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...

  20. 10 CFR 26.139 - Reporting initial validity and drug test results.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 10 Energy 1 2013-01-01 2013-01-01 false Reporting initial validity and drug test results. 26.139 Section 26.139 Energy NUCLEAR REGULATORY COMMISSION FITNESS FOR DUTY PROGRAMS Licensee Testing Facilities § 26.139 Reporting initial validity and drug test results. (a) The licensee testing facility shall...

  1. Eye-Tracking as a Tool in Process-Oriented Reading Test Validation

    ERIC Educational Resources Information Center

    Solheim, Oddny Judith; Uppstad, Per Henning

    2011-01-01

    The present paper addresses the continuous need for methodological reflection on how to validate inferences made on the basis of test scores. Validation is a process that requires many lines of evidence. In this article we discuss the potential of eye tracking methodology in process-oriented reading test validation. Methodological considerations…

  2. The validation of Huffaz Intelligence Test (HIT)

    NASA Astrophysics Data System (ADS)

    Rahim, Mohd Azrin Mohammad; Ahmad, Tahir; Awang, Siti Rahmah; Safar, Ajmain

    2017-08-01

    In general, a hafiz who can memorize the Quran has many specialties especially in respect to their academic performances. In this study, the theory of multiple intelligences introduced by Howard Gardner is embedded in a developed psychometric instrument, namely Huffaz Intelligence Test (HIT). This paper presents the validation and the reliability of HIT of some tahfiz students in Malaysia Islamic schools. A pilot study was conducted involving 87 huffaz who were randomly selected to answer the items in HIT. The analysis method used includes Partial Least Square (PLS) on reliability, convergence and discriminant validation. The study has validated nine intelligences. The findings also indicated that the composite reliabilities for the nine types of intelligences are greater than 0.8. Thus, the HIT is a valid and reliable instrument to measure the multiple intelligences among huffaz.

  3. Veggie and the VEG-01 Hardware Validation Test

    NASA Technical Reports Server (NTRS)

    Massa, Gioia; wheeler, Ray; Smith, Trent

    2015-01-01

    This presentation presents a brief overview of KSC plant science hardware for space and then details the Veggie hardware and the VEG-01 hardware validation test. The test results and future plans are discussed.

  4. TESTING BALANCE AND FALL RISK IN PERSONS WITH PARKINSON DISEASE, AN ARGUMENT FOR ECOLOGICALLY VALID TESTING

    PubMed Central

    Foreman, K. Bo; Addison, Odessa; Kim, Han S.; Dibble, Leland E.

    2010-01-01

    Introduction Despite clear deficits in postural control, most clinical examination tools lack accuracy in identifying persons with Parkinson disease (PD) who have fallen or are at risk for falls. We assert that this is in part due to the lack of ecological validity of the testing. Methods To test this assertion, we examined the responsiveness and predictive validity of the Functional Gait Assessment (FGA), the Pull test, and the Timed up and Go (TUG) during clinically defined ON and OFF medication states. To address responsiveness, ON/OFF medication performance was compared. To address predictive validity, areas under the curve (AUC) of receiver operating characteristic (ROC) curves were compared. Comparisons were made using separate non-parametric tests. Results Thirty-six persons (24 male, 12 female) with PD (22 fallers, 14 non-fallers) participated. Only the FGA was able to detect differences between fallers and non-fallers for both ON/OFF medication testing. The predictive validity of the FGA and the TUG for fall identification was higher during OFF medication compared to ON medication testing. The predictive validity of the FGA was higher than the TUG and the Pull test during ON and OFF medication testing. Discussion In order to most accurately identify fallers, clinicians should test persons with PD in ecologically relevant conditions and tasks. In this study, interpretation of the OFF medication performance and use of the FGA provided more accurate prediction of those who would fall. PMID:21215674

  5. Face Validity of Test and Acceptance of Generalized Personality Interpretations

    ERIC Educational Resources Information Center

    Delprato, Dennis J.

    1975-01-01

    The degree to which variations in the face validity of psychological tests affected students' willingness to accept personality interpretations was studied. Acceptance of personality interpretations was compared for four types of tests which varied in face validity. The relationship between judged accuracy and rated likability of the…

  6. Does Test Preparation Work? Implications for Score Validity

    ERIC Educational Resources Information Center

    Xie, Qin

    2013-01-01

    This article reports an empirical study that examined the pattern of test preparation for College English Test Band 4 (CET4) and the differential effects of test preparation practices on its scores, thereby drawing implications for CET4 score validity. Data collection involved 1,003 test takers of CET4. A pretest was administered at the beginning…

  7. Independent validation of the MMPI-2-RF Somatic/Cognitive and Validity scales in TBI Litigants tested for effort.

    PubMed

    Youngjohn, James R; Wershba, Rebecca; Stevenson, Matthew; Sturgeon, John; Thomas, Michael L

    2011-04-01

    The MMPI-2 Restructured Form (MMPI-2-RF; Ben-Porath & Tellegen, 2008) is replacing the MMPI-2 as the most widely used personality test in neuropsychological assessment, but additional validation studies are needed. Our study examines MMPI-2-RF Validity scales and the newly created Somatic/Cognitive scales in a recently reported sample of 82 traumatic brain injury (TBI) litigants who either passed or failed effort tests (Thomas & Youngjohn, 2009). The restructured Validity scales FBS-r (restructured symptom validity), F-r (restructured infrequent responses), and the newly created Fs (infrequent somatic responses) were not significant predictors of TBI severity. FBS-r was significantly related to passing or failing effort tests, and Fs and F-r showed non-significant trends in the same direction. Elevations on the Somatic/Cognitive scales profile (MLS-malaise, GIC-gastrointestinal complaints, HPC-head pain complaints, NUC-neurological complaints, and COG-cognitive complaints) were significant predictors of effort test failure. Additionally, HPC had the anticipated paradoxical inverse relationship with head injury severity. The Somatic/Cognitive scales as a group were better predictors of effort test failure than the RF Validity scales, which was an unexpected finding. MLS arose as the single best predictor of effort test failure of all RF Validity and Somatic/Cognitive scales. Item overlap analysis revealed that all MLS items are included in the original MMPI-2 Hy scale, making MLS essentially a subscale of Hy. This study validates the MMPI-2-RF as an effective tool for use in neuropsychological assessment of TBI litigants.

  8. Specifications for a coupled neutronics thermal-hydraulics SFR test case

    NASA Astrophysics Data System (ADS)

    Tassone, A.; Smirnov, A. D.; Tikhomirov, G. V.

    2017-01-01

    Coupling neutronics/thermal-hydraulics calculations for the design of nuclear reactors are a growing trend in the scientific community. This approach allows to properly represent the mutual feedbacks between the neutronic distribution and the thermal-hydraulics properties of the materials composing the reactor, details which are often lost when separate analysis are performed. In this work, a test case for a generation IV sodium-cooled fast reactor (SFR), based on the ASTRID concept developed by CEA, is proposed. Two sub-assemblies (SA) characterized by different fuel enrichment and layout are considered. Specifications for the test case are provided including geometrical data, material compositions, thermo-physical properties and coupling scheme details. Serpent and ANSYS-CFX are used as reference in the description of suitable inputs for the performing of the benchmark, but the use of other code combinations for the purpose of validation of the results is encouraged. The expected outcome of the test case are the axial distribution of volumetric power generation term (q‴), density and temperature for the fuel, the cladding and the coolant.

  9. Validation of Physics Standardized Test Items

    NASA Astrophysics Data System (ADS)

    Marshall, Jill

    2008-10-01

    The Texas Physics Assessment Team (TPAT) examined the Texas Assessment of Knowledge and Skills (TAKS) to determine whether it is a valid indicator of physics preparation for future course work and employment, and of the knowledge and skills needed to act as an informed citizen in a technological society. We categorized science items from the 2003 and 2004 10th and 11th grade TAKS by content area(s) covered, knowledge and skills required to select the correct answer, and overall quality. We also analyzed a 5000 student sample of item-level results from the 2004 11th grade exam using standard statistical methods employed by test developers (factor analysis and Item Response Theory). Triangulation of our results revealed strengths and weaknesses of the different methods of analysis. The TAKS was found to be only weakly indicative of physics preparation and we make recommendations for increasing the validity of standardized physics testing..

  10. Validation of antibiotic residue tests for dairy goats.

    PubMed

    Zeng, S S; Hart, S; Escobar, E N; Tesfai, K

    1998-03-01

    The SNAP test, LacTek test (B-L and CEF), Charm Bacillus sterothermophilus var. calidolactis disk assay (BsDA), and Charm II Tablet Beta-lactam sequential test were validated using antibiotic-fortified and -incurred goat milk following the protocol for test kit validations of the U.S. Food and Drug Administration Center for Veterinary Medicine. SNAP, Charm BsDA, and Charm II Tablet Sequential tests were sensitive and reliable in detecting antibiotic residues in goat milk. All three assays showed greater than 90% sensitivity and specificity at tolerance and detection levels. However, caution should be taken in interpreting test results at detection levels. Because of the high sensitivity of these three tests, false-violative results could be obtained in goat milk containing antibiotic residues below the tolerance level. Goat milk testing positive by these tests must be confirmed using a more sophisticated methodology, such as high-performance liquid chromatography, before the milk is condemned. LacTek B-L test did not detect several antibiotics, including penicillin G, in goat milk at tolerance levels. However, LacTek CEF was excellent in detecting ceftiofur residue in goat milk.

  11. The Validity of IQ Scores Derived from Readiness Screening Tests

    ERIC Educational Resources Information Center

    Telegdy, Gabriel A.

    1976-01-01

    The Screening Test of Academic Readiness (STAR) and the Peabody Picture Vocabulary Test (PPVT) were administered to 52 kindergarten children to reveal the convergent validity of IQ scores derived from the STAR. The findings raise doubts about the validity of the deviation IQs derived from the STAR. (Author)

  12. An exploratory study into the effect of time-restricted internet access on face-validity, construct validity and reliability of postgraduate knowledge progress testing

    PubMed Central

    2013-01-01

    Background Yearly formative knowledge testing (also known as progress testing) was shown to have a limited construct-validity and reliability in postgraduate medical education. One way to improve construct-validity and reliability is to improve the authenticity of a test. As easily accessible internet has become inseparably linked to daily clinical practice, we hypothesized that allowing internet access for a limited amount of time during the progress test would improve the perception of authenticity (face-validity) of the test, which would in turn improve the construct-validity and reliability of postgraduate progress testing. Methods Postgraduate trainees taking the yearly knowledge progress test were asked to participate in a study where they could access the internet for 30 minutes at the end of a traditional pen and paper test. Before and after the test they were asked to complete a short questionnaire regarding the face-validity of the test. Results Mean test scores increased significantly for all training years. Trainees indicated that the face-validity of the test improved with internet access and that they would like to continue to have internet access during future testing. Internet access did not improve the construct-validity or reliability of the test. Conclusion Improving the face-validity of postgraduate progress testing, by adding the possibility to search the internet for a limited amount of time, positively influences test performance and face-validity. However, it did not change the reliability or the construct-validity of the test. PMID:24195696

  13. Validation of Helicopter Gear Condition Indicators Using Seeded Fault Tests

    NASA Technical Reports Server (NTRS)

    Dempsey, Paula; Brandon, E. Bruce

    2013-01-01

    A "seeded fault test" in support of a rotorcraft condition based maintenance program (CBM), is an experiment in which a component is tested with a known fault while health monitoring data is collected. These tests are performed at operating conditions comparable to operating conditions the component would be exposed to while installed on the aircraft. Performance of seeded fault tests is one method used to provide evidence that a Health Usage Monitoring System (HUMS) can replace current maintenance practices required for aircraft airworthiness. Actual in-service experience of the HUMS detecting a component fault is another validation method. This paper will discuss a hybrid validation approach that combines in service-data with seeded fault tests. For this approach, existing in-service HUMS flight data from a naturally occurring component fault will be used to define a component seeded fault test. An example, using spiral bevel gears as the targeted component, will be presented. Since the U.S. Army has begun to develop standards for using seeded fault tests for HUMS validation, the hybrid approach will be mapped to the steps defined within their Aeronautical Design Standard Handbook for CBM. This paper will step through their defined processes, and identify additional steps that may be required when using component test rig fault tests to demonstrate helicopter CI performance. The discussion within this paper will provide the reader with a better appreciation for the challenges faced when defining a seeded fault test for HUMS validation.

  14. A Historical Overview on the Concept of Validity in Language Testing

    ERIC Educational Resources Information Center

    Hamavandy, Mehraban; Kiany, Gholam Reza

    2014-01-01

    This article provides an overview on language test validation theories, especially the Messickian view on construct validity and the way it's been translated into practice. First, a brief historical synopsis will be set forth, followed by recent views on test validity as advanced by Messick and Kane. The review goes on to lay out the similarities…

  15. Shifting the Focus of Validity for Test Use

    ERIC Educational Resources Information Center

    Moss, Pamela A.

    2016-01-01

    The conventional focus of validity in educational measurement has been on intended interpretations and uses of test scores. Empirical studies of test use by teachers, administrators and policy-makers show that actual interpretations and uses of test scores in context are invariably shaped by local users' questions, which frequently require…

  16. The validity of three tests of temperament in guppies (Poecilia reticulata).

    PubMed

    Burns, James G

    2008-11-01

    Differences in temperament (consistent differences among individuals in behavior) can have important effects on fitness-related activities such as dispersal and competition. However, evolutionary ecologists have put limited effort into validating their tests of temperament. This article attempts to validate three standard tests of temperament in guppies: the open-field test, emergence test, and novel-object test. Through multiple reliability trials, and comparison of results between different types of test, this study establishes the confidence that can be placed in these temperament tests. The open-field test is shown to be a good test of boldness and exploratory behavior; the open-field test was reliable when tested in multiple ways. There were problems with the emergence test and novel-object test, which leads one to conclude that the protocols used in this study should not be considered valid tests for this species. (PsycINFO Database Record (c) 2008 APA, all rights reserved).

  17. Assessing Discriminative Performance at External Validation of Clinical Prediction Models

    PubMed Central

    Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W.

    2016-01-01

    Introduction External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. Methods We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. Results The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. Conclusion The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect

  18. Assessing Discriminative Performance at External Validation of Clinical Prediction Models.

    PubMed

    Nieboer, Daan; van der Ploeg, Tjeerd; Steyerberg, Ewout W

    2016-01-01

    External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting. We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated them in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury. The permutation test indicated that the validation and development set were homogenous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2. The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.

  19. The Selective Arterial Calcium Injection Test is a Valid Diagnostic Method for Invisible Gastrinoma with Duodenal Ulcer Stenosis: A Case Report.

    PubMed

    Okada, Kenjiro; Sudo, Takeshi; Miyamoto, Katsunari; Yokoyama, Yujiro; Sakashita, Yoshihiro; Hashimoto, Yasushi; Kobayashi, Hironori; Otsuka, Hiroyuki; Sakoda, Takuya; Shimamoto, Fumio

    2016-03-01

    The localization and diagnosis of microgastrinomas in a patient with multiple endocrine neoplasia type 1 is difficult preoperatively. The selective arterial calcium injection (SACI) test is a valid diagnostic method for the preoperative diagnosis of these invisible microgastrinomas. We report a rare case of multiple invisible duodenal microgastrinomas with severe duodenal stenosis diagnosed preoperatively by using the SACI test. A 50-year-old man was admitted to our hospital with recurrent duodenal ulcers. His serum gastrin level was elevated to 730 pg/ml. It was impossible for gastrointestinal endoscopy to pass through to visualize the inferior part of the duodenum, because recurrent duodenal ulcers had resulted in severe duodenal stenosis. The duodenal stenosis also prevented additional endoscopic examinations such as endoscopic ultrasonography. Computed tomography did not show any tumors in the duodenum and pancreas. The SACI test provided the evidence for a gastrinoma in the vascular territory of the inferior pancreatic-duodenal artery. We diagnosed a gastrinoma in the peri- ampullary lesion, so we performed Subtotal Stomach-Preserving Pancreatico- duodenectomy with regional lymphadenectomy. Histopathological findings showed multiple duodenal gastrinomas with lymph node metastasis and nonfunctioning pancreatic neuroendocrine tumors. Twenty months after surgery, the patient is alive with no evidence of recurrence and a normal gastrin level. In conclusion, the SACI test can enhance the accuracy of preoperative localization and diagnosis of invisible microgastrinomas, especially in the setting of severe duodenal stenosis.

  20. The Air Force Officer Qualifying Test: Validity, Fairness, and Bias

    DTIC Science & Technology

    2010-01-01

    scores. The Standards for Educational and Psychological Testing (AERA, APA, and NCME, 1999) provides a set of guidelines published and endorsed by the...determining the validity and bias of selection tests falls upon professionals in the discipline of industrial/organizational psychology 20 See Roper v. Dep’t...i). 30 The Air Force Officer Qualifying Test : Validity, Fairness, and Bias and closely related fields (e.g., educational psychology and

  1. Automated Generation and Assessment of Autonomous Systems Test Cases

    NASA Technical Reports Server (NTRS)

    Barltrop, Kevin J.; Friberg, Kenneth H.; Horvath, Gregory A.

    2008-01-01

    This slide presentation reviews some of the issues concerning verification and validation testing of autonomous spacecraft routinely culminates in the exploration of anomalous or faulted mission-like scenarios using the work involved during the Dawn mission's tests as examples. Prioritizing which scenarios to develop usually comes down to focusing on the most vulnerable areas and ensuring the best return on investment of test time. Rules-of-thumb strategies often come into play, such as injecting applicable anomalies prior to, during, and after system state changes; or, creating cases that ensure good safety-net algorithm coverage. Although experience and judgment in test selection can lead to high levels of confidence about the majority of a system's autonomy, it's likely that important test cases are overlooked. One method to fill in potential test coverage gaps is to automatically generate and execute test cases using algorithms that ensure desirable properties about the coverage. For example, generate cases for all possible fault monitors, and across all state change boundaries. Of course, the scope of coverage is determined by the test environment capabilities, where a faster-than-real-time, high-fidelity, software-only simulation would allow the broadest coverage. Even real-time systems that can be replicated and run in parallel, and that have reliable set-up and operations features provide an excellent resource for automated testing. Making detailed predictions for the outcome of such tests can be difficult, and when algorithmic means are employed to produce hundreds or even thousands of cases, generating predicts individually is impractical, and generating predicts with tools requires executable models of the design and environment that themselves require a complete test program. Therefore, evaluating the results of large number of mission scenario tests poses special challenges. A good approach to address this problem is to automatically score the results

  2. Testing expert systems

    NASA Technical Reports Server (NTRS)

    Chang, C. L.; Stachowitz, R. A.

    1988-01-01

    Software quality is of primary concern in all large-scale expert system development efforts. Building appropriate validation and test tools for ensuring software reliability of expert systems is therefore required. The Expert Systems Validation Associate (EVA) is a validation system under development at the Lockheed Artificial Intelligence Center. EVA provides a wide range of validation and test tools to check correctness, consistency, and completeness of an expert system. Testing a major function of EVA. It means executing an expert system with test cases with the intent of finding errors. In this paper, we describe many different types of testing such as function-based testing, structure-based testing, and data-based testing. We describe how appropriate test cases may be selected in order to perform good and thorough testing of an expert system.

  3. How to test validity in orthodontic research: a mixed dentition analysis example.

    PubMed

    Donatelli, Richard E; Lee, Shin-Jae

    2015-02-01

    The data used to test the validity of a prediction method should be different from the data used to generate the prediction model. In this study, we explored whether an independent data set is mandatory for testing the validity of a new prediction method and how validity can be tested without independent new data. Several validation methods were compared in an example using the data from a mixed dentition analysis with a regression model. The validation errors of real mixed dentition analysis data and simulation data were analyzed for increasingly large data sets. The validation results of both the real and the simulation studies demonstrated that the leave-1-out cross-validation method had the smallest errors. The largest errors occurred in the traditional simple validation method. The differences between the validation methods diminished as the sample size increased. The leave-1-out cross-validation method seems to be an optimal validation method for improving the prediction accuracy in a data set with limited sample sizes. Copyright © 2015 American Association of Orthodontists. Published by Elsevier Inc. All rights reserved.

  4. ASTM Validates Air Pollution Test Methods

    ERIC Educational Resources Information Center

    Chemical and Engineering News, 1973

    1973-01-01

    The American Society for Testing and Materials (ASTM) has validated six basic methods for measuring pollutants in ambient air as the first part of its Project Threshold. Aim of the project is to establish nationwide consistency in measuring pollutants; determining precision, accuracy and reproducibility of 35 standard measuring methods. (BL)

  5. An Integrated Approach to Establish Validity and Reliability of Reading Tests

    ERIC Educational Resources Information Center

    Razi, Salim

    2012-01-01

    This study presents the processes of developing and establishing reliability and validity of a reading test by administering an integrative approach as conventional reliability and validity measures superficially reveals the difficulty of a reading test. In this respect, analysing vocabulary frequency of the test is regarded as a more eligible way…

  6. Case Study: Testing with Case Studies

    ERIC Educational Resources Information Center

    Herreid, Clyde Freeman

    2015-01-01

    This column provides original articles on innovations in case study teaching, assessment of the method, as well as case studies with teaching notes. This month's issue discusses using case studies to test for knowledge or lessons learned.

  7. The CPT Reading Comprehension Test: A Validity Study.

    ERIC Educational Resources Information Center

    Napoli, Anthony R.; Raymond, Lanette A.; Coffey, Cheryl A.; Bosco, Diane M.

    1998-01-01

    Describes a study done at Suffolk County Community College (New York) that assessed the validity of the College Board's Computerized Placement Test in Reading Comprehension (CPT-R) by comparing test results of 1,154 freshmen with the results of the Degree of Power Reading Test. Results confirmed the CPT-R's reliability in identifying basic…

  8. Validity and Reliability of a Medicine Ball Explosive Power Test.

    ERIC Educational Resources Information Center

    Stockbrugger, Barry A.; Haennel, Robert G.

    2001-01-01

    Evaluated the validity and reliability of a medicine ball throw test to evaluate explosive power. Data on competitive sand volleyball players who performed a medicine ball throw and a standard countermovement jump indicated that the medicine ball throw test was a valid and reliable way to assess explosive power for an analogous total-body movement…

  9. Is the Simple Shoulder Test a valid outcome instrument for shoulder arthroplasty?

    PubMed

    Hsu, Jason E; Russ, Stacy M; Somerson, Jeremy S; Tang, Anna; Warme, Winston J; Matsen, Frederick A

    2017-10-01

    The Simple Shoulder Test (SST) is a brief, inexpensive, and widely used patient-reported outcome tool, but it has not been rigorously evaluated for patients having shoulder arthroplasty. The goal of this study was to rigorously evaluate the validity of the SST for outcome assessment in shoulder arthroplasty using a systematic review of the literature and an analysis of its properties in a series of 408 surgical cases. SST scores, 36-Item Short Form Health Survey scores, and satisfaction scores were collected preoperatively and 2 years postoperatively. Responsiveness was assessed by comparing preoperative and 2-year postoperative scores. Criterion validity was determined by correlating the SST with the 36-Item Short Form Health Survey. Construct validity was tested through 5 clinical hypotheses regarding satisfaction, comorbidities, insurance status, previous failed surgery, and narcotic use. Scores after arthroplasty improved from 3.9 ± 2.8 to 10.2 ± 2.3 (P < .001). The change in SST correlated strongly with patient satisfaction (P < .001). The SST had large Cohen's d effect sizes and standardized response means. Criterion validity was supported by significant differences between satisfied and unsatisfied patients, those with more severe and less severe comorbidities, those with workers' compensation or Medicaid and other types of insurance, those with and without previous failed shoulder surgery, and those taking and those not taking narcotic pain medication before surgery (P < .005). These data combined with a systematic review of the literature demonstrate that the SST is a valid and responsive patient-reported outcome measure for assessing the outcomes of shoulder arthroplasty. Copyright © 2017 Journal of Shoulder and Elbow Surgery Board of Trustees. Published by Elsevier Inc. All rights reserved.

  10. A validated case definition for chronic rhinosinusitis in administrative data: a Canadian perspective.

    PubMed

    Rudmik, Luke; Xu, Yuan; Kukec, Edward; Liu, Mingfu; Dean, Stafford; Quan, Hude

    2016-11-01

    Pharmacoepidemiological research using administrative databases has become increasingly popular for chronic rhinosinusitis (CRS); however, without a validated case definition the cohort evaluated may be inaccurate resulting in biased and incorrect outcomes. The objective of this study was to develop and validate a generalizable administrative database case definition for CRS using International Classification of Diseases, 9th edition (ICD-9)-coded claims. A random sample of 100 patients with a guideline-based diagnosis of CRS and 100 control patients were selected and then linked to a Canadian physician claims database from March 31, 2010, to March 31, 2015. The proportion of CRS ICD-9-coded claims (473.x and 471.x) for each of these 200 patients were reviewed and the validity of 7 different ICD-9-based coding algorithms was evaluated. The CRS case definition of ≥2 claims with a CRS ICD-9 code (471.x or 473.x) within 2 years of the reference case provides a balanced validity with a sensitivity of 77% and specificity of 79%. Applying this CRS case definition to the claims database produced a CRS cohort of 51,000 patients with characteristics that were consistent with published demographics and rates of comorbid asthma, allergic rhinitis, and depression. This study has validated several coding algorithms; based on the results a case definition of ≥2 physician claims of CRS (ICD-9 of 471.x or 473.x) within 2 years provides an optimal level of validity. Future studies will need to validate this administrative case definition from different health system perspectives and using larger retrospective chart reviews from multiple providers. © 2016 ARS-AAOA, LLC.

  11. Solar Sail Models and Test Measurements Correspondence for Validation Requirements Definition

    NASA Technical Reports Server (NTRS)

    Ewing, Anthony; Adams, Charles

    2004-01-01

    Solar sails are being developed as a mission-enabling technology in support of future NASA science missions. Current efforts have advanced solar sail technology sufficient to justify a flight validation program. A primary objective of this activity is to test and validate solar sail models that are currently under development so that they may be used with confidence in future science mission development (e.g., scalable to larger sails). Both system and model validation requirements must be defined early in the program to guide design cycles and to ensure that relevant and sufficient test data will be obtained to conduct model validation to the level required. A process of model identification, model input/output documentation, model sensitivity analyses, and test measurement correspondence is required so that decisions can be made to satisfy validation requirements within program constraints.

  12. Construction and Evaluation of Reliability and Validity of Reasoning Ability Test

    ERIC Educational Resources Information Center

    Bhat, Mehraj A.

    2014-01-01

    This paper is based on the construction and evaluation of reliability and validity of reasoning ability test at secondary school students. In this paper an attempt was made to evaluate validity, reliability and to determine the appropriate standards to interpret the results of reasoning ability test. The test includes 45 items to measure six types…

  13. Test Anxiety and the Validity of Cognitive Tests: A Confirmatory Factor Analysis Perspective and Some Empirical Findings

    ERIC Educational Resources Information Center

    Wicherts, Jelte M.; Scholten, Annemarie Zand

    2010-01-01

    The validity of cognitive ability tests is often interpreted solely as a function of the cognitive abilities that these tests are supposed to measure, but other factors may be at play. The effects of test anxiety on the criterion related validity (CRV) of tests was the topic of a recent study by Reeve, Heggestad, and Lievens (2009) (Reeve, C. L.,…

  14. 40 CFR 610.24 - Validity of test data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 30 2011-07-01 2011-07-01 false Validity of test data. 610.24 Section 610.24 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY RETROFIT DEVICES Test Procedures and Evaluation Criteria Evaluation Criteria for the Preliminary...

  15. 40 CFR 610.24 - Validity of test data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 31 2012-07-01 2012-07-01 false Validity of test data. 610.24 Section 610.24 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY RETROFIT DEVICES Test Procedures and Evaluation Criteria Evaluation Criteria for the Preliminary...

  16. 40 CFR 610.24 - Validity of test data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 30 2014-07-01 2014-07-01 false Validity of test data. 610.24 Section 610.24 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY RETROFIT DEVICES Test Procedures and Evaluation Criteria Evaluation Criteria for the Preliminary...

  17. 40 CFR 610.24 - Validity of test data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 31 2013-07-01 2013-07-01 false Validity of test data. 610.24 Section 610.24 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY RETROFIT DEVICES Test Procedures and Evaluation Criteria Evaluation Criteria for the Preliminary...

  18. 40 CFR 610.24 - Validity of test data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 29 2010-07-01 2010-07-01 false Validity of test data. 610.24 Section 610.24 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) ENERGY POLICY FUEL ECONOMY RETROFIT DEVICES Test Procedures and Evaluation Criteria Evaluation Criteria for the Preliminary...

  19. Test validity and performance validity: considerations in providing a framework for development of an ability-focused neuropsychological test battery.

    PubMed

    Larrabee, Glenn J

    2014-11-01

    Literature on test validity and performance validity is reviewed to propose a framework for specification of an ability-focused battery (AFB). Factor analysis supports six domains of ability: first, verbal symbolic; secondly, visuoperceptual and visuospatial judgment and problem solving; thirdly, sensorimotor skills; fourthly, attention/working memory; fifthly, processing speed; finally, learning and memory (which can be divided into verbal and visual subdomains). The AFB should include at least three measures for each of the six domains, selected based on various criteria for validity including sensitivity to presence of disorder, sensitivity to severity of disorder, correlation with important activities of daily living, and containing embedded/derived measures of performance validity. Criterion groups should include moderate and severe traumatic brain injury, and Alzheimer's disease. Validation groups should also include patients with left and right hemisphere stroke, to determine measures sensitive to lateralized cognitive impairment and so that the moderating effects of auditory comprehension impairment and neglect can be analyzed on AFB measures. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  20. The Chinese version of the Child and Adolescent Scale of Environment (CASE-C): validity and reliability for children with disabilities in Taiwan.

    PubMed

    Kang, Lin-Ju; Yen, Chia-Feng; Bedell, Gary; Simeonsson, Rune J; Liou, Tsan-Hon; Chi, Wen-Chou; Liu, Shu-Wen; Liao, Hua-Fang; Hwang, Ai-Wen

    2015-03-01

    Measurement of children's participation and environmental factors is a key component of the assessment in the new Disability Evaluation System (DES) in Taiwan. The Child and Adolescent Scale of Environment (CASE) was translated into Traditional Chinese (CASE-C) and used for assessing environmental factors affecting the participation of children and youth with disabilities in the DES. The aim of this study was to validate the CASE-C. Participants were 614 children and youth aged 6.0-17.9 years with disabilities, with the largest condition group comprised of children with intellectual disability (61%). Internal structure, internal consistency, test-retest reliability, convergent validity, and discriminant (known group) validity were examined using exploratory factor analyses, Cronbach's α coefficient, intra-class correlation coefficients (ICC), correlation analyses, and univariate ANOVAs. A three-factor structure (Family/Community Resources, Assistance/Attitude Supports, and Physical Design Access) of the CASE-C was produced with 38% variance explained. The CASE-C had adequate internal consistency (Cronbach's α=.74-.86) and test-retest reliability (ICCs=.73-.90). Children and youth with disabilities who had higher levels of severity of impairment encountered more environmental barriers and those experiencing more environmental problems also had greater restrictions in participation. The CASE-C scores were found to distinguish children on the basis of disability condition and impairment severity, but not on the basis of age or sex. The CASE-C is valid for assessing environmental problems experienced by children and youth with disabilities in Taiwan. Copyright © 2014 Elsevier Ltd. All rights reserved.

  1. Validation through Understanding Test-Taking Strategies: An Illustration With the CELPIP-General Reading Pilot Test Using Structural Equation Modeling

    ERIC Educational Resources Information Center

    Wu, Amery D.; Stone, Jake E.

    2016-01-01

    This article explores an approach for test score validation that examines test takers' strategies for taking a reading comprehension test. The authors formulated three working hypotheses about score validity pertaining to three types of test-taking strategy (comprehending meaning, test management, and test-wiseness). These hypotheses were…

  2. The Anomalous Sentences Repetition Test: Replication and Validation Study.

    ERIC Educational Resources Information Center

    Weeks, David J.

    1986-01-01

    Presents a brief clinical test, derived from earlier neuropsychological instruments, with evidence for its reliability, interscorer agreement, and validity. The latter is based upon correlations with both CAT scan measures of cortical atrophy and ventricular enlargement, as well as correlations with seven other previously validated cognitive…

  3. Reliability and Validity of the Inline Skating Skill Test

    PubMed Central

    Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje

    2016-01-01

    This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8–2.6%] – 2.2% [95% CI: 0.0–4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2–2.4%] – 2.7% [95% CI: 2.1–4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92–0.99] – 0.99 [95% CI: 0.98–1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters’ performances. Competitive-level skaters needed shorter time (24.4–26.4%, all p < 0.01) to complete the test in comparison to recreational-level skaters. Moreover, moderate correlation (ρ = 0.80–0.82; all p < 0.01) was observed between the participant’s self-rating and achieved performance times. In conclusion, the proposed test is a reliable and valid method to evaluate inline skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters. Key points Study evaluated the reliability and construct validity of a newly developed inline skating skill test. Evaluated test is a first protocol designed to assess specific inline skating skill. Two groups of amateur skaters with

  4. Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review.

    PubMed

    Greher, Michael R; Wodushek, Thomas R

    2017-03-01

    Performance validity testing refers to neuropsychologists' methodology for determining whether neuropsychological test performances completed in the course of an evaluation are valid (ie, the results of true neurocognitive function) or invalid (ie, overly impacted by the patient's effort/engagement in testing). This determination relies upon the use of either standalone tests designed for this sole purpose, or specific scores/indicators embedded within traditional neuropsychological measures that have demonstrated this utility. In response to a greater appreciation for the critical role that performance validity issues play in neuropsychological testing and the need to measure this variable to the best of our ability, the scientific base for performance validity testing has expanded greatly over the last 20 to 30 years. As such, the majority of current day neuropsychologists in the United States use a variety of measures for the purpose of performance validity testing as part of everyday forensic and clinical practice and address this issue directly in their evaluations. The following is the first article of a 2-part series that will address the evolution of performance validity testing in the field of neuropsychology, both in terms of the science as well as the clinical application of this measurement technique. The second article of this series will review performance validity tests in terms of methods for development of these measures, and maximizing of diagnostic accuracy.

  5. Validity of the Eating Attitude Test among Exercisers.

    PubMed

    Lane, Helen J; Lane, Andrew M; Matheson, Hilary

    2004-12-01

    Theory testing and construct measurement are inextricably linked. To date, no published research has looked at the factorial validity of an existing eating attitude inventory for use with exercisers. The Eating Attitude Test (EAT) is a 26-item measure that yields a single index of disordered eating attitudes. The original factor analysis showed three interrelated factors: Dieting behavior (13-items), oral control (7-items), and bulimia nervosa-food preoccupation (6-items). The primary purpose of the study was to examine the factorial validity of the EAT among a sample of exercisers. The second purpose was to investigate relationships between eating attitudes scores and selected psychological constructs. In stage one, 598 regular exercisers completed the EAT. Confirmatory factor analysis (CFA) was used to test the single-factor, a three-factor model, and a four-factor model, which distinguished bulimia from food pre-occupation. CFA of the single-factor model (RCFI = 0.66, RMSEA = 0.10), the three-factor-model (RCFI = 0.74; RMSEA = 0.09) showed poor model fit. There was marginal fit for the 4-factor model (RCFI = 0.91, RMSEA = 0.06). Results indicated five-items showed poor factor loadings. After these 5-items were discarded, the three models were re-analyzed. CFA results indicated that the single-factor model (RCFI = 0.76, RMSEA = 0.10) and three-factor model (RCFI = 0.82, RMSEA = 0.08) showed poor fit. CFA results for the four-factor model showed acceptable fit indices (RCFI = 0.98, RMSEA = 0.06). Stage two explored relationships between EAT scores, mood, self-esteem, and motivational indices toward exercise in terms of self-determination, enjoyment and competence. Correlation results indicated that depressed mood scores positively correlated with bulimia and dieting scores. Further, dieting was inversely related with self-determination toward exercising. Collectively, findings suggest that a 21-item four-factor model shows promising validity coefficients among

  6. K(3)EDTA Vacuum Tubes Validation for Routine Hematological Testing.

    PubMed

    Lima-Oliveira, Gabriel; Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Poli, Giovanni; Solero, Giovanni Pietro; Picheth, Geraldo; Guidi, Gian Cesare

    2012-01-01

    Background and Objective. Some in vitro diagnostic devices (e.g, blood collection vacuum tubes and syringes for blood analyses) are not validated before the quality laboratory managers decide to start using or to change the brand. Frequently, the laboratory or hospital managers select the vacuum tubes for blood collection based on cost considerations or on relevance of a brand. The aim of this study was to validate two dry K(3)EDTA vacuum tubes of different brands for routine hematological testing. Methods. Blood specimens from 100 volunteers in two different K(3)EDTA vacuum tubes were collected by a single, expert phlebotomist. The routine hematological testing was done on Advia 2120i hematology system. The significance of the differences between samples was assessed by paired Student's t-test after checking for normality. The level of statistical significance was set at P < 0.05. Results and Conclusions. Different brand's tubes evaluated can represent a clinically relevant source of variations only on mean platelet volume (MPV) and platelet distribution width (PDW). Basically, our validation will permit the laboratory or hospital managers to select the brand's vacuum tubes validated according to him/her technical or economical reasons for routine hematological tests.

  7. Commentary on "Validating the Interpretations and Uses of Test Scores"

    ERIC Educational Resources Information Center

    Brennan, Robert L.

    2013-01-01

    Kane's paper "Validating the Interpretations and Uses of Test Scores" is the most complete and clearest discussion yet available of the argument-based approach to validation. At its most basic level, validation as formulated by Kane is fundamentally a simply-stated two-step enterprise: (1) specify the claims inherent in a particular interpretation…

  8. Testing-Based Compiler Validation for Synchronous Languages

    NASA Technical Reports Server (NTRS)

    Garoche, Pierre-Loic; Howar, Falk; Kahsai, Temesghen; Thirioux, Xavier

    2014-01-01

    In this paper we present a novel lightweight approach to validate compilers for synchronous languages. Instead of verifying a compiler for all input programs or providing a fixed suite of regression tests, we extend the compiler to generate a test-suite with high behavioral coverage and geared towards discovery of faults for every compiled artifact. We have implemented and evaluated our approach using a compiler from Lustre to C.

  9. Validation of the Information/Communications Technology Literacy Test

    DTIC Science & Technology

    2016-10-01

    nested set. Table 11 presents the results of incremental validity analyses for job knowledge/performance criteria by MOS. Figure 7 presents much...Systems Operator-Analyst (25B) and Nodal Network Systems Operator-Maintainer (25N) MOS. This report documents technical procedures and results of the...research effort. Results suggest that the ICTL test has potential as a valid and highly efficient predictor of valued outcomes in Signal school MOS. Not

  10. Validation of Metagenomic Next-Generation Sequencing Tests for Universal Pathogen Detection.

    PubMed

    Schlaberg, Robert; Chiu, Charles Y; Miller, Steve; Procop, Gary W; Weinstock, George

    2017-06-01

    - Metagenomic sequencing can be used for detection of any pathogens using unbiased, shotgun next-generation sequencing (NGS), without the need for sequence-specific amplification. Proof-of-concept has been demonstrated in infectious disease outbreaks of unknown causes and in patients with suspected infections but negative results for conventional tests. Metagenomic NGS tests hold great promise to improve infectious disease diagnostics, especially in immunocompromised and critically ill patients. - To discuss challenges and provide example solutions for validating metagenomic pathogen detection tests in clinical laboratories. A summary of current regulatory requirements, largely based on prior guidance for NGS testing in constitutional genetics and oncology, is provided. - Examples from 2 separate validation studies are provided for steps from assay design, and validation of wet bench and bioinformatics protocols, to quality control and assurance. - Although laboratory and data analysis workflows are still complex, metagenomic NGS tests for infectious diseases are increasingly being validated in clinical laboratories. Many parallels exist to NGS tests in other fields. Nevertheless, specimen preparation, rapidly evolving data analysis algorithms, and incomplete reference sequence databases are idiosyncratic to the field of microbiology and often overlooked.

  11. Development and Validation of a Test for Bulimia.

    ERIC Educational Resources Information Center

    Smith, Marcia C.; Thelen, Mark H.

    1984-01-01

    Developed the Bulimia Test (BULIT) based on responses of clinically identified females (N=18) and normal female college students (N=119) to preliminary test items. Results showed that the BULIT provided an objective, reliable, and valid measure by which to identify individuals with symptoms of bulimia. (Instrument is appended.) (LLL)

  12. FAST Model Calibration and Validation of the OC5-DeepCwind Floating Offshore Wind System Against Wave Tank Test Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendt, Fabian F; Robertson, Amy N; Jonkman, Jason

    During the course of the Offshore Code Comparison Collaboration, Continued, with Correlation (OC5) project, which focused on the validation of numerical methods through comparison against tank test data, the authors created a numerical FAST model of the 1:50-scale DeepCwind semisubmersible system that was tested at the Maritime Research Institute Netherlands ocean basin in 2013. This paper discusses several model calibration studies that were conducted to identify model adjustments that improve the agreement between the numerical simulations and the experimental test data. These calibration studies cover wind-field-specific parameters (coherence, turbulence), hydrodynamic and aerodynamic modeling approaches, as well as rotor model (blade-pitchmore » and blade-mass imbalances) and tower model (structural tower damping coefficient) adjustments. These calibration studies were conducted based on relatively simple calibration load cases (wave only/wind only). The agreement between the final FAST model and experimental measurements is then assessed based on more-complex combined wind and wave validation cases.« less

  13. Strategies for Validation Testing of Ground Systems

    NASA Technical Reports Server (NTRS)

    Annis, Tammy; Sowards, Stephanie

    2009-01-01

    In order to accomplish the full Vision for Space Exploration announced by former President George W. Bush in 2004, NASA will have to develop a new space transportation system and supporting infrastructure. The main portion of this supporting infrastructure will reside at the Kennedy Space Center (KSC) in Florida and will either be newly developed or a modification of existing vehicle processing and launch facilities, including Ground Support Equipment (GSE). This type of large-scale launch site development is unprecedented since the time of the Apollo Program. In order to accomplish this successfully within the limited budget and schedule constraints a combination of traditional and innovative strategies for Verification and Validation (V&V) have been developed. The core of these strategies consists of a building-block approach to V&V, starting with component V&V and ending with a comprehensive end-to-end validation test of the complete launch site, called a Ground Element Integration Test (GEIT). This paper will outline these strategies and provide the high level planning for meeting the challenges of implementing V&V on a large-scale development program. KEY WORDS: Systems, Elements, Subsystem, Integration Test, Ground Systems, Ground Support Equipment, Component, End Item, Test and Verification Requirements (TVR), Verification Requirements (VR)

  14. Automated Vision Test Development and Validation

    DTIC Science & Technology

    2016-11-01

    Deputy Chief, Aerosp Med Consultation Div Chair, Aerospace Medicine Department This report is published in the interest of...produce software for desktop displays; and to evaluate features such as user interfaces, threshold algorithms, validity of results, and screening...cost of performing full threshold testing on over 30% of normal subjects, which is quite time consuming. This effort was accomplished using desktop

  15. Validation of the 3-day rule for stool bacterial tests in Japan.

    PubMed

    Kobayashi, Masanori; Sako, Akahito; Ogami, Toshiko; Nishimura, So; Asayama, Naoki; Yada, Tomoyuki; Nagata, Naoyoshi; Sakurai, Toshiyuki; Yokoi, Chizu; Kobayakawa, Masao; Yanase, Mikio; Masaki, Naohiko; Takeshita, Nozomi; Uemura, Naomi

    2014-01-01

    Stool cultures are expensive and time consuming, and the positive rate of enteric pathogens in cases of nosocomial diarrhea is low. The 3-day rule, whereby clinicians order a Clostridium difficile (CD) toxin test rather than a stool culture for inpatients developing diarrhea >3 days after admission, has been well studied in Western countries. The present study sought to validate the 3-day rule in an acute care hospital setting in Japan. Stool bacterial and CD toxin test results for adult patients hospitalized in an acute care hospital in 2008 were retrospectively analyzed. Specimens collected after an initial positive test were excluded. The positive rate and cost-effectiveness of the tests were compared among three patient groups. The adult patients were divided into three groups for comparison: outpatients, patients hospitalized for ≤3 days and patients hospitalized for ≥4 days. Over the 12-month period, 1,597 stool cultures were obtained from 992 patients, and 880 CD toxin tests were performed in 529 patients. In the outpatient, inpatient ≤3 days and inpatient ≥4 days groups, the rate of positive stool cultures was 14.2%, 3.6% and 1.3% and that of positive CD toxin tests was 1.9%, 7.1% and 8.5%, respectively. The medical costs required to obtain one positive result were 9,181, 36,075 and 103,600 JPY and 43,200, 11,333 and 9,410 JPY, respectively. The 3-day rule was validated for the first time in a setting other than a Western country. Our results revealed that the "3-day rule" is also useful and cost-effective in Japan.

  16. Validation of the Vanderbilt Holistic Face Processing Test.

    PubMed

    Wang, Chao-Chih; Ross, David A; Gauthier, Isabel; Richler, Jennifer J

    2016-01-01

    The Vanderbilt Holistic Face Processing Test (VHPT-F) is a new measure of holistic face processing with better psychometric properties relative to prior measures developed for group studies (Richler et al., 2014). In fields where psychologists study individual differences, validation studies are commonplace and the concurrent validity of a new measure is established by comparing it to an older measure with established validity. We follow this approach and test whether the VHPT-F measures the same construct as the composite task, which is group-based measure at the center of the large literature on holistic face processing. In Experiment 1, we found a significant correlation between holistic processing measured in the VHPT-F and the composite task. Although this correlation was small, it was comparable to the correlation between holistic processing measured in the composite task with the same faces, but different target parts (top or bottom), which represents a reasonable upper limit for correlations between the composite task and another measure of holistic processing. These results confirm the validity of the VHPT-F by demonstrating shared variance with another measure of holistic processing based on the same operational definition. These results were replicated in Experiment 2, but only when the demographic profile of our sample matched that of Experiment 1.

  17. Validation of the Vanderbilt Holistic Face Processing Test

    PubMed Central

    Wang, Chao-Chih; Ross, David A.; Gauthier, Isabel; Richler, Jennifer J.

    2016-01-01

    The Vanderbilt Holistic Face Processing Test (VHPT-F) is a new measure of holistic face processing with better psychometric properties relative to prior measures developed for group studies (Richler et al., 2014). In fields where psychologists study individual differences, validation studies are commonplace and the concurrent validity of a new measure is established by comparing it to an older measure with established validity. We follow this approach and test whether the VHPT-F measures the same construct as the composite task, which is group-based measure at the center of the large literature on holistic face processing. In Experiment 1, we found a significant correlation between holistic processing measured in the VHPT-F and the composite task. Although this correlation was small, it was comparable to the correlation between holistic processing measured in the composite task with the same faces, but different target parts (top or bottom), which represents a reasonable upper limit for correlations between the composite task and another measure of holistic processing. These results confirm the validity of the VHPT-F by demonstrating shared variance with another measure of holistic processing based on the same operational definition. These results were replicated in Experiment 2, but only when the demographic profile of our sample matched that of Experiment 1. PMID:27933014

  18. Reliability and validity of the closed kinetic chain upper extremity stability test.

    PubMed

    Lee, Dong-Rour; Kim, Laurentius Jongsoon

    2015-04-01

    [Purpose] The purpose of this study was to examine the reliability and validity of the Closed Kinetic Chain Upper Extremity Stability (CKCUES) test. [Subjects and Methods] A sample of 40 subjects (20 males, 20 females) with and without pain in the upper limbs was recruited. The subjects were tested twice, three days apart to assess the reliability of the CKCUES test. The CKCUES test was performed four times, and the average was calculated using the data of the last 3 tests. In order to test the validity of the CKCUES test, peak torque of internal/external shoulder rotation was measured using an isokinetic dynamometer, and maximum grip strength was measured using a hand dynamometer, and their Pearson correlation coefficients with the average values of the CKCUES test were calculated. [Results] The reliability of the CKCUES test was very high (ICC=0.97). The correlations between the CKCUES test and maximum grip strength (r=0.78-0.79), and the peak torque of internal/external shoulder rotation (r=0.87-0.94) were high indicating its validity. [Conclusion] The reliability and validity of the CKCUES test were high. The CKCUES test is expected to be used for clinical tests on upper limb stability at low price.

  19. Validity Tests of the Adolescent Domain Screening Inventory (ADSI) with Older Adolescents

    ERIC Educational Resources Information Center

    Corrigan, Matthew J.; Forte, James; Bulgaris, Sarah

    2017-01-01

    The purpose of this replication study is to test the validity of the Adolescent Domain Screening Inventory (ADSI) on an older adolescent population. This cross sectional study used a convenience sample to preliminarily test the validity of the ADSI. Concurrent validity correlations ranged from a high of 0.924 to a low of 0.760. The known…

  20. Validity Theory: Reform Policies, Accountability Testing, and Consequences

    ERIC Educational Resources Information Center

    Chalhoub-Deville, Micheline

    2016-01-01

    Educational policies such as Race to the Top in the USA affirm a central role for testing systems in government-driven reform efforts. Such reform policies are often referred to as the global education reform movement (GERM). Changes observed with the GERM style of testing demand socially engaged validity theories that include consequential…

  1. Validity of Integrity Tests for Predicting Drug and Alcohol Abuse

    DTIC Science & Technology

    1993-08-31

    Wiinkler and Sheridan (1989) found that employees who entered employee assistance programs for treating drug addiction were more likely be absent...August 31, 1993 Final 4. TITLE AND SUBTITLE S. FUNDING NUMBERS Validity of Integrity Tests for Predicting Drug and Alcohol Abuse C No. N00014-92-J...words) This research used psychometric meta-analysis (Hunter & Schmidt, 1990b) to examine the validity of integrity tests for predicting drug and

  2. Validation of a clinical critical thinking skills test in nursing.

    PubMed

    Shin, Sujin; Jung, Dukyoo; Kim, Sungeun

    2015-01-27

    The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Out of initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameter. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. From above results, evidence of the response process validity was demonstrated, indicating that subjects respondeds as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and will therefore represents a more convenient measurement of critical thinking ability.

  3. Utilizing population controls in rare-variant case-parent association tests.

    PubMed

    Jiang, Yu; Satten, Glen A; Han, Yujun; Epstein, Michael P; Heinzen, Erin L; Goldstein, David B; Allen, Andrew S

    2014-06-05

    There is great interest in detecting associations between human traits and rare genetic variation. To address the low power implicit in single-locus tests of rare genetic variants, many rare-variant association approaches attempt to accumulate information across a gene, often by taking linear combinations of single-locus contributions to a statistic. Using the right linear combination is key-an optimal test will up-weight true causal variants, down-weight neutral variants, and correctly assign the direction of effect for causal variants. Here, we propose a procedure that exploits data from population controls to estimate the linear combination to be used in an case-parent trio rare-variant association test. Specifically, we estimate the linear combination by comparing population control allele frequencies with allele frequencies in the parents of affected offspring. These estimates are then used to construct a rare-variant transmission disequilibrium test (rvTDT) in the case-parent data. Because the rvTDT is conditional on the parents' data, using parental data in estimating the linear combination does not affect the validity or asymptotic distribution of the rvTDT. By using simulation, we show that our new population-control-based rvTDT can dramatically improve power over rvTDTs that do not use population control information across a wide variety of genetic architectures. It also remains valid under population stratification. We apply the approach to a cohort of epileptic encephalopathy (EE) trios and find that dominant (or additive) inherited rare variants are unlikely to play a substantial role within EE genes previously identified through de novo mutation studies. Copyright © 2014 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  4. Development and Validity Testing of an Arthritis Self-Management Assessment Tool.

    PubMed

    Oh, HyunSoo; Han, SunYoung; Kim, SooHyun; Seo, WhaSook

    Because of the chronic, progressive nature of arthritis and the substantial effects it has on quality of life, patients may benefit from self-management. However, no valid, reliable self-management assessment tool has been devised for patients with arthritis. This study was conducted to develop a comprehensive self-management assessment tool for patients with arthritis, that is, the Arthritis Self-Management Assessment Tool (ASMAT). To develop a list of qualified items corresponding to the conceptual definitions and attributes of arthritis self-management, a measurement model was established on the basis of theoretical and empirical foundations. Content validity testing was conducted to evaluate whether listed items were suitable for assessing arthritis self-management. Construct validity and reliability of the ASMAT were tested. Construct validity was examined using confirmatory factor analysis and nomological validity. The 32-item ASMAT was developed with a sample composed of patients in a clinic in South Korea. Content validity testing validated the 32 items, which comprised medical (10 items), behavioral (13 items), and psychoemotional (9 items) management subscales. Construct validity testing of the ASMAT showed that the 32 items properly corresponded with conceptual constructs of arthritis self-management, and were suitable for assessing self-management ability in patients with arthritis. Reliability was also well supported. The ASMAT devised in the present study may aid the evaluation of patient self-management ability and the effectiveness of self-management interventions. The authors believe the developed tool may also aid the identification of problems associated with the adoption of self-management practice, and thus improve symptom management, independence, and quality of life of patients with arthritis.

  5. LADO as a Language Test: Issues of Validity

    ERIC Educational Resources Information Center

    McNamara, Tim; Van Den Hazelkamp, Carolien; Verrips, Maaike

    2016-01-01

    This article brings together the theoretical field of language testing and the practical field of language analysis for the determination of the origin of asylum seekers. It considers what it would mean to think of language analysis as a form of language test, subject to the same validity constraints, and proposes a research agenda.

  6. Two-Speed Gearbox Dynamic Simulation Predictions and Test Validation

    NASA Technical Reports Server (NTRS)

    Lewicki, David G.; DeSmidt, Hans; Smith, Edward C.; Bauman, Steven W.

    2010-01-01

    Dynamic simulations and experimental validation tests were performed on a two-stage, two-speed gearbox as part of the drive system research activities of the NASA Fundamental Aeronautics Subsonics Rotary Wing Project. The gearbox was driven by two electromagnetic motors and had two electromagnetic, multi-disk clutches to control output speed. A dynamic model of the system was created which included a direct current electric motor with proportional-integral-derivative (PID) speed control, a two-speed gearbox with dual electromagnetically actuated clutches, and an eddy current dynamometer. A six degree-of-freedom model of the gearbox accounted for the system torsional dynamics and included gear, clutch, shaft, and load inertias as well as shaft flexibilities and a dry clutch stick-slip friction model. Experimental validation tests were performed on the gearbox in the NASA Glenn gear noise test facility. Gearbox output speed and torque as well as drive motor speed and current were compared to those from the analytical predictions. The experiments correlate very well with the predictions, thus validating the dynamic simulation methodologies.

  7. K3EDTA Vacuum Tubes Validation for Routine Hematological Testing

    PubMed Central

    Lima-Oliveira, Gabriel; Lippi, Giuseppe; Salvagno, Gian Luca; Montagnana, Martina; Poli, Giovanni; Solero, Giovanni Pietro; Picheth, Geraldo; Guidi, Gian Cesare

    2012-01-01

    Background and Objective. Some in vitro diagnostic devices (e.g, blood collection vacuum tubes and syringes for blood analyses) are not validated before the quality laboratory managers decide to start using or to change the brand. Frequently, the laboratory or hospital managers select the vacuum tubes for blood collection based on cost considerations or on relevance of a brand. The aim of this study was to validate two dry K3EDTA vacuum tubes of different brands for routine hematological testing. Methods. Blood specimens from 100 volunteers in two different K3EDTA vacuum tubes were collected by a single, expert phlebotomist. The routine hematological testing was done on Advia 2120i hematology system. The significance of the differences between samples was assessed by paired Student's t-test after checking for normality. The level of statistical significance was set at P < 0.05. Results and Conclusions. Different brand's tubes evaluated can represent a clinically relevant source of variations only on mean platelet volume (MPV) and platelet distribution width (PDW). Basically, our validation will permit the laboratory or hospital managers to select the brand's vacuum tubes validated according to him/her technical or economical reasons for routine hematological tests. PMID:22888448

  8. Embedded performance validity tests within the Hopkins Verbal Learning Test - Revised and the Brief Visuospatial Memory Test - Revised.

    PubMed

    Sawyer, R John; Testa, S Marc; Dux, Moira

    2017-01-01

    Various research studies and neuropsychology practice organizations have reiterated the importance of developing embedded performance validity tests (PVTs) to detect potentially invalid neurocognitive test data. This study investigated whether measures within the Hopkins Verbal Learning Test - Revised (HVLT-R) and the Brief Visuospatial Memory Test - Revised (BVMT-R) could accurately classify individuals who fail two or more PVTs during routine clinical assessment. The present sample of 109 United States military veterans (Mean age = 52.4, SD = 13.3), all consisted of clinically referred patients and received a battery of neuropsychological tests. Based on performance validity findings, veterans were assigned to valid (n = 86) or invalid (n = 23) groups. Of the 109 patients in the overall sample, 77 were administered the HLVT-R and 75 were administered the BVMT-R, which were examined for classification accuracy. The HVLT-R Recognition Discrimination Index and the BVMT-R Retention Percentage showed good to adequate discrimination with an area under the curve of .78 and .70, respectively. The HVLT-R Recognition Discrimination Index showed sensitivity of .53 with specificity of .93. The BVMT-R Retention Percentage demonstrated sensitivity of .31 with specificity of .92. When used in conjunction with other PVTs, these new embedded PVTs may be effective in the detection of invalid test data, although they are not intended for use in patients with dementia.

  9. The test-retest reliability and criterion validity of a high-intensity, netball-specific circuit test: The Net-Test.

    PubMed

    Mungovan, Sean F; Peralta, Paula J; Gass, Gregory C; Scanlan, Aaron T

    2018-04-12

    To examine the test-retest reliability and criterion validity of a high-intensity, netball-specific fitness test. Repeated measures, within-subject design. Eighteen female netball players competing in an international competition completed a trial of the Net-Test, which consists of 14 timed netball-specific movements. Players also completed a series of netball-relevant criterion fitness tests. Ten players completed an additional Net-Test trial one week later to assess test-retest reliability using intraclass correlation coefficient (ICC), typical error of measurement (TEM), and coefficient of variation (CV). The typical error of estimate expressed as CV and Pearson correlations were calculated between each criterion test and Net-Test performance to assess criterion validity. Five movements during the Net-Test displayed moderate ICC (0.84-0.90) and two movements displayed high ICC (0.91-0.93). Seven movements and heart rate taken during the Net-Test held low CV (<5%) with values ranging from 1.7 to 9.5% across measures. Total time (41.63±2.05s) during the Net-Test possessed low CV and significant (p<0.05) correlations with 10m sprint time (1.98±0.12s; CV=4.4%, r=0.72), 20m sprint time (3.38±0.19s; CV=3.9%, r=0.79), 505 Change-of-Direction time (2.47±0.08s; CV=2.0%, r=0.80); and maximum oxygen uptake (46.59±2.58 mLkg -1 min -1 ; CV=4.5%, r=-0.66). The Net-Test possesses acceptable reliability for the assessment of netball fitness. Further, the high criterion validity for the Net-Test suggests a range of important netball-specific fitness elements are assessed in combination. Copyright © 2018 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  10. Validity and reliability of the Hawaii anaerobic run test.

    PubMed

    Kimura, Iris F; Stickley, Christopher D; Lentz, Melissa A; Wages, Jennifer J; Yanagi, Kazuhiko; Hetzler, Ronald K

    2014-05-01

    This study examined the reliability and validity of the Hawaii anaerobic run test (HART) by comparing anaerobic capacity measures obtained to those during the Wingate Anaerobic Test (WAnT). Ninety-six healthy physically active volunteers (age, 22.0 ± 2.8 years; height, 163.9 ± 9.5 cm; body mass, 70.6 ± 14.7 kg; body fat %, 19.29 ± 5.39%) participated in this study. Each participant performed 2 anaerobic capacity tests: the WAnT and the HART by random assignment on separate days. The reliability of the HART was calculated from 2 separate trials of the test and then determined through intraclass correlation coefficients (ICCs). Blood samples were collected, and lactate was analyzed both pretest and posttest for each of the 2 exercise modes. Heart rate and rate of perceived exertion were also measured pre- and post-exercise. Hawaii anaerobic run test peak and mean momentum were calculated as body mass times highest or average split velocity, respectively. Intraclass correlation coefficients between trials of the HART for peak and mean momentum were 0.98 and 0.99, respectively (SEM = 18.8 and 25.7, respectively). Validity of the HART was established through comparison of momentum on the HART with power on the WAnT. High correlations were found between peak power and peak momentum (r = 0.88), as well as mean power and mean momentum (r = 0.94). The HART was considered to be a reliable test of anaerobic power. The HART was also determined to be a valid test of anaerobic power when compared with the WAnT. When testing healthy college-aged individuals, the HART offers an easy and inexpensive alternative maximal effort anaerobic power test to other established tests.

  11. Test-Retest Reliability and Predictive Validity of the Implicit Association Test in Children

    ERIC Educational Resources Information Center

    Rae, James R.; Olson, Kristina R.

    2018-01-01

    The Implicit Association Test (IAT) is increasingly used in developmental research despite minimal evidence of whether children's IAT scores are reliable across time or predictive of behavior. When test-retest reliability and predictive validity have been assessed, the results have been mixed, and because these studies have differed on many…

  12. Pump CFD code validation tests

    NASA Technical Reports Server (NTRS)

    Brozowski, L. A.

    1993-01-01

    Pump CFD code validation tests were accomplished by obtaining nonintrusive flow characteristic data at key locations in generic current liquid rocket engine turbopump configurations. Data were obtained with a laser two-focus (L2F) velocimeter at scaled design flow. Three components were surveyed: a 1970's-designed impeller, a 1990's-designed impeller, and a four-bladed unshrouded inducer. Two-dimensional velocities were measured upstream and downstream of the two impellers. Three-dimensional velocities were measured upstream, downstream, and within the blade row of the unshrouded inducer.

  13. Cross-Validation of the Computerized Adaptive Screening Test (CAST).

    ERIC Educational Resources Information Center

    Pliske, Rebecca M.; And Others

    The Computerized Adaptive Screening Test (CAST) was developed to provide an estimate at recruiting stations of prospects' Armed Forces Qualification Test (AFQT) scores. The CAST was designed to replace the paper-and-pencil Enlistment Screening Test (EST). The initial validation study of CAST indicated that CAST predicts AFQT at least as accurately…

  14. Validity and Reliability of the Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST) in University Students.

    PubMed

    Tiburcio Sainz, Marcela; Rosete-Mohedano, Ma Guadalupe; Natera Rey, Guillermina; Martínez Vélez, Nora Angélica; Carreño García, Silvia; Pérez Cisneros, Daniel

    2016-03-02

    The Alcohol, Smoking and Substance Involvement Screening Test (ASSIST), developed by the World Health Organization (WHO), has been used successfully in many countries, but there are few studies of its validity and reliability for the Mexican population. The objective of this study was to determine the psychometric properties of the self-administered ASSIST test in university students in Mexico. This was an ex post facto non-experimental study with 1,176 undergraduate students, the majority women (70.1%) aged 18-23 years (89.5%) and single (87.5%). To estimate concurrent validity, factor analysis and tests of reliability and correlation were carried out between the subscale for alcohol and AUDIT, those for tobacco and the Fagerström Test, and those for marijuana and DAST-20. Adequate reliability coefficients were obtained for ASSIST subscales for tobacco (alpha = 0.83), alcohol (alpha = 0.76), and marijuana (alpha = 0.73). Significant correlations were found only with the AUDIT (r = 0.71) and the alcohol subscale. The best balance of sensitivity and specificity of the alcohol subscale (83.8% and 80%, respectively) and the largest area under the ROC curve (81.9%) was found with a cutoff score of 8. The self-administered version of ASSIST is a valid screening instrument to identify at-risk cases due to substance use in this population.

  15. Translation, Cultural Adaptation and Validation of the Simple Shoulder Test to Spanish

    PubMed Central

    Arcuri, Francisco; Barclay, Fernando; Nacul, Ivan

    2015-01-01

    Background: The validation of widely used scales facilitates the comparison across international patient samples. Objective: The objective was to translate, culturally adapt and validate the Simple Shoulder Test into Argentinian Spanish. Methods: The Simple Shoulder Test was translated from English into Argentinian Spanish by two independent translators, translated back into English and evaluated for accuracy by an expert committee to correct the possible discrepancies. It was then administered to 50 patients with different shoulder conditions.Psycometric properties were analyzed including internal consistency, measured with Cronbach´s Alpha, test-retest reliability at 15 days with the interclass correlation coefficient. Results: The internal consistency, validation, was an Alpha of 0,808, evaluated as good. The test-retest reliability index as measured by intra-class correlation coefficient (ICC) was 0.835, evaluated as excellent. Conclusion: The Simple Shoulder Test translation and it´s cultural adaptation to Argentinian-Spanish demonstrated adequate internal reliability and validity, ultimately allowing for its use in the comparison with international patient samples.

  16. Physical performance tests after stroke: reliability and validity.

    PubMed

    Maeda, A; Yuasa, T; Nakamura, K; Higuchi, S; Motohashi, Y

    2000-01-01

    To evaluate the reliability and validity of the modified physical performance tests for stroke survivors who live in a community. The subjects included 40 stroke survivors and 40 apparently healthy independent elderly persons. The physical performance tests for the stroke survivors comprised two physical capacity evaluation tasks that represented physical abilities necessary to perform the main activities of daily living, e.g., standing-up ability (time needed to stand up from bed rest) and walking ability (time needed to walk 10 m). Regarding the reliability of tests, significant correlations were confirmed between test and retest of physical performance tests with both short and long intervals in individuals after stroke. Regarding the validity of tests, the authors studied the significant correlations between the maximum isometric strength of the quardriceps muscle and the time needed to walk 10 m, centimeters reached while sitting and reaching, and the time needed to stand up from bed rest. The authors confirmed that there were significant correlations between the instrumental activity of daily living and the time needed to stand up from bed rest, along with the time needed to walk 10 m for the stroke survivors. These physical performance tests are useful guides for evaluating a level of activity of daily living and physical frailty of stroke survivors living in a community.

  17. Validation of the Narrowing Beam Walking Test in Lower Limb Prosthesis Users.

    PubMed

    Sawers, Andrew; Hafner, Brian

    2018-04-11

    To evaluate the content, construct, and discriminant validity of the Narrowing Beam Walking Test (NBWT), a performance-based balance test for lower limb prosthesis users. Cross-sectional study. Research laboratory and prosthetics clinic. Unilateral transtibial and transfemoral prosthesis users (N=40). Not applicable. Content validity was examined by quantifying the percentage of participants receiving maximum or minimum scores (ie, ceiling and floor effects). Convergent construct validity was examined using correlations between participants' NBWT scores and scores or times on existing clinical balance tests regularly administered to lower limb prosthesis users. Known-groups construct validity was examined by comparing NBWT scores between groups of participants with different fall histories, amputation levels, amputation etiologies, and functional levels. Discriminant validity was evaluated by analyzing the area under each test's receiver operating characteristic (ROC) curve. No minimum or maximum scores were recorded on the NBWT. NBWT scores demonstrated strong correlations (ρ=.70‒.85) with scores/times on performance-based balance tests (timed Up and Go test, Four Square Step Test, and Berg Balance Scale) and a moderate correlation (ρ=.49) with the self-report Activities-specific Balance Confidence scale. NBWT performance was significantly lower among participants with a history of falls (P=.003), transfemoral amputation (P=.011), and a lower mobility level (P<.001). The NBWT also had the largest area under the ROC curve (.81) and was the only test to exhibit an area that was statistically significantly >.50 (ie, chance). The results provide strong evidence of content, construct, and discriminant validity for the NBWT as a performance-based test of balance ability. The evidence supports its use to assess balance impairments and fall risk in unilateral transtibial and transfemoral prosthesis users. Copyright © 2018 American Congress of Rehabilitation Medicine

  18. Validation of a clinical critical thinking skills test in nursing

    PubMed Central

    2015-01-01

    Purpose: The purpose of this study was to develop a revised version of the clinical critical thinking skills test (CCTS) and to subsequently validate its performance. Methods: This study is a secondary analysis of the CCTS. Data were obtained from a convenience sample of 284 college students in June 2011. Thirty items were analyzed using item response theory and test reliability was assessed. Test-retest reliability was measured using the results of 20 nursing college and graduate school students in July 2013. The content validity of the revised items was analyzed by calculating the degree of agreement between instrument developer intention in item development and the judgments of six experts. To analyze response process validity, qualitative data related to the response processes of nine nursing college students obtained through cognitive interviews were analyzed. Results: Out of initial 30 items, 11 items were excluded after the analysis of difficulty and discrimination parameter. When the 19 items of the revised version of the CCTS were analyzed, levels of item difficulty were found to be relatively low and levels of discrimination were found to be appropriate or high. The degree of agreement between item developer intention and expert judgments equaled or exceeded 50%. Conclusion: From above results, evidence of the response process validity was demonstrated, indicating that subjects respondeds as intended by the test developer. The revised 19-item CCTS was found to have sufficient reliability and validity and will therefore represents a more convenient measurement of critical thinking ability. PMID:25622716

  19. Validation of Alternative In Vitro Methods to Animal Testing: Concepts, Challenges, Processes and Tools.

    PubMed

    Griesinger, Claudius; Desprez, Bertrand; Coecke, Sandra; Casey, Warren; Zuang, Valérie

    This chapter explores the concepts, processes, tools and challenges relating to the validation of alternative methods for toxicity and safety testing. In general terms, validation is the process of assessing the appropriateness and usefulness of a tool for its intended purpose. Validation is routinely used in various contexts in science, technology, the manufacturing and services sectors. It serves to assess the fitness-for-purpose of devices, systems, software up to entire methodologies. In the area of toxicity testing, validation plays an indispensable role: "alternative approaches" are increasingly replacing animal models as predictive tools and it needs to be demonstrated that these novel methods are fit for purpose. Alternative approaches include in vitro test methods, non-testing approaches such as predictive computer models up to entire testing and assessment strategies composed of method suites, data sources and decision-aiding tools. Data generated with alternative approaches are ultimately used for decision-making on public health and the protection of the environment. It is therefore essential that the underlying methods and methodologies are thoroughly characterised, assessed and transparently documented through validation studies involving impartial actors. Importantly, validation serves as a filter to ensure that only test methods able to produce data that help to address legislative requirements (e.g. EU's REACH legislation) are accepted as official testing tools and, owing to the globalisation of markets, recognised on international level (e.g. through inclusion in OECD test guidelines). Since validation creates a credible and transparent evidence base on test methods, it provides a quality stamp, supporting companies developing and marketing alternative methods and creating considerable business opportunities. Validation of alternative methods is conducted through scientific studies assessing two key hypotheses, reliability and relevance of the

  20. Implementation and Initial Validation of the APS English Test [and] The APS English-Writing Test at Golden West College: Evidence for Predictive Validity.

    ERIC Educational Resources Information Center

    Isonio, Steven

    In May 1991, Golden West College (California) conducted a validation study of the English portion of the Assessment and Placement Services for Community Colleges (APS), followed by a predictive validity study in July 1991. The initial study was designed to aid in the implementation of the new test at GWC by comparing data on APS use at other…

  1. Validation of a surveillance case definition for arthritis.

    PubMed

    Sacks, Jeffrey J; Harrold, Leslie R; Helmick, Charles G; Gurwitz, Jerry H; Emani, Srinivas; Yood, Robert A

    2005-02-01

    To assess whether self-reports of chronic joint symptoms or doctor-diagnosed arthritis can validly identify persons with clinically verifiable arthritis. The Behavioral Risk Factor Surveillance System (BRFSS), a telephone health survey, defines a case of arthritis as a self-report of chronic joint symptoms (CJS) and/or doctor-diagnosed arthritis (DDx). A sample of health plan enrollees aged 45-64 years and >/= 65 years with upcoming annual physical examinations were surveyed by telephone using the 2002 BRFSS CJS and DDx questions. Based on responses (CJS+, DDx-; CJS-, DDx+; CJS+, DDx+; CJS-, DDx-), respondents were recruited to undergo a standardized clinical history and physical examination for arthritis (the gold standard for clinical validation). Weighted sensitivities and specificities of the case definition were calculated to adjust for sampling. Of 2180 persons completing the telephone questionnaire, 389 were examined; of these, 258 met the case definition and 131 did not. For those examined and aged 45 to 64 years (n = 179), 96 persons had arthritis confirmed, of whom 76 met the case definition. Among those examined and aged >/= 65 (n = 210), 150 had arthritis confirmed, of whom 124 met the case definition. Among those without clinical arthritis, 45 of 83 of those aged 45 to 64 years and 40 of 60 of those aged >/= 65 did not meet the case definition. For those aged 45 to 64 years, the weighted sensitivity of the case definition in this sample was 77.4% and the weighted specificity was 58.8%; for those aged >/= 65, the sensitivity was 83.6% and specificity 70.6%. CJS+ had higher sensitivity and lower specificity than DDx+ in the younger age group; CJS+ and DDx+ behaved more comparably in the older age group. The case definition based on self-reported CJS and/or DDx appeared to be sensitive in identifying arthritis, but specificity was lower than desirable for those under age 65 years. Better methods of ascertaining arthritis by self-report are needed. Until

  2. The Korean version of the Sniffin' stick (KVSS) test and its validity in comparison with the cross-cultural smell identification test (CC-SIT).

    PubMed

    Cho, Jae Hoon; Jeong, Yong Soo; Lee, Yeo Jin; Hong, Seok-Chan; Yoon, Joo-Heon; Kim, Jin Kook

    2009-06-01

    The Korean Version of the Sniffin' stick (KVSS) is the first olfactory test for Koreans. Although we adopted the Sniffin' Stick, we modified it to make it more suitable for Koreans. KVSS I is a screening test, and KVSS II a more comprehensive test. The aims of this study were to apply the KVSS test and assess its clinical validity and reliability in comparison to CC-SIT. One hundred and seventy-four healthy volunteers and 206 patients with subjective decreased olfaction participated. Each participant was tested with both the CC-SIT and KVSS tests and then the correlation between these two tests was analyzed. The correlation between CC-SIT and KVSS I was 0.720 (p<0.01) and 0.714 between the CC-SIT and KVSS II total scores (p<0.01). When the degree of olfaction based on the KVSS I was used, the mean CC-SIT score was 8.6+/-1.8 for normosmia, 7.3+/-2.2 for hyposmia, and 4.2+/-2.3 for anosmia. When the KVSS II total was applied, the mean CC-SIT score was 8.4+/-1.8 for normosmia, 7.3+/-2.0 for hyposmia, and 3.7+/-2.0 for anosmia. The means of the three group differed significantly in both cases (p<0.01). Thus, the KVSS test demonstrates validity and reliability for Korean in comparison with CC-SIT.

  3. Construct Validity of Physical Fitness Tests

    DTIC Science & Technology

    2011-02-03

    Medicine and Science in Sports and Exercise , 21, 319-324. *Fleishman, E. A. (1964). The structure and measurement of physical fitness. Englewood Cliffs...Quarterly for Exercise and Sport, 64, 256-273. *McCloy, E. (1935). Factor analysis methods in the measurement of physical abilities. Research Quarterly...Research Quarterly, 34, 525. Physical Fitness Test Validity 23 Powers, S. K., & Howley, E. T. (1990). Exercise physiology: Theory and application to

  4. FAST Model Calibration and Validation of the OC5- DeepCwind Floating Offshore Wind System Against Wave Tank Test Data: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wendt, Fabian F; Robertson, Amy N; Jonkman, Jason

    During the course of the Offshore Code Comparison Collaboration, Continued, with Correlation (OC5) project, which focused on the validation of numerical methods through comparison against tank test data, the authors created a numerical FAST model of the 1:50-scale DeepCwind semisubmersible system that was tested at the Maritime Research Institute Netherlands ocean basin in 2013. This paper discusses several model calibration studies that were conducted to identify model adjustments that improve the agreement between the numerical simulations and the experimental test data. These calibration studies cover wind-field-specific parameters (coherence, turbulence), hydrodynamic and aerodynamic modeling approaches, as well as rotor model (blade-pitchmore » and blade-mass imbalances) and tower model (structural tower damping coefficient) adjustments. These calibration studies were conducted based on relatively simple calibration load cases (wave only/wind only). The agreement between the final FAST model and experimental measurements is then assessed based on more-complex combined wind and wave validation cases.« less

  5. Development of an Agility Test for Badminton Players and Assessment of Its Validity and Test-Retest Reliability.

    PubMed

    Loureiro, Luiz de França Bahia; de Freitas, Paulo Barbosa

    2016-04-01

    Badminton requires open and fast actions toward the shuttlecock, but there is no specific agility test for badminton players with specific movements. To develop an agility test that simultaneously assesses perception and motor capacity and examine the test's concurrent and construct validity and its test-retest reliability. The Badcamp agility test consists of running as fast as possible to 6 targets placed on the corners and middle points of a rectangular area (5.6 × 4.2 m) from the start position located in the center of it, following visual stimuli presented in a luminous panel. The authors recruited 43 badminton players (17-32 y old) to evaluate concurrent (with shuttle-run agility test--SRAT) and construct validity and test-retest reliability. Results revealed that Badcamp presents concurrent and construct validity, as its performance is strongly related to SRAT (ρ = 0.83, P < .001), with performance of experts being better than nonexpert players (P < .01). In addition, Badcamp is reliable, as no difference (P = .07) and a high intraclass correlation (ICC = .93) were found in the performance of the players on 2 different occasions. The findings indicate that Badcamp is an effective, valid, and reliable tool to measure agility, allowing coaches and athletic trainers to evaluate players' athletic condition and training effectiveness and possibly detect talented individuals in this sport.

  6. Development and validation of a knowledge test for health professionals regarding lifestyle modification.

    PubMed

    Talip, Whadi-ah; Steyn, Nelia P; Visser, Marianne; Charlton, Karen E; Temple, Norman

    2003-09-01

    We wanted to develop and validate a test that assesses the knowledge and practices of health professionals (HPs) with regard to the role of nutrition, physical activity, and smoking cessation (lifestyle modification) in chronic diseases of lifestyle. A descriptive cross-sectional validation study was carried out. The validation design consisted of two phases, namely 1) test planning and development and 2) test evaluation. The study sample consisted of five groups of HPs: dietitians, dietetic interns, general practitioners, medical students, and nurses. The overall response rate was 58%, resulting in a sample size of 186 participants. A test was designed to evaluate the knowledge and practices of HPs. The test was first evaluated by an expert group to ensure content, construct, and face validity. Thereafter, the questionnaire was tested on five groups of HPs to test for criterion validity. Internal consistency was evaluated by Cronbach's alpha. An expert panel ensured content, construct, and face validity of the test. Groups with the most training and exposure to nutrition (dietitians and dietetic interns) had the highest group mean score, ranging from 61% to 88%, whereas those with limited nutrition training (general practitioners, medical students, and nurses) had significantly lower scores, ranging from 26% to 80%. This result demonstrated criterion validity. Internal consistency of the overall test demonstrated a Cronbach's alpha of 0.99. Most HPs identified the mass media as their main source of information on lifestyle modification. These HPs also identified lack of time, lack of patient compliance, and lack of knowledge as barriers that prevent them from providing counseling on lifestyle modification. The results of this study showed that this test instrument identifies groups of health professionals with adequate training (knowledge) in lifestyle modification and those who require further training (knowledge).

  7. 49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2013-10-01 2013-10-01 false What is validity testing, and are laboratories...

  8. 49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2011-10-01 2011-10-01 false What is validity testing, and are laboratories...

  9. 49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2010-10-01 2010-10-01 false What is validity testing, and are laboratories...

  10. 49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2012-10-01 2012-10-01 false What is validity testing, and are laboratories...

  11. 49 CFR 40.89 - What is validity testing, and are laboratories required to conduct it?

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... PROCEDURES FOR TRANSPORTATION WORKPLACE DRUG AND ALCOHOL TESTING PROGRAMS Drug Testing Laboratories § 40.89 What is validity testing, and are laboratories required to conduct it? (a) Specimen validity testing is... 49 Transportation 1 2014-10-01 2014-10-01 false What is validity testing, and are laboratories...

  12. The Unified Language Testing Plan: Speaking Proficiency Test. Russian Pilot Validation Studies. Report Number 2.

    ERIC Educational Resources Information Center

    Thornton, Julie A.

    The report describes one segment of the Federal Language Testing Board's Unified Language Testing Plan (ULTP), the validation of the speaking proficiency test in Russian. The ULTP is a project to increase standardization of foreign language proficiency measurement and promote sharing of resources among testing programs in the federal government.…

  13. Validation of the Arabic Version of the Internet Gaming Disorder-20 Test.

    PubMed

    Hawi, Nazir S; Samaha, Maya

    2017-04-01

    In recent years, researchers have been trying to shed light on gaming addiction and its association with different psychiatric disorders and psychological determinants. The latest edition version of the American Psychiatric Association's Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) included in its Section 3 Internet Gaming Disorder (IGD) as a condition for further empirical study and proposed nine criteria for the diagnosis of IGD. The 20-item Internet Gaming Disorder (IGD-20) Test was developed as a valid and reliable tool to assess gaming addiction based on the nine criteria set by the DSM-5. The aim of this study is to validate an Arabic version of the IGD-20 Test. The Arabic version of IGD-20 will not only help in identifying Arabic-speaking pathological gamers but also stimulate cross-cultural studies that could contribute to an area in need of more research for insight and treatment. After a process of translation and back-translation and with the participation of a sizable sample of Arabic-speaking adolescents, the present study conducted a psychometric validation of the IGD-20 Test. Our confirmatory factor analysis showed the validity of the Arabic version of the IGD-20 Test. The one-factor model of the Arabic IGD-20 Test had very good psychometric properties, and it fitted the sample data extremely well. In addition, correlation analysis between the IGD-20 Test and the daily duration on weekdays and weekends gameplay revealed significant positive relationships that warranted a criterion-related validation. Thus, the Arabic version of the IGD-20 Test is a valid and reliable measure of IGD among Arabic-speaking populations.

  14. Validity and Reliability Testing of an e-learning Questionnaire for Chemistry Instruction

    NASA Astrophysics Data System (ADS)

    Guspatni, G.; Kurniawati, Y.

    2018-04-01

    The aim of this paper is to examine validity and reliability of a questionnaire used to evaluate e-learning implementation in chemistry instruction. 48 questionnaires were filled in by students who had studied chemistry through e-learning system. The questionnaire consisted of 20 indicators evaluating students’ perception on using e-learning. Parametric testing was done as data were assumed to follow normal distribution. Item validity of the questionnaire was examined through item-total correlation using Pearson’s formula while its reliability was assessed with Cronbach’s alpha formula. Moreover, convergent validity was assessed to see whether indicators building a factor had theoretically the same underlying construct. The result of validity testing revealed 19 valid indicators while the result of reliability testing revealed Cronbach’s alpha value of .886. The result of factor analysis showed that questionnaire consisted of five factors, and each of them had indicators building the same construct. This article shows the importance of factor analysis to get a construct valid questionnaire before it is used as research instrument.

  15. Establishing the Test-Retest Reliability & Concurrent Validity for the Repeat Ice Skating Test (RIST) in Adolescent Male Ice Hockey Players

    ERIC Educational Resources Information Center

    Power, Allan; Faught, Brent E.; Przysucha, Eryk; McPherson, Moira; Montelpare, William

    2012-01-01

    In this study the authors examine the test-retest reliability and concurrent validity of the Repeat Ice Skating Test (RIST). This was an on-ice field anaerobic test that measured average peak power and was validated with 3 anaerobic lab tests: (a) vertical jump, (b) the Margaria-Kalamen stair test, and (c) the Wingate Anaerobic Test. The…

  16. Construct validity of the Health Science Reasoning Test.

    PubMed

    Huhn, Karen; Black, Lisa; Jensen, Gail M; Deutsch, Judith E

    2011-01-01

    The aim of this study was to evaluate the construct validity of the Health Science Reasoning Test (HSRT) by determining if the test could discriminate between expert and novice physical therapists' critical-thinking skills. Experts identified from a random list of certified clinical specialists and students in the first year of their physical therapy education from two physical therapy programs completed the HSRT. Experts (n = 73) had a higher total HSRT score (mean 24.06, SD 3.92) than the novices (n = 79) (mean 22.49, SD 3.2), with the difference being statistically significant t (148) = 2.67, p = 0.008. The HSRT total score discriminated between expert and novice critical-thinking skills, therefore establishing construct validity. To our knowledge, this is the first study to compare expert and novice performance on a standardized test. The opportunity to have a tool that provides evidence of students' critical thinking skills could be helpful for educators and students. The test results could aid in identifying areas of students' strengths and weaknesses, thereby enabling targeted remediation to improve critical thinking skills, which are key factors in clinical reasoning, a necessary skill for effective physical therapy practice.

  17. Validity of an Interactive Functional Reach Test.

    PubMed

    Galen, Sujay S; Pardo, Vicky; Wyatt, Douglas; Diamond, Andrew; Brodith, Victor; Pavlov, Alex

    2015-08-01

    Videogaming platforms such as the Microsoft (Redmond, WA) Kinect(®) are increasingly being used in rehabilitation to improve balance performance and mobility. These gaming platforms do not have built-in clinical measures that offer clinically meaningful data. We have now developed software that will enable the Kinect sensor to assess a patient's balance using an interactive functional reach test (I-FRT). The aim of the study was to test the concurrent validity of the I-FRT and to establish the feasibility of implementing the I-FRT in a clinical setting. The concurrent validity of the I-FRT was tested among 20 healthy adults (mean age, 25.8±3.4 years; 14 women). The Functional Reach Test (FRT) was measured simultaneously by both the Kinect sensor using the I-FRT software and the Optotrak Certus(®) 3D motion-capture system (Northern Digital Inc., Waterloo, ON, Canada). The feasibility of implementing the I-FRT in a clinical setting was assessed by performing the I-FRT in 10 participants with mild balance impairments recruited from the outpatient physical therapy clinic (mean age, 55.8±13.5 years; four women) and obtaining their feedback using a NASA Task Load Index (NASA-TLX) questionnaire. There was moderate to good agreement between FRT measures made by the two measurement systems. The greatest agreement between the two measurement system was found with the Kinect sensor placed at a distance of 2.5 m [intraclass correlation coefficient (2,k)=0.786; P<0.001] from the participant. Participants with mild balance impairments whose balance was assessed using the I-FRT software scored their experience favorably by assigning lower scores for the Frustration, Mental Demand, and Temporal Demand subscales on the NASA/TLX questionnaire. FRT measures made using the Kinect sensor I-FRT software provides a valid clinical measure that can be used with the gaming platforms.

  18. Beyond Faith and Face Validity: The Multitrait-Multimethod Matrix and the Convergent and Discriminant Validity of Oral Proficiency Tests.

    ERIC Educational Resources Information Center

    Stevenson, Douglas K.

    Recently there has been a renewed international interest in direct oral proficiency measures such as the oral interview. There has also been a growing awareness among some language testing specialists that all proficiency tests must be subjected to construct validation. It seems that the high face validity of oral interviews tends to cloud and…

  19. Reliability and validity of the revised Gibson Test of Cognitive Skills, a computer-based test battery for assessing cognition across the lifespan.

    PubMed

    Moore, Amy Lawson; Miller, Terissa M

    2018-01-01

    The purpose of the current study is to evaluate the validity and reliability of the revised Gibson Test of Cognitive Skills, a computer-based battery of tests measuring short-term memory, long-term memory, processing speed, logic and reasoning, visual processing, as well as auditory processing and word attack skills. This study included 2,737 participants aged 5-85 years. A series of studies was conducted to examine the validity and reliability using the test performance of the entire norming group and several subgroups. The evaluation of the technical properties of the test battery included content validation by subject matter experts, item analysis and coefficient alpha, test-retest reliability, split-half reliability, and analysis of concurrent validity with the Woodcock Johnson III Tests of Cognitive Abilities and Tests of Achievement. Results indicated strong sources of evidence of validity and reliability for the test, including internal consistency reliability coefficients ranging from 0.87 to 0.98, test-retest reliability coefficients ranging from 0.69 to 0.91, split-half reliability coefficients ranging from 0.87 to 0.91, and concurrent validity coefficients ranging from 0.53 to 0.93. The Gibson Test of Cognitive Skills-2 is a reliable and valid tool for assessing cognition in the general population across the lifespan.

  20. Validation of administrative case ascertainment algorithms for chronic childhood arthritis in Manitoba, Canada.

    PubMed

    Shiff, Natalie Jane; Oen, Kiem; Rabbani, Rasheda; Lix, Lisa M

    2017-09-01

    We validated case ascertainment algorithms for juvenile idiopathic arthritis (JIA) in the provincial health administrative databases of Manitoba, Canada. A population-based pediatric rheumatology clinical database from April 1st 1980 to March 31st 2012 was used to test case definitions in individuals diagnosed at ≤15 years of age. The case definitions varied the number of diagnosis codes (1, 2, or 3), time frame (1, 2 or 3 years), time between diagnoses (ever, >1 day, or ≥8 weeks), and physician specialty. Positive predictive value (PPV), sensitivity, and specificity with 95% confidence intervals (CIs) are reported. A case definition of 1 hospitalization or ≥2 diagnoses in 2 years by any provider ≥8 weeks apart using diagnosis codes for rheumatoid arthritis and ankylosing spondylitis produced a sensitivity of 89.2% (95% CI 86.8, 91.6), specificity of 86.3% (95% CI 83.0, 89.6), and PPV of 90.6% (95% CI 88.3, 92.9) when seronegative enthesopathy and arthropathy (SEA) was excluded as JIA; and sensitivity of 88.2% (95% CI 85.7, 90.7), specificity of 90.4% (95% CI 87.5, 93.3), and PPV of 93.9% (95% CI 92.0, 95.8) when SEA was included as JIA. This study validates case ascertainment algorithms for JIA in Canadian administrative health data using diagnosis codes for both rheumatoid arthritis (RA) and ankylosing spondylitis, to better reflect current JIA classification than codes for RA alone. Researchers will be able to use these results to define cohorts for population-based studies.

  1. Reliability and validity of an audio signal modified shuttle walk test.

    PubMed

    Singla, Rupak; Rai, Richa; Faye, Abhishek Anil; Jain, Anil Kumar; Chowdhury, Ranadip; Bandyopadhyay, Debdutta

    2017-01-01

    The audio signal in the conventionally accepted protocol of shuttle walk test (SWT) is not well-understood by the patients and modification of the audio signal may improve the performance of the test. The aim of this study is to study the validity and reliability of an audio signal modified SWT, called the Singla-Richa modified SWT (SWTSR), in healthy normal adults. In SWTSR, the audio signal was modified with the addition of reverse counting to it. A total of 54 healthy normal adults underwent conventional SWT (CSWT) at one instance and two times SWTSRon the same day. The validity was assessed by comparing outcomes of the SWTSRto outcomes of CSWT using the Pearson correlation coefficient and Bland-Altman plot. Test-retest reliability of SWTSRwas assessed using the intraclass correlation coefficient (ICC). The acceptability of the modified test in comparison to the conventional test was assessed using Likert scale. The distance walked (mean ± standard deviation) in the CSWT and SWTSRtest was 853.33 ± 217.33 m and 857.22 ± 219.56 m, respectively (Pearson correlation coefficient - 0.98; P < 0.001) indicating SWTSRto be a valid test. The SWTSRwas found to be a reliable test with ICC of 0.98 (95% confidence interval: 0.97-0.99). The acceptability of SWTSRwas significantly higher than CSWT. The SWTSRwith modified audio signal with reverse counting is a reliable as well as a valid test when compared with CSWT in healthy normal adults. It better understood by subjects compared to CSWT.

  2. Validation of a research case definition of Gulf War illness in the 1991 US military population.

    PubMed

    Iannacchione, Vincent G; Dever, Jill A; Bann, Carla M; Considine, Kathleen A; Creel, Darryl; Carson, Christopher P; Best, Heather; Haley, Robert W

    2011-01-01

    A case definition of Gulf War illness with 3 primary variants, previously developed by factor analysis of symptoms in a US Navy construction battalion and validated in clinic veterans, identified ill veterans with objective abnormalities of brain function. This study tests prestated hypotheses of its external validity. A stratified probability sample (n = 8,020), selected from a sampling frame of the 3.5 million Gulf War era US military veterans, completed a computer-assisted telephone interview survey. Application of the prior factor weights to the subjects' responses generated the case definition. The structural equation model of the case definition fit both random halves of the population sample well (root mean-square error of approximation = 0.015). The overall case definition was 3.87 times (95% confidence interval, 2.61-5.74) more prevalent in the deployed than the deployable nondeployed veterans: 3.33 (1.10-10.10) for syndrome variant 1; 5.11 (2.43-10.75) for variant 2, and 4.25 (2.33-7.74) for variant 3. Functional status on SF-12 was greatly reduced (effect sizes, 1.0-2.0) in veterans meeting the overall and variant case definitions. The factor case definition applies to the full Gulf War veteran population and has good characteristics for research. Copyright © 2011 S. Karger AG, Basel.

  3. Reliability and Validity of the Standing Heel-Rise Test

    ERIC Educational Resources Information Center

    Yocum, Allison; McCoy, Sarah Westcott; Bjornson, Kristie F.; Mullens, Pamela; Burton, Gay Naganuma

    2010-01-01

    A standardized protocol for a pediatric heel-rise test was developed and reliability and validity are reported. Fifty-seven children developing typically (CDT) and 34 children with plantar flexion weakness performed three tests: unilateral heel rise, vertical jump, and force measurement using handheld dynamometry. Intraclass correlation…

  4. Cross-Cultural Validation of TEMAS, a Minority Projective Test.

    ERIC Educational Resources Information Center

    Costantino, Giuseppe; And Others

    The theoretical framework and cross-cultural validation of Tell-Me-A-Story (TEMAS), a projective test developed to measure personality development in ethnic minority children, is presented. The TEMAS test consists of 23 chromatic pictures which incorporate the following characteristics: (1) representation of antithetical concepts which the…

  5. Validation of Milliflex® Quantum for Bioburden Testing of Pharmaceutical Products.

    PubMed

    Gordon, Oliver; Goverde, Marcel; Staerk, Alexandra; Roesti, David

    2017-01-01

    This article reports the validation strategy used to demonstrate that the Milliflex ® Quantum yielded non-inferior results to the traditional bioburden method. It was validated according to USP <1223>, European Pharmacopoeia 5.1.6, and Parenteral Drug Association Technical Report No. 33 and comprised the validation parameters robustness, ruggedness, repeatability, specificity, limit of detection and quantification, accuracy, precision, linearity, range, and equivalence in routine operation. For the validation, a combination of pharmacopeial ATCC strains as well as a broad selection of in-house isolates were used. In-house isolates were used in stressed state. Results were statistically evaluated regarding the pharmacopeial acceptance criterion of ≥70% recovery compared to the traditional method. Post-hoc test power calculations verified the appropriateness of the used sample size to detect such a difference. Furthermore, equivalence tests verified non-inferiority of the rapid method as compared to the traditional method. In conclusion, the rapid bioburden on basis of the Milliflex ® Quantum was successfully validated as alternative method to the traditional bioburden test. LAY ABSTRACT: Pharmaceutical drug products must fulfill specified quality criteria regarding their microbial content in order to ensure patient safety. Drugs that are delivered into the body via injection, infusion, or implantation must be sterile (i.e., devoid of living microorganisms). Bioburden testing measures the levels of microbes present in the bulk solution of a drug before sterilization, and thus it provides important information for manufacturing a safe product. In general, bioburden testing has to be performed using the methods described in the pharmacopoeias (membrane filtration or plate count). These methods are well established and validated regarding their effectiveness; however, the incubation time required to visually identify microbial colonies is long. Thus, alternative

  6. Non-Nuclear Validation Test Results of a Closed Brayton Cycle Test-Loop

    NASA Astrophysics Data System (ADS)

    Wright, Steven A.

    2007-01-01

    Both NASA and DOE have programs that are investigating advanced power conversion cycles for planetary surface power on the moon or Mars, or for next generation nuclear power plants on earth. Although open Brayton cycles are in use for many applications (combined cycle power plants, aircraft engines), only a few closed Brayton cycles have been tested. Experience with closed Brayton cycles coupled to nuclear reactors is even more limited and current projections of Brayton cycle performance are based on analytic models. This report describes and compares experimental results with model predictions from a series of non-nuclear tests using a small scale closed loop Brayton cycle available at Sandia National Laboratories. A substantial amount of testing has been performed, and the information is being used to help validate models. In this report we summarize the results from three kinds of tests. These tests include: 1) test results that are useful for validating the characteristic flow curves of the turbomachinery for various gases ranging from ideal gases (Ar or Ar/He) to non-ideal gases such as CO2, 2) test results that represent shut down transients and decay heat removal capability of Brayton loops after reactor shut down, and 3) tests that map a range of operating power versus shaft speed curve and turbine inlet temperature that are useful for predicting stable operating conditions during both normal and off-normal operating behavior. These tests reveal significant interactions between the reactor and balance of plant. Specifically these results predict limited speed up behavior of the turbomachinery caused by loss of load, the conditions for stable operation, and for direct cooled reactors, the tests reveal that the coast down behavior during loss of power events can extend for hours provided the ultimate heat sink remains available.

  7. Concurrent validity and clinical usefulness of several individually administered tests of children's social-emotional cognition.

    PubMed

    McKown, Clark

    2007-03-01

    In this study, the validity of 5 tests of children's social-emotional cognition, defined as their encoding, memory, and interpretation of social information, was tested. Participants were 126 clinic-referred children between the ages of 5 and 17. All 5 tests were evaluated in terms of their (a) concurrent validity, (b) incremental validity, and (c) clinical usefulness in predicting social functioning. Tests included measures of nonverbal sensitivity, social language, and social problem solving. Criterion measures included parent and teacher report of social functioning. Analyses support the concurrent validity of all measures, and the incremental validity and clinical usefulness of tests of pragmatic language and problem solving.

  8. Reliability and validity of two isometric squat tests.

    PubMed

    Blazevich, Anthony J; Gill, Nicholas; Newton, Robert U

    2002-05-01

    The purpose of the present study was first to examine the reliability of isometric squat (IS) and isometric forward hack squat (IFHS) tests to determine if repeated measures on the same subjects yielded reliable results. The second purpose was to examine the relation between isometric and dynamic measures of strength to assess validity. Fourteen male subjects performed maximal IS and IFHS tests on 2 occasions and 1 repetition maximum (1-RM) free-weight squat and forward hack squat (FHS) tests on 1 occasion. The 2 tests were found to be highly reliable (intraclass correlation coefficient [ICC](IS) = 0.97 and ICC(IFHS) = 1.00). There was a strong relation between average IS and 1-RM squat performance, and between IFHS and 1-RM FHS performance (r(squat) = 0.77, r(FHS) = 0.76; p < 0.01), but a weak relation between squat and FHS test performances (r < 0.55). There was also no difference between observed 1-RM values and those predicted by our regression equations. Errors in predicting 1-RM performance were in the order of 8.5% (standard error of the estimate [SEE] = 13.8 kg) and 7.3% (SEE = 19.4 kg) for IS and IFHS respectively. Correlations between isometric and 1-RM tests were not of sufficient size to indicate high validity of the isometric tests. Together the results suggest that IS and IFHS tests could detect small differences in multijoint isometric strength between subjects, or performance changes over time, and that the scores in the isometric tests are well related to 1-RM performance. However, there was a small error when predicting 1-RM performance from isometric performance, and these tests have not been shown to discriminate between small changes in dynamic strength. The weak relation between squat and FHS test performance can be attributed to differences in the movement patterns of the tests

  9. Successful MPPF Pneumatics Verification and Validation Testing

    NASA Image and Video Library

    2017-03-28

    Engineers and technicians completed verification and validation testing of several pneumatic systems inside and outside the Multi-Payload Processing Facility (MPPF) at NASA's Kennedy Space Center in Florida. In view is the service platform for Orion spacecraft processing. The MPPF will be used for offline processing and fueling of the Orion spacecraft and service module stack before launch. Orion also will be de-serviced in the MPPF after a mission. The Ground Systems Development and Operations Program (GSDO) is overseeing upgrades to the facility. The Engineering Directorate led the recent pneumatic tests.

  10. Meta-Analysis of Integrity Tests: A Critical Examination of Validity Generalization and Moderator Variables

    DTIC Science & Technology

    1992-06-01

    AVA LABLLTY OF PEPOR’ 2b DECLASSfFiCATION DOWNGRADING SCHEDULE UnI imiited 4 PERFORMING ORGANZAT ON REPORT NUMBER(S) 5 MON’TORzNG ORGA% ZA C% RPEOR...8217 " S 92- 1 6a NAME OF PERFORMING ORGANIZATION 6b OFFPCE SYMBOL 7a NAME OF V0’O0R ’C OCGAz) ZA- %I University of Iowa (Ifappicable) Defense Personnel...data points. Results indicate that integrity test validities are positive and in many cases substantial for predicting both job performance and

  11. Applying Independent Verification and Validation to Automatic Test Equipment

    NASA Technical Reports Server (NTRS)

    Calhoun, Cynthia C.

    1997-01-01

    This paper describes a general overview of applying Independent Verification and Validation (IV&V) to Automatic Test Equipment (ATE). The overview is not inclusive of all IV&V activities that can occur or of all development and maintenance items that can be validated and verified, during the IV&V process. A sampling of possible IV&V activities that can occur within each phase of the ATE life cycle are described.

  12. The Michigan Alcoholism Screening Test (MAST): A Statistical Validation Analysis

    ERIC Educational Resources Information Center

    Laux, John M.; Newman, Isadore; Brown, Russ

    2004-01-01

    This study extends the Michigan Alcoholism Screening Test (MAST; M. L. Selzer, 1971) literature base by examining 4 issues related to the validity of the MAST scores. Specifically, the authors examine the validity of the MAST scores in light of the presence of impression management, participant demographic variables, and item endorsement…

  13. Validation of Linguistic and Communicative Oral Language Tests for Spanish-English Bilingual Programs.

    ERIC Educational Resources Information Center

    Politzer, Robert L.; And Others

    1983-01-01

    The development, administration, and scoring of a communicative test and its validation with tests of linguistic and sociolinguistic competence in English and Spanish are reported. Correlation with measures of home language use and school achievement are also presented, and issues of test validation for bilingual programs are discussed. (MSE)

  14. Analytical validation of a psychiatric pharmacogenomic test.

    PubMed

    Jablonski, Michael R; King, Nina; Wang, Yongbao; Winner, Joel G; Watterson, Lucas R; Gunselman, Sandra; Dechairo, Bryan M

    2018-05-01

    The aim of this study was to validate the analytical performance of a combinatorial pharmacogenomics test designed to aid in the appropriate medication selection for neuropsychiatric conditions. Genomic DNA was isolated from buccal swabs. Twelve genes (65 variants/alleles) associated with psychotropic medication metabolism, side effects, and mechanisms of actions were evaluated by bead array, MALDI-TOF mass spectrometry, and/or capillary electrophoresis methods (GeneSight Psychotropic, Assurex Health, Inc.). The combinatorial pharmacogenomics test has a dynamic range of 2.5-20 ng/μl of input genomic DNA, with comparable performance for all assays included in the test. Both the precision and accuracy of the test were >99.9%, with individual gene components between 99.4 and 100%. This study demonstrates that the combinatorial pharmacogenomics test is robust and reproducible, making it suitable for clinical use.

  15. Finite Element Vibration Modeling and Experimental Validation for an Aircraft Engine Casing

    NASA Astrophysics Data System (ADS)

    Rabbitt, Christopher

    This thesis presents a procedure for the development and validation of a theoretical vibration model, applies this procedure to a pair of aircraft engine casings, and compares select parameters from experimental testing of those casings to those from a theoretical model using the Modal Assurance Criterion (MAC) and linear regression coefficients. A novel method of determining the optimal MAC between axisymmetric results is developed and employed. It is concluded that the dynamic finite element models developed as part of this research are fully capable of modelling the modal parameters within the frequency range of interest. Confidence intervals calculated in this research for correlation coefficients provide important information regarding the reliability of predictions, and it is recommended that these intervals be calculated for all comparable coefficients. The procedure outlined for aligning mode shapes around an axis of symmetry proved useful, and the results are promising for the development of further optimization techniques.

  16. Development and validation of a melanoma risk score based on pooled data from 16 case-control studies

    PubMed Central

    Davies, John R; Chang, Yu-mei; Bishop, D Timothy; Armstrong, Bruce K; Bataille, Veronique; Bergman, Wilma; Berwick, Marianne; Bracci, Paige M; Elwood, J Mark; Ernstoff, Marc S; Green, Adele; Gruis, Nelleke A; Holly, Elizabeth A; Ingvar, Christian; Kanetsky, Peter A; Karagas, Margaret R; Lee, Tim K; Le Marchand, Loïc; Mackie, Rona M; Olsson, Håkan; Østerlind, Anne; Rebbeck, Timothy R; Reich, Kristian; Sasieni, Peter; Siskind, Victor; Swerdlow, Anthony J; Titus, Linda; Zens, Michael S; Ziegler, Andreas; Gallagher, Richard P.; Barrett, Jennifer H; Newton-Bishop, Julia

    2015-01-01

    Background We report the development of a cutaneous melanoma risk algorithm based upon 7 factors; hair colour, skin type, family history, freckling, nevus count, number of large nevi and history of sunburn, intended to form the basis of a self-assessment webtool for the general public. Methods Predicted odds of melanoma were estimated by analysing a pooled dataset from 16 case-control studies using logistic random coefficients models. Risk categories were defined based on the distribution of the predicted odds in the controls from these studies. Imputation was used to estimate missing data in the pooled datasets. The 30th, 60th and 90th centiles were used to distribute individuals into four risk groups for their age, sex and geographic location. Cross-validation was used to test the robustness of the thresholds for each group by leaving out each study one by one. Performance of the model was assessed in an independent UK case-control study dataset. Results Cross-validation confirmed the robustness of the threshold estimates. Cases and controls were well discriminated in the independent dataset (area under the curve 0.75, 95% CI 0.73-0.78). 29% of cases were in the highest risk group compared with 7% of controls, and 43% of controls were in the lowest risk group compared with 13% of cases. Conclusion We have identified a composite score representing an estimate of relative risk and successfully validated this score in an independent dataset. Impact This score may be a useful tool to inform members of the public about their melanoma risk. PMID:25713022

  17. Measurement of Dietary Restraint: Validity Tests of Four Questionnaires

    PubMed Central

    Williamson, Donald A.; Martin, Corby K.; York-Crowe, Emily; Anton, Stephen D.; Redman, Leanne M.; Han, Hongmei; Ravussin, Eric

    2007-01-01

    This study tested the validity of four measures of dietary restraint: Dutch Eating Behavior Questionnaire, Eating Inventory (EI), Revised Restraint Scale (RS), and the Current Dieting Questionnaire. Dietary restraint has been implicated as a determinant of overeating and binge eating. Conflicting findings have been attributed to different methods for measuring dietary restraint. The validity of four self-report measures of dietary restraint and dieting behavior was tested using: 1) factor analysis, 2) changes in dietary restraint in a randomized controlled trial of different methods to achieve calorie restriction, and 3) correlation of changes in dietary restraint with an objective measure of energy balance, calculated from the changes in fat mass and fat-free mass over a six-month dietary intervention. Scores from all four questionnaires, measured at baseline, formed a dietary restraint factor, but the RS also loaded on a binge eating factor. Based on change scores, the EI Restraint scale was the only measure that correlated significantly with energy balance expressed as a percentage of energy require d for weight maintenance. These findings suggest that that, of the four questionnaires tested, the EI Restraint scale was the most valid measure of the intent to diet and actual caloric restriction. PMID:17101191

  18. Intratester Reliability and Construct Validity of a Hip Abductor Eccentric Strength Test.

    PubMed

    Brindle, Richard A; Ebaugh, David; Milner, Clare E

    2018-06-06

    Side-lying hip abductor strength tests are commonly used to evaluate muscle strength. In a "break" test, the tester applies sufficient force to lower the limb to the table while the patient resists. The peak force is postulated to occur while the leg is lowering, thus representing the participant's eccentric muscle strength. However, it is unclear whether peak force occurs before or after the leg begins to lower. To determine intrarater reliability and construct validity of a hip abductor eccentric strength test. Intrarater reliability and construct validity study. Twenty healthy adults (26 [6] y; 1.66 [0.06] m; 62.2 [8.0] kg) made 2 visits to the laboratory at least 1 week apart. During the hip abductor eccentric strength test, a handheld dynamometer recorded peak force and time to peak force, and limb position was recorded via a motion capture system. Intrarater reliability was determined using intraclass correlation, SEM, and minimal detectable difference. Construct validity was assessed by determining if peak force occurred after the start of the lowering phase using a 1-sample t test. The hip abductor eccentric strength test had substantial intrarater reliability (intraclass correlation (3,3)  = .88; 95% confidence interval, .65-.95), SEM of 0.9 %BWh, and a minimal detectable difference of 2.5 %BWh. Construct validity was established as peak force occurred 2.1 (0.6) seconds (range: 0.7-3.7 s) after the start of the lowering phase of the test (P ≤ .001). The hip abductor eccentric strength test is a valid and reliable measure of eccentric muscle strength. This test may be used clinically to assess changes in eccentric muscle strength over time.

  19. The Predictive Validity of the Metropolitan Readiness Tests, 1976 Edition.

    ERIC Educational Resources Information Center

    Nagle, Richard J.

    1979-01-01

    A sample of 176 first-grade children was tested on the Metropolitan Readiness Tests, 1976 Edition (MRT), during the initial month of school and was retested eight months later on the Stanford Achievement Test. Results demonstrated substantial validity of the MRT for predicting first-grade achievement. (Author/CTM)

  20. Supersonic Retropropulsion CFD Validation with Ames Unitary Plan Wind Tunnel Test Data

    NASA Technical Reports Server (NTRS)

    Schauerhamer, Daniel G.; Zarchi, Kerry A.; Kleb, William L.; Edquist, Karl T.

    2013-01-01

    A validation study of Computational Fluid Dynamics (CFD) for Supersonic Retropropulsion (SRP) was conducted using three Navier-Stokes flow solvers (DPLR, FUN3D, and OVERFLOW). The study compared results from the CFD codes to each other and also to wind tunnel test data obtained in the NASA Ames Research Center 90 70 Unitary PlanWind Tunnel. Comparisons include surface pressure coefficient as well as unsteady plume effects, and cover a range of Mach numbers, levels of thrust, and angles of orientation. The comparisons show promising capability of CFD to simulate SRP, and best agreement with the tunnel data exists for the steadier cases of the 1-nozzle and high thrust 3-nozzle configurations.

  1. Validation testing of a soil macronutrient sensing system

    USDA-ARS?s Scientific Manuscript database

    Rapid on-site measurements of soil macronutrients (i.e., nitrogen, phosphorus, and potassium) are needed for site-specific crop management, where fertilizer nutrient application rates are adjusted spatially based on local requirements. This study reports on validation testing of a previously develop...

  2. Validation of biological activity testing procedure of recombinant human interleukin-7.

    PubMed

    Lutsenko, T N; Kovalenko, M V; Galkin, O Yu

    2017-01-01

    Validation procedure for method of monitoring the biological activity of reсombinant human interleukin-7 has been developed and conducted according to the requirements of national and international recommendations. This method is based on the ability of recombinant human interleukin-7 to induce proliferation of T lymphocytes. It has been shown that to control the biological activity of recombinant human interleukin-7 peripheral blood mononuclear cells (PBMCs) derived from blood or cell lines can be used. Validation charac­teristics that should be determined depend on the method, type of product or object test/measurement and biological test systems used in research. The validation procedure for the method of control of biological activity of recombinant human interleukin-7 in peripheral blood mononuclear cells showed satisfactory results on all parameters tested such as specificity, accuracy, precision and linearity.

  3. Federal COBOL Compiler Testing Service Compiler Validation Request Information.

    DTIC Science & Technology

    1977-05-09

    background of the Federal COBOL Compiler Testing Service which was set up by a memorandum of agreement between the National Bureau of Standards and the...Federal Standard, and the requirement of COBOL compiler validation in the procurement process. It also contains a list of all software products...produced by the software Development Division in support of the FCCTS as well as the Validation Summary Reports produced as a result of discharging the

  4. Validity of the Eating Attitudes Test and the Eating Disorders Inventory in Bulimia Nervosa.

    ERIC Educational Resources Information Center

    Gross, Janet; And Others

    1986-01-01

    Assessed criterion and concurrent validity of the Eating Attitudes Test and the Eating Disorder Inventory in 82 women with bulimia nervosa. Both tests demonstrated criterion validity by discriminating bulimia nervosa subjects from normals. Only weak support was found for concurrent validity within bulimia subjects. Recommends combination of…

  5. Impact on Participation and Autonomy: Test of Validity and Reliability for Older Persons.

    PubMed

    Hammar, Isabelle Ottenvall; Ekelund, Christina; Wilhelmson, Katarina; Eklund, Kajsa

    2014-11-06

    In research and healthcare it is important to measure older persons' self-determination in order to improve their possibilities to decide for themselves in daily life. The questionnaire Impact on Participation and Autonomy (IPA) assesses self-determination, but is not constructed for older persons. The aim of this study was to examine the validity and reliability of the IPA-S questionnaire for persons aged 70 years and older. The study was performed in two steps; first a validity test of the Swedish version of the questionnaire, IPA-S, followed by a reliability test-retest of an adjusted version. The validity was tested with focus groups and individual interviews on persons aged 77-88 years, and the reliability on persons aged 70-99 years. The validity test result showed that IPA-S is valid for older persons but it was too extensive and the phrasing of the items needed adjustments. The reliability test-retest on the adjusted questionnaire, IPA- Older persons (IPA-O), showed that 15 of 22 items had high agreement. IPA-O can be used to measure older persons' self-determination in their care and rehabilitation.

  6. Validity and the Consequences of Test Interpretation and Use

    ERIC Educational Resources Information Center

    Hubley, Anita M.; Zumbo, Bruno D.

    2011-01-01

    The vast majority of measures have, at their core, a purpose of personal and social change. If test developers and users want measures to have personal and social consequences and impact, then it is critical to consider the consequences and side effects of measurement in the validation process itself. The consequential basis of test interpretation…

  7. An entropy-based nonparametric test for the validation of surrogate endpoints.

    PubMed

    Miao, Xiaopeng; Wang, Yong-Cheng; Gangopadhyay, Ashis

    2012-06-30

    We present a nonparametric test to validate surrogate endpoints based on measure of divergence and random permutation. This test is a proposal to directly verify the Prentice statistical definition of surrogacy. The test does not impose distributional assumptions on the endpoints, and it is robust to model misspecification. Our simulation study shows that the proposed nonparametric test outperforms the practical test of the Prentice criterion in terms of both robustness of size and power. We also evaluate the performance of three leading methods that attempt to quantify the effect of surrogate endpoints. The proposed method is applied to validate magnetic resonance imaging lesions as the surrogate endpoint for clinical relapses in a multiple sclerosis trial. Copyright © 2012 John Wiley & Sons, Ltd.

  8. Validation of the German version of the Ford Insomnia Response to Stress Test.

    PubMed

    Dieck, Arne; Helbig, Susanne; Drake, Christopher L; Backhaus, Jutta

    2018-06-01

    The purpose of this study was to assess the psychometric properties of a German version of the Ford Insomnia Response to Stress Test with groups with and without sleep problems. Three studies were analysed. Data set 1 was based on an initial screening for a sleep training program (n = 393), data set 2 was based on a study to test the test-retest reliability of the Ford Insomnia Response to Stress Test (n = 284) and data set 3 was based on a study to examine the influence of competitive sport on sleep (n = 37). Data sets 1 and 2 were used to test internal consistency, factor structure, convergent validity, discriminant validity and test-retest reliability of the Ford Insomnia Response to Stress Test. Content validity was tested using data set 3. Cronbach's alpha of the Ford Insomnia Response to Stress Test was good (α = 0.80) and test-retest reliability was satisfactory (r = 0.72). Overall, the one-factor model showed the best fit. Furthermore, significant positive correlations between the Ford Insomnia Response to Stress Test and impaired sleep quality, depression and stress reactivity were in line with the expectations regarding the convergent validity. Subjects with sleep problems had significantly higher scores in the Ford Insomnia Response to Stress Test than subjects without sleep problems (P < 0.01). Competitive athletes with higher scores in the Ford Insomnia Response to Stress Test had significantly lower sleep quality (P = 0.01), demonstrating that vulnerability for stress-induced sleep disturbances accompanies poorer sleep quality in stressful episodes. The findings show that the German version of the Ford Insomnia Response to Stress Test is a reliable and valid questionnaire to assess the vulnerability to stress-induced sleep disturbances. © 2017 European Sleep Research Society.

  9. Validity of a novel computerized screening test system for mild cognitive impairment.

    PubMed

    Park, Jin-Hyuck; Jung, Minye; Kim, Jongbae; Park, Hae Yean; Kim, Jung-Ran; Park, Ji-Hyuk

    2018-06-20

    ABSTRACTBackground:The mobile screening test system for screening mild cognitive impairment (mSTS-MCI) was developed for clinical use. However, the clinical usefulness of mSTS-MCI to detect elderly with MCI from those who are cognitively healthy has yet to be validated. Moreover, the comparability between this system and traditional screening tests for MCI has not been evaluated. The purpose of this study was to examine the validity and reliability of the mSTS-MCI and confirm the cut-off scores to detect MCI. The data were collected from 107 healthy elderly people and 74 elderly people with MCI. Concurrent validity was examined using the Korean version of Montreal Cognitive Assessment (MoCA-K) as a gold standard test, and test-retest reliability was investigated using 30 of the study participants at four-week intervals. The sensitivity, specificity, positive predictive value, and negative predictive value (NPV) were confirmed through Receiver Operating Characteristic (ROC) analysis, and the cut-off scores for elderly people with MCI were identified. Concurrent validity showed statistically significant correlations between the mSTS-MCI and MoCA-K and test-rests reliability indicated high correlation. As a result of screening predictability, the mSTS-MCI had a higher NPV than the MoCA-K. The mSTS-MCI was identified as a system with a high degree of validity and reliability. In addition, the mSTS-MCI showed high screening predictability, indicating it can be used in the clinical field as a screening test system for mild cognitive impairment.

  10. Validity of Selected Lab and Field Tests of Physical Working Capacity.

    ERIC Educational Resources Information Center

    Burke, Edmund J.

    The validity of selected lab and field tests of physical working capacity was investigated. Forty-four male college students were administered a series of lab and field tests of physical working capacity. Lab tests include a test of maximum oxygen uptake, the PWC 170 test, the Harvard Step Test, the Progressive Pulse Ratio Test, Margaria Test of…

  11. Successful MPPF Pneumatics Verification and Validation Testing

    NASA Image and Video Library

    2017-03-28

    Engineers and technicians completed verification and validation testing of several pneumatic systems inside and outside the Multi-Payload Processing Facility (MPPF) at NASA's Kennedy Space Center in Florida. In view is the top level of the service platform for Orion spacecraft processing. The MPPF will be used for offline processing and fueling of the Orion spacecraft and service module stack before launch. Orion also will be de-serviced in the MPPF after a mission. The Ground Systems Development and Operations Program (GSDO) is overseeing upgrades to the facility. The Engineering Directorate led the recent pneumatic tests.

  12. Successful MPPF Pneumatics Verification and Validation Testing

    NASA Image and Video Library

    2017-03-28

    Engineers and technicians completed verification and validation testing of several pneumatic systems inside and outside the Multi-Payload Processing Facility (MPPF) at NASA's Kennedy Space Center in Florida. In view is the service platform for Orion spacecraft processing. To the left are several pneumatic panels. The MPPF will be used for offline processing and fueling of the Orion spacecraft and service module stack before launch. Orion also will be de-serviced in the MPPF after a mission. The Ground Systems Development and Operations Program (GSDO) is overseeing upgrades to the facility. The Engineering Directorate led the recent pneumatic tests.

  13. Development and validation of an administrative case definition for inflammatory bowel diseases

    PubMed Central

    Rezaie, Ali; Quan, Hude; Fedorak, Richard N; Panaccione, Remo; Hilsden, Robert J

    2012-01-01

    BACKGROUND: A population-based database of inflammatory bowel disease (IBD) patients is invaluable to explore and monitor the epidemiology and outcome of the disease. In this context, an accurate and validated population-based case definition for IBD becomes critical for researchers and health care providers. METHODS: IBD and non-IBD individuals were identified through an endoscopy database in a western Canadian health region (Calgary Health Region, Calgary, Alberta). Subsequently, using a novel algorithm, a series of case definitions were developed to capture IBD cases in the administrative databases. In the second stage of the study, the criteria were validated in the Capital Health Region (Edmonton, Alberta). RESULTS: A total of 150 IBD case definitions were developed using 1399 IBD patients and 15,439 controls in the development phase. In the validation phase, 318,382 endoscopic procedures were searched and 5201 IBD patients were identified. After consideration of sensitivity, specificity and temporal stability of each validated case definition, a diagnosis of IBD was assigned to individuals who experienced at least two hospitalizations or had four physician claims, or two medical contacts in the Ambulatory Care Classification System database with an IBD diagnostic code within a two-year period (specificity 99.8%; sensitivity 83.4%; positive predictive value 97.4%; negative predictive value 98.5%). An alternative case definition was developed for regions without access to the Ambulatory Care Classification System database. A novel scoring system was developed that detected Crohn disease and ulcerative colitis patients with a specificity of >99% and a sensitivity of 99.1% and 86.3%, respectively. CONCLUSION: Through a robust methodology, a reproducible set of criteria to capture IBD patients through administrative databases was developed. The methodology may be used to develop similar administrative definitions for chronic diseases. PMID:23061064

  14. Validation of Global EO Biophysical Products at JECAM Test Site in Ukraine

    NASA Astrophysics Data System (ADS)

    Skakun, Sergii; Kussul, Nataliia; Kravchenko, Oleksiy; Basarab, Ruslan; Ostapenko, Vadym; Yailymov, Bohdan; Shelestov, Andrii; Kolotii, Andrii; Mironov, Andrii

    Efficient global agriculture monitoring requires appropriate validation of Earth observation (EO) products for different regions and cropping system. This problem is addressed within the Joint Experiment of Crop Assessment and Monitoring (JECAM) initiative which aims to develop monitoring and reporting protocols and best practices for a variety of global agricultural systems. Ukraine is actively involved into JECAM, and a JECAM Ukraine test site was officially established in 2011. The following problems are being solved within JECAM Ukraine: (i) crop identification and crop area estimation [1]; (ii) crop yield forecasting [2]; (iii) EO products validation. The following case study regions were selected for these purposes: (i) the whole Kyiv oblast (28,000 sq. km) indented for crop mapping and acreage estimation; (ii) intensive observation sub-site in Pshenichne which is a research farm from the National University of Life and Environmental Sciences of Ukraine and indented for crop biophysical parameters estimation; (iii) Lviv region for rape-seed identification and crop rotation control; (iv) Crimea region for crop damage assessment due to droughts, and illegial field detection. In 2013, Ukrainian JECAM test site was selected as one of the “Champion User” for the ESA Sentinel-2 for Agriculture project. The test site was observed with SPOT-4 and RapidEye satellites every 5 days. The collected images are then used to simulate Sentinel-2 images for agriculture purposes. JECAM Ukraine is responsible for collecting ground observation data for validation purposes, and is involved in providing user requirements for Sentinel-2 agriculture related products. In particular, three field campaigns to characterize the vegetation biophysical parameters at the Pshenichne test site were carried out: First campaign - 14th to 17th of May 2013; second campaign - 12th to 15th of June 2013; third campaign - 14th to 17th of July 2013. Digital Hemispheric Photographs (DHP) images were

  15. Validation of the Seating and Mobility Script Concordance Test

    ERIC Educational Resources Information Center

    Cohen, Laura J.; Fitzgerald, Shirley G.; Lane, Suzanne; Boninger, Michael L.; Minkel, Jean; McCue, Michael

    2009-01-01

    The purpose of this study was to develop the scoring system for the Seating and Mobility Script Concordance Test (SMSCT), obtain and appraise internal and external structure evidence, and assess the validity of the SMSCT. The SMSCT purpose is to provide a method for testing knowledge of seating and mobility prescription. A sample of 106 therapists…

  16. Validation in Support of Internationally Harmonised OECD Test Guidelines for Assessing the Safety of Chemicals.

    PubMed

    Gourmelon, Anne; Delrue, Nathalie

    Ten years elapsed since the OECD published the Guidance document on the validation and international regulatory acceptance of test methods for hazard assessment. Much experience has been gained since then in validation centres, in countries and at the OECD on a variety of test methods that were subjected to validation studies. This chapter reviews validation principles and highlights common features that appear to be important for further regulatory acceptance across studies. Existing OECD-agreed validation principles will most likely generally remain relevant and applicable to address challenges associated with the validation of future test methods. Some adaptations may be needed to take into account the level of technique introduced in test systems, but demonstration of relevance and reliability will continue to play a central role as pre-requisite for the regulatory acceptance. Demonstration of relevance will become more challenging for test methods that form part of a set of predictive tools and methods, and that do not stand alone. OECD is keen on ensuring that while these concepts evolve, countries can continue to rely on valid methods and harmonised approaches for an efficient testing and assessment of chemicals.

  17. Validity and test-retest reliability of the six-spot step test in persons after stroke.

    PubMed

    Arvidsson Lindvall, Mialinn; Anderzén-Carlsson, Agneta; Appelros, Peter; Forsberg, Anette

    2018-06-06

    After stroke, asymmetric weight distribution is common with decreased balance control in standing and walking. The six-spot step test (SSST) includes a 5-m walk during which one leg shoves wooden blocks out of circles marked on the floor, thus assessing the ability to take load on each leg. The aim of the present study was to investigate the convergent and discriminant validity and test-retest reliability of the SSST in persons with stroke. Eighty-one participants were included. A cross-sectional study was performed, in which the SSST was conducted twice, 3-7 days apart. Validity was investigated using measures of dynamic balance and walking. Reliability was assessed using intraclass correlation coefficient, standard error of the measurement (SEM), and smallest real difference (SRD). The convergent validity was strong to moderate, and the test-retest reliability was good. The SEM% was 14.7%, and the SRD% was 40.8% based on the mean of four walks shoving twice with the paretic and twice with the non-paretic leg. Values on random measurement error were high affecting the use of the SSST for follow-up evaluations but the SSST can be a complementary measure of gait and balance.

  18. [Comparison of the Wechsler Memory Scale-III and the Spain-Complutense Verbal Learning Test in acquired brain injury: construct validity and ecological validity].

    PubMed

    Luna-Lario, P; Pena, J; Ojeda, N

    2017-04-16

    To perform an in-depth examination of the construct validity and the ecological validity of the Wechsler Memory Scale-III (WMS-III) and the Spain-Complutense Verbal Learning Test (TAVEC). The sample consists of 106 adults with acquired brain injury who were treated in the Area of Neuropsychology and Neuropsychiatry of the Complejo Hospitalario de Navarra and displayed memory deficit as the main sequela, measured by means of specific memory tests. The construct validity is determined by examining the tasks required in each test over the basic theoretical models, comparing the performance according to the parameters offered by the tests, contrasting the severity indices of each test and analysing their convergence. The external validity is explored through the correlation between the tests and by using regression models. According to the results obtained, both the WMS-III and the TAVEC have construct validity. The TAVEC is more sensitive and captures not only the deficits in mnemonic consolidation, but also in the executive functions involved in memory. The working memory index of the WMS-III is useful for predicting the return to work at two years after the acquired brain injury, but none of the instruments anticipates the disability and dependence at least six months after the injury. We reflect upon the construct validity of the tests and their insufficient capacity to predict functionality when the sequelae become chronic.

  19. Reliability and factorial validity of flexibility tests for team sports.

    PubMed

    Sporis, Goran; Vucetic, Vlatko; Jovanovic, Mario; Jukic, Igor; Omrcen, Darija

    2011-04-01

    The main goal of this method paper was to evaluate the reliability and factorial validity of flexibility tests used in soccer, and to do crossvalidation study on 2 other team sports using handball and basketball players. The second aim was to compare the validity of the different tests and evaluate the flexibility of soccer players; the third was to determine the positional differences between attackers, defenders, and midfielders in all flexibility tests. One hundred and fifty (n = 150) elite male junior soccer players, members of the First Croatian Junior League Teams, and 60 (n = 60) handball and 60 (n = 60) basketball players also members of the First Croatian Junior League Teams volunteered to participate in the study, tested for the purpose of crossvalidation. The SAR and V-SAR had the greatest AVR and ICC. The within-subjects variation ranged from between 0.3 and 3.8%. The lowest value of CV was found between the LSPL and LSPR. Low to moderate statistically significant correlation coefficients were found among all the measured flexibility tests. It was observed that the greatest correlations existed between the SAR and V-SAR (r = 0.65) and between the LLSR and LLSL (r = 0.56). Statistically significant correlations were also observed between the BLPL and BLPR (r = 0.62). The principal components factor analysis of 9 flexibility tests resulted in the extraction of 3 significant components. The results of this study have the following implications for the assessment of flexibility in soccer: (a) all flexibility tests used in this study have the acceptable between and within-subjects reliability and they can be used to estimate the flexibility of soccer players; (b) the LSPL and LSPR tests are the most reliable and valid flexibility tests for the estimation of flexibility of professional soccer players.

  20. Development and Validation of a Food-Associated Olfactory Test (FAOT).

    PubMed

    Denzer-Lippmann, Melanie Yvonne; Beauchamp, Jonathan; Freiherr, Jessica; Thuerauf, Norbert; Kornhuber, Johannes; Buettner, Andrea

    2017-01-01

    Olfactory tests are an important tool in human nutritional research for studying food preferences, yet comprehensive tests dedicated solely to food odors are currently lacking. Therefore, within this study, an innovative food-associated olfactory test (FAOT) system was developed. The FAOT comprises 16 odorant pens that contain representative food odors relating to different macronutrient classes. The test underwent a sensory validation based on identification rate, intensity, hedonic value, and food association scores. The accuracy of the test was further compared to the accuracy of the established Sniffin' Sticks identification test. The identification rates and intensities of this new FAOT were found to be comparable to the Sniffin' Sticks olfactory identification test. The odorant pens were also assessed chemo-analytically and were found to be chemically stable for at least 24 weeks. Overall, this new identification test for use in assessing olfaction in a food-associated context is valid both in terms of its use in sensory perception studies and its chemical stability. The FOAT is particularly suited to examinations of the sense of smell regarding food odors. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  1. The Validation of a Case-Based, Cumulative Assessment and Progressions Examination

    PubMed Central

    Coker, Adeola O.; Copeland, Jeffrey T.; Gottlieb, Helmut B.; Horlen, Cheryl; Smith, Helen E.; Urteaga, Elizabeth M.; Ramsinghani, Sushma; Zertuche, Alejandra; Maize, David

    2016-01-01

    Objective. To assess content and criterion validity, as well as reliability of an internally developed, case-based, cumulative, high-stakes third-year Annual Student Assessment and Progression Examination (P3 ASAP Exam). Methods. Content validity was assessed through the writing-reviewing process. Criterion validity was assessed by comparing student scores on the P3 ASAP Exam with the nationally validated Pharmacy Curriculum Outcomes Assessment (PCOA). Reliability was assessed with psychometric analysis comparing student performance over four years. Results. The P3 ASAP Exam showed content validity through representation of didactic courses and professional outcomes. Similar scores on the P3 ASAP Exam and PCOA with Pearson correlation coefficient established criterion validity. Consistent student performance using Kuder-Richardson coefficient (KR-20) since 2012 reflected reliability of the examination. Conclusion. Pharmacy schools can implement internally developed, high-stakes, cumulative progression examinations that are valid and reliable using a robust writing-reviewing process and psychometric analyses. PMID:26941435

  2. Validity and power of association testing in family-based sampling designs: evidence for and against the common wisdom.

    PubMed

    Knight, Stacey; Camp, Nicola J

    2011-04-01

    Current common wisdom posits that association analyses using family-based designs have inflated type 1 error rates (if relationships are ignored) and independent controls are more powerful than familial controls. We explore these suppositions. We show theoretically that family-based designs can have deflated type-error rates. Through simulation, we examine the validity and power of family designs for several scenarios: cases from randomly or selectively ascertained pedigrees; and familial or independent controls. Family structures considered are as follows: sibships, nuclear families, moderate-sized and extended pedigrees. Three methods were considered with the χ(2) test for trend: variance correction (VC), weighted (weights assigned to account for genetic similarity), and naïve (ignoring relatedness) as well as the Modified Quasi-likelihood Score (MQLS) test. Selectively ascertained pedigrees had similar levels of disease enrichment; random ascertainment had no such restriction. Data for 1,000 cases and 1,000 controls were created under the null and alternate models. The VC and MQLS methods were always valid. The naïve method was anti-conservative if independent controls were used and valid or conservative in designs with familial controls. The weighted association method was generally valid for independent controls, and was conservative for familial controls. With regard to power, independent controls were more powerful for small-to-moderate selectively ascertained pedigrees, but familial and independent controls were equivalent in the extended pedigrees and familial controls were consistently more powerful for all randomly ascertained pedigrees. These results suggest a more complex situation than previously assumed, which has important implications for study design and analysis. © 2011 Wiley-Liss, Inc.

  3. Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers

    PubMed Central

    Reeves, Todd D.; Marbach-Ad, Gili

    2016-01-01

    Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology—either quantitative or qualitative—on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. PMID:26903498

  4. Meeting report: Validation of toxicogenomics-based test systems: ECVAM-ICCVAM/NICEATM considerations for regulatory use.

    PubMed

    Corvi, Raffaella; Ahr, Hans-Jürgen; Albertini, Silvio; Blakey, David H; Clerici, Libero; Coecke, Sandra; Douglas, George R; Gribaldo, Laura; Groten, John P; Haase, Bernd; Hamernik, Karen; Hartung, Thomas; Inoue, Tohru; Indans, Ian; Maurici, Daniela; Orphanides, George; Rembges, Diana; Sansone, Susanna-Assunta; Snape, Jason R; Toda, Eisaku; Tong, Weida; van Delft, Joost H; Weis, Brenda; Schechtman, Leonard M

    2006-03-01

    This is the report of the first workshop "Validation of Toxicogenomics-Based Test Systems" held 11-12 December 2003 in Ispra, Italy. The workshop was hosted by the European Centre for the Validation of Alternative Methods (ECVAM) and organized jointly by ECVAM, the U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM), and the National Toxicology Program (NTP) Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM). The primary aim of the workshop was for participants to discuss and define principles applicable to the validation of toxicogenomics platforms as well as validation of specific toxicologic test methods that incorporate toxicogenomics technologies. The workshop was viewed as an opportunity for initiating a dialogue between technologic experts, regulators, and the principal validation bodies and for identifying those factors to which the validation process would be applicable. It was felt that to do so now, as the technology is evolving and associated challenges are identified, would be a basis for the future validation of the technology when it reaches the appropriate stage. Because of the complexity of the issue, different aspects of the validation of toxicogenomics-based test methods were covered. The three focus areas include a) biologic validation of toxicogenomics-based test methods for regulatory decision making, b) technical and bioinformatics aspects related to validation, and c) validation issues as they relate to regulatory acceptance and use of toxicogenomics-based test methods. In this report we summarize the discussions and describe in detail the recommendations for future direction and priorities.

  5. Meeting Report: Validation of Toxicogenomics-Based Test Systems: ECVAM–ICCVAM/NICEATM Considerations for Regulatory Use

    PubMed Central

    Corvi, Raffaella; Ahr, Hans-Jürgen; Albertini, Silvio; Blakey, David H.; Clerici, Libero; Coecke, Sandra; Douglas, George R.; Gribaldo, Laura; Groten, John P.; Haase, Bernd; Hamernik, Karen; Hartung, Thomas; Inoue, Tohru; Indans, Ian; Maurici, Daniela; Orphanides, George; Rembges, Diana; Sansone, Susanna-Assunta; Snape, Jason R.; Toda, Eisaku; Tong, Weida; van Delft, Joost H.; Weis, Brenda; Schechtman, Leonard M.

    2006-01-01

    This is the report of the first workshop “Validation of Toxicogenomics-Based Test Systems” held 11–12 December 2003 in Ispra, Italy. The workshop was hosted by the European Centre for the Validation of Alternative Methods (ECVAM) and organized jointly by ECVAM, the U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM), and the National Toxicology Program (NTP) Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM). The primary aim of the workshop was for participants to discuss and define principles applicable to the validation of toxicogenomics platforms as well as validation of specific toxicologic test methods that incorporate toxicogenomics technologies. The workshop was viewed as an opportunity for initiating a dialogue between technologic experts, regulators, and the principal validation bodies and for identifying those factors to which the validation process would be applicable. It was felt that to do so now, as the technology is evolving and associated challenges are identified, would be a basis for the future validation of the technology when it reaches the appropriate stage. Because of the complexity of the issue, different aspects of the validation of toxicogenomics-based test methods were covered. The three focus areas include a) biologic validation of toxicogenomics-based test methods for regulatory decision making, b) technical and bioinformatics aspects related to validation, and c) validation issues as they relate to regulatory acceptance and use of toxicogenomics-based test methods. In this report we summarize the discussions and describe in detail the recommendations for future direction and priorities. PMID:16507466

  6. Application of validity theory and methodology to patient-reported outcome measures (PROMs): building an argument for validity.

    PubMed

    Hawkins, Melanie; Elsworth, Gerald R; Osborne, Richard H

    2018-07-01

    Data from subjective patient-reported outcome measures (PROMs) are now being used in the health sector to make or support decisions about individuals, groups and populations. Contemporary validity theorists define validity not as a statistical property of the test but as the extent to which empirical evidence supports the interpretation of test scores for an intended use. However, validity testing theory and methodology are rarely evident in the PROM validation literature. Application of this theory and methodology would provide structure for comprehensive validation planning to support improved PROM development and sound arguments for the validity of PROM score interpretation and use in each new context. This paper proposes the application of contemporary validity theory and methodology to PROM validity testing. The validity testing principles will be applied to a hypothetical case study with a focus on the interpretation and use of scores from a translated PROM that measures health literacy (the Health Literacy Questionnaire or HLQ). Although robust psychometric properties of a PROM are a pre-condition to its use, a PROM's validity lies in the sound argument that a network of empirical evidence supports the intended interpretation and use of PROM scores for decision making in a particular context. The health sector is yet to apply contemporary theory and methodology to PROM development and validation. The theoretical and methodological processes in this paper are offered as an advancement of the theory and practice of PROM validity testing in the health sector.

  7. Defining surgical criteria for empty nose syndrome: Validation of the office-based cotton test and clinical interpretability of the validated Empty Nose Syndrome 6-Item Questionnaire.

    PubMed

    Thamboo, Andrew; Velasquez, Nathalia; Habib, Al-Rahim R; Zarabanda, David; Paknezhad, Hassan; Nayak, Jayakar V

    2017-08-01

    The validated Empty Nose Syndrome 6-Item Questionnaire (ENS6Q) identifies empty nose syndrome (ENS) patients. The unvalidated cotton test assesses improvement in ENS-related symptoms. By first validating the cotton test using the ENS6Q, we define the minimal clinically important difference (MCID) score for the ENS6Q. Individual case-control study. Fifteen patients diagnosed with ENS and 18 controls with non-ENS sinonasal conditions underwent office cotton placement. Both groups completed ENS6Q testing in three conditions-precotton, cotton in situ, and postcotton-to measure the reproducibility of ENS6Q scoring. Participants also completed a five-item transition scale ranging from "much better" to "much worse" to rate subjective changes in nasal breathing with and without cotton placement. Mean changes for each transition point, and the ENS6Q MCID, were then calculated. In the precotton condition, significant differences (P < .001) in all ENS6Q questions between ENS and controls were noted. With cotton in situ, nearly all prior ENS6Q differences normalized between ENS and control patients. For ENS patients, the changes in the mean differences between the precotton and cotton in situ conditions compared to postcotton versus cotton in situ conditions were insignificant among individuals. Including all 33 participants, the mean change in the ENS6Q between the parameters "a little better" and "about the same" was 4.25 (standard deviation [SD] = 5.79) and -2.00 (SD = 3.70), giving an MCID of 6.25. Cotton testing is a validated office test to assess for ENS patients. Cotton testing also helped to determine the MCID of the ENS6Q, which is a 7-point change from the baseline ENS6Q score. 3b. Laryngoscope, 127:1746-1752, 2017. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.

  8. Validation and Verification (V and V) Testing on Midscale Flame Resistant (FR) Test Method

    DTIC Science & Technology

    2016-12-16

    Method for Evaluation of Flame Resistant Clothing for Protection against Fire Simulations Using an Instrumented Manikin. Validation and...complement (not replace) the capabilities of the ASTM F1930 Standard Test Method for Evaluation of Flame Resistant Clothing for Protection against Fire ...Engineering Center (NSRDEC) to complement the ASTM F1930 Standard Test Method for Evaluation of Flame Resistant Clothing for Protection against Fire

  9. Validity of the American Sign Language Discrimination Test

    ERIC Educational Resources Information Center

    Bochner, Joseph H.; Samar, Vincent J.; Hauser, Peter C.; Garrison, Wayne M.; Searls, J. Matt; Sanders, Cynthia A.

    2016-01-01

    American Sign Language (ASL) is one of the most commonly taught languages in North America. Yet, few assessment instruments for ASL proficiency have been developed, none of which have adequately demonstrated validity. We propose that the American Sign Language Discrimination Test (ASL-DT), a recently developed measure of learners' ability to…

  10. Functional performance testing of the hip in athletes: a systematic review for reliability and validity.

    PubMed

    Kivlan, Benjamin R; Martin, Robroy L

    2012-08-01

    The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. A search of PubMed and SPORTDiscus databases were performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests have not been established on patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. 2b (Systematic Review of Literature).

  11. The script concordance test in radiation oncology: validation study of a new tool to assess clinical reasoning

    PubMed Central

    Lambert, Carole; Gagnon, Robert; Nguyen, David; Charlin, Bernard

    2009-01-01

    Background The Script Concordance test (SCT) is a reliable and valid tool to evaluate clinical reasoning in complex situations where experts' opinions may be divided. Scores reflect the degree of concordance between the performance of examinees and that of a reference panel of experienced physicians. The purpose of this study is to demonstrate SCT's usefulness in radiation oncology. Methods A 90 items radiation oncology SCT was administered to 155 participants. Three levels of experience were tested: medical students (n = 70), radiation oncology residents (n = 38) and radiation oncologists (n = 47). Statistical tests were performed to assess reliability and to document validity. Results After item optimization, the test comprised 30 cases and 70 questions. Cronbach alpha was 0.90. Mean scores were 51.62 (± 8.19) for students, 71.20 (± 9.45) for residents and 76.67 (± 6.14) for radiation oncologists. The difference between the three groups was statistically significant when compared by the Kruskall-Wallis test (p < 0.001). Conclusion The SCT is reliable and useful to discriminate among participants according to their level of experience in radiation oncology. It appears as a useful tool to document the progression of reasoning during residency training. PMID:19203358

  12. Testing for Factorial Invariance in the Context of Construct Validation

    ERIC Educational Resources Information Center

    Dimitrov, Dimiter M.

    2010-01-01

    This article describes the logic and procedures behind testing for factorial invariance across groups in the context of construct validation. The procedures include testing for configural, measurement, and structural invariance in the framework of multiple-group confirmatory factor analysis (CFA). The "forward" (sequential constraint imposition)…

  13. Code-based Diagnostic Algorithms for Idiopathic Pulmonary Fibrosis. Case Validation and Improvement.

    PubMed

    Ley, Brett; Urbania, Thomas; Husson, Gail; Vittinghoff, Eric; Brush, David R; Eisner, Mark D; Iribarren, Carlos; Collard, Harold R

    2017-06-01

    Population-based studies of idiopathic pulmonary fibrosis (IPF) in the United States have been limited by reliance on diagnostic code-based algorithms that lack clinical validation. To validate a well-accepted International Classification of Diseases, Ninth Revision, code-based algorithm for IPF using patient-level information and to develop a modified algorithm for IPF with enhanced predictive value. The traditional IPF algorithm was used to identify potential cases of IPF in the Kaiser Permanente Northern California adult population from 2000 to 2014. Incidence and prevalence were determined overall and by age, sex, and race/ethnicity. A validation subset of cases (n = 150) underwent expert medical record and chest computed tomography review. A modified IPF algorithm was then derived and validated to optimize positive predictive value. From 2000 to 2014, the traditional IPF algorithm identified 2,608 cases among 5,389,627 at-risk adults in the Kaiser Permanente Northern California population. Annual incidence was 6.8/100,000 person-years (95% confidence interval [CI], 6.1-7.7) and was higher in patients with older age, male sex, and white race. The positive predictive value of the IPF algorithm was only 42.2% (95% CI, 30.6 to 54.6%); sensitivity was 55.6% (95% CI, 21.2 to 86.3%). The corrected incidence was estimated at 5.6/100,000 person-years (95% CI, 2.6-10.3). A modified IPF algorithm had improved positive predictive value but reduced sensitivity compared with the traditional algorithm. A well-accepted International Classification of Diseases, Ninth Revision, code-based IPF algorithm performs poorly, falsely classifying many non-IPF cases as IPF and missing a substantial proportion of IPF cases. A modification of the IPF algorithm may be useful for future population-based studies of IPF.

  14. The Construct Validation of Tests of Communicative Competence.

    ERIC Educational Resources Information Center

    Palmer, Adrian S., Ed.; And Others

    This collection, including the proceedings of a colloquium at TESOL 1979, includes the following papers: (1) "Classification of Oral Proficiency Tests," by H. Madsen and R. Jones; (2) "A Theoretical Framework for Communicative Competence," by M. Canale and M. Swain; (3) "Beyond Faith and Face Validity: The Multitrait-Multimethod Matrix and the…

  15. Validation of asthma recording in electronic health records: a systematic review

    PubMed Central

    Nissen, Francis; Quint, Jennifer K; Wilkinson, Samantha; Mullerova, Hana; Smeeth, Liam; Douglas, Ian J

    2017-01-01

    Objective To describe the methods used to validate asthma diagnoses in electronic health records and summarize the results of the validation studies. Background Electronic health records are increasingly being used for research on asthma to inform health services and health policy. Validation of the recording of asthma diagnoses in electronic health records is essential to use these databases for credible epidemiological asthma research. Methods We searched EMBASE and MEDLINE databases for studies that validated asthma diagnoses detected in electronic health records up to October 2016. Two reviewers independently assessed the full text against the predetermined inclusion criteria. Key data including author, year, data source, case definitions, reference standard, and validation statistics (including sensitivity, specificity, positive predictive value [PPV], and negative predictive value [NPV]) were summarized in two tables. Results Thirteen studies met the inclusion criteria. Most studies demonstrated a high validity using at least one case definition (PPV >80%). Ten studies used a manual validation as the reference standard; each had at least one case definition with a PPV of at least 63%, up to 100%. We also found two studies using a second independent database to validate asthma diagnoses. The PPVs of the best performing case definitions ranged from 46% to 58%. We found one study which used a questionnaire as the reference standard to validate a database case definition; the PPV of the case definition algorithm in this study was 89%. Conclusion Attaining high PPVs (>80%) is possible using each of the discussed validation methods. Identifying asthma cases in electronic health records is possible with high sensitivity, specificity or PPV, by combining multiple data sources, or by focusing on specific test measures. Studies testing a range of case definitions show wide variation in the validity of each definition, suggesting this may be important for obtaining

  16. Assessing reliability and validity measures in managed care studies.

    PubMed

    Montoya, Isaac D

    2003-01-01

    To review the reliability and validity literature and develop an understanding of these concepts as applied to managed care studies. Reliability is a test of how well an instrument measures the same input at varying times and under varying conditions. Validity is a test of how accurately an instrument measures what one believes is being measured. A review of reliability and validity instructional material was conducted. Studies of managed care practices and programs abound. However, many of these studies utilize measurement instruments that were developed for other purposes or for a population other than the one being sampled. In other cases, instruments have been developed without any testing of the instrument's performance. The lack of reliability and validity information may limit the value of these studies. This is particularly true when data are collected for one purpose and used for another. The usefulness of certain studies without reliability and validity measures is questionable, especially in cases where the literature contradicts itself

  17. The validity and reliability of a dynamic neuromuscular stabilization-heel sliding test for core stability.

    PubMed

    Cha, Young Joo; Lee, Jae Jin; Kim, Do Hyun; You, Joshua Sung H

    2017-10-23

    Core stabilization plays an important role in the regulation of postural stability. To overcome shortcomings associated with pain and severe core instability during conventional core stabilization tests, we recently developed the dynamic neuromuscular stabilization-based heel sliding (DNS-HS) test. The purpose of this study was to establish the criterion validity and test-retest reliability of the novel DNS-HS test. Twenty young adults with core instability completed both the bilateral straight leg lowering test (BSLLT) and DNS-HS test for the criterion validity study and repeated the DNS-HS test for the test-retest reliability study. Criterion validity was determined by comparing hip joint angle data that were obtained from BSLLT and DNS-HS measures. The test-retest reliability was determined by comparing hip joint angle data. Criterion validity was (ICC2,3) = 0.700 (p< 0.05), suggesting a good relationship between the two core stability measures. Test-retest reliability was (ICC3,3) = 0.953 (p< 0.05), indicating excellent consistency between the repeated DNS-HS measurements. Criterion validity data demonstrated a good relationship between the gold standard BSLLT and DNS-HS core stability measures. Test-retest reliability data suggests that DNS-HS core stability was a reliable test for core stability. Clinically, the DNS-HS test is useful to objectively quantify core instability and allow early detection and evaluation.

  18. 78 FR 20695 - Walk-Through Metal Detectors and Hand-Held Metal Detectors Test Method Validation

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-04-05

    ... Detectors and Hand-Held Metal Detectors Test Method Validation AGENCY: National Institute of Justice, DOJ... ensure that the test methods in the standards are properly documented, NIJ is requesting proposals (including price quotes) for test method validation efforts from testing laboratories. NIJ is also seeking...

  19. Validity of the modified back-saver sit-and-reach test: a comparison with other protocols.

    PubMed

    Hui, S S; Yuen, P Y

    2000-09-01

    Studies have shown that the classical sit-and-reach (CSR) test, the modified sit-and-reach (MSR), and the newly developed back-saver sit-and-reach (BS) test have poor criterion-related validity in estimating low-back flexibility but yielded moderate criterion-related validity in hamstring flexibility. The V sit-and-reach (VSR) test was found to be practical but the validity has not been established. The purpose of this study was to propose a modified back-saver sit-and-reach (MBS) test, which incorporated all advantages of the various protocols, and to compare the criterion-related validity and reliability of all these tests. 158 college students (F = 96, and M = 62; age = 20.77 +/- 2.51) performed CSR, VSR, BS (left and right leg), and MBS (left and right leg) tests in a randomized order. Scores from each test were then correlated with the criterion measures. For all sit-reach tests, intraclass reliability (single trial) was very high (r = 0.89-0.98). MBS yielded significant and highest r with low-back and hamstring criterion for men (r = 0.47-0.67) and women (r = 0.23-0.54). The low-back and right hamstring validity of MBS for men were significantly (P < 0.01) higher than those from BS and CSR, whereas no differences in criterion-related validity were found between the MBS and other protocols in women. The ratings of perceived comfort among the sit-and-reach protocols were significantly different (P < 0.001) from each other. The rating for MBS was observed the most comfortable test as compared with other protocols. The MBS test is not only a reliable test for hamstring and low-back flexibility, it is also a more practical with improved validity for hamstring and low-back flexibility in men than previous protocols.

  20. Development, construct validity and test-retest reliability of a field-based wheelchair mobility performance test for wheelchair basketball.

    PubMed

    de Witte, Annemarie M H; Hoozemans, Marco J M; Berger, Monique A M; van der Slikke, Rienk M A; van der Woude, Lucas H V; Veeger, Dirkjan H E J

    2018-01-01

    The aim of this study was to develop and describe a wheelchair mobility performance test in wheelchair basketball and to assess its construct validity and reliability. To mimic mobility performance of wheelchair basketball matches in a standardised manner, a test was designed based on observation of wheelchair basketball matches and expert judgement. Forty-six players performed the test to determine its validity and 23 players performed the test twice for reliability. Independent-samples t-tests were used to assess whether the times needed to complete the test were different for classifications, playing standards and sex. Intraclass correlation coefficients (ICC) were calculated to quantify reliability of performance times. Males performed better than females (P < 0.001, effect size [ES] = -1.26) and international men performed better than national men (P < 0.001, ES = -1.62). Performance time of low (≤2.5) and high (≥3.0) classification players was borderline not significant with a moderate ES (P = 0.06, ES = 0.58). The reliability was excellent for overall performance time (ICC = 0.95). These results show that the test can be used as a standardised mobility performance test to validly and reliably assess the capacity in mobility performance of elite wheelchair basketball athletes. Furthermore, the described methodology of development is recommended for use in other sports to develop sport-specific tests.

  1. The Unified Language Testing Plan: Speaking Proficiency Test. Spanish and English Pilot Validation Studies. Report Number 1.

    ERIC Educational Resources Information Center

    Thornton, Julie A.

    This report describes one segment of the Federal Language Testing Board's Unified Language Testing Plan (ULTP), the validation of speaking proficiency tests in Spanish and English. The ULTP is a project to increase standardization of foreign language proficiency measurement and promote sharing of resources among testing programs in the federal…

  2. Screening for cognitive impairment in older individuals. Validation study of a computer-based test.

    PubMed

    Green, R C; Green, J; Harrison, J M; Kutner, M H

    1994-08-01

    This study examined the validity of a computer-based cognitive test that was recently designed to screen the elderly for cognitive impairment. Criterion-related validity was examined by comparing test scores of impaired patients and normal control subjects. Construct-related validity was computed through correlations between computer-based subtests and related conventional neuropsychological subtests. University center for memory disorders. Fifty-two patients with mild cognitive impairment by strict clinical criteria and 50 unimpaired, age- and education-matched control subjects. Control subjects were rigorously screened by neurological, neuropsychological, imaging, and electrophysiological criteria to identify and exclude individuals with occult abnormalities. Using a cut-off total score of 126, this computer-based instrument had a sensitivity of 0.83 and a specificity of 0.96. Using a prevalence estimate of 10%, predictive values, positive and negative, were 0.70 and 0.96, respectively. Computer-based subtests correlated significantly with conventional neuropsychological tests measuring similar cognitive domains. Thirteen (17.8%) of 73 volunteers with normal medical histories were excluded from the control group, with unsuspected abnormalities on standard neuropsychological tests, electroencephalograms, or magnetic resonance imaging scans. Computer-based testing is a valid screening methodology for the detection of mild cognitive impairment in the elderly, although this particular test has important limitations. Broader applications of computer-based testing will require extensive population-based validation. Future studies should recognize that normal control subjects without a history of disease who are typically used in validation studies may have a high incidence of unsuspected abnormalities on neurodiagnostic studies.

  3. FUNCTIONAL PERFORMANCE TESTING OF THE HIP IN ATHLETES: A SYSTEMATIC REVIEW FOR RELIABILITY AND VALIDITY

    PubMed Central

    Martin, RobRoy L.

    2012-01-01

    Purpose/Background: The purpose of this study was to systematically review the literature for functional performance tests with evidence of reliability and validity that could be used for a young, athletic population with hip dysfunction. Methods: A search of PubMed and SPORTDiscus databases were performed to identify movement, balance, hop/jump, or agility functional performance tests from the current peer-reviewed literature used to assess function of the hip in young, athletic subjects. Results: The single-leg stance, deep squat, single-leg squat, and star excursion balance tests (SEBT) demonstrated evidence of validity and normative data for score interpretation. The single-leg stance test and SEBT have evidence of validity with association to hip abductor function. The deep squat test demonstrated evidence as a functional performance test for evaluating femoroacetabular impingement. Hop/Jump tests and agility tests have no reported evidence of reliability or validity in a population of subjects with hip pathology. Conclusions: Use of functional performance tests in the assessment of hip dysfunction has not been well established in the current literature. Diminished squat depth and provocation of pain during the single-leg balance test have been associated with patients diagnosed with FAI and gluteal tendinopathy, respectively. The SEBT and single-leg squat tests provided evidence of convergent validity through an analysis of kinematics and muscle function in normal subjects. Reliability of functional performance tests have not been established on patients with hip dysfunction. Further study is needed to establish reliability and validity of functional performance tests that can be used in a young, athletic population with hip dysfunction. Level of Evidence: 2b (Systematic Review of Literature) PMID:22893860

  4. Experimental validation of a new heterogeneous mechanical test design

    NASA Astrophysics Data System (ADS)

    Aquino, J.; Campos, A. Andrade; Souto, N.; Thuillier, S.

    2018-05-01

    Standard material parameters identification strategies generally use an extensive number of classical tests for collecting the required experimental data. However, a great effort has been made recently by the scientific and industrial communities to support this experimental database on heterogeneous tests. These tests can provide richer information on the material behavior allowing the identification of a more complete set of material parameters. This is a result of the recent development of full-field measurements techniques, like digital image correlation (DIC), that can capture the heterogeneous deformation fields on the specimen surface during the test. Recently, new specimen geometries were designed to enhance the richness of the strain field and capture supplementary strain states. The butterfly specimen is an example of these new geometries, designed through a numerical optimization procedure where an indicator capable of evaluating the heterogeneity and the richness of strain information. However, no experimental validation was yet performed. The aim of this work is to experimentally validate the heterogeneous butterfly mechanical test in the parameter identification framework. For this aim, DIC technique and a Finite Element Model Up-date inverse strategy are used together for the parameter identification of a DC04 steel, as well as the calculation of the indicator. The experimental tests are carried out in a universal testing machine with the ARAMIS measuring system to provide the strain states on the specimen surface. The identification strategy is accomplished with the data obtained from the experimental tests and the results are compared to a reference numerical solution.

  5. Defense of Tests Prevents Objective Consideration of Validity and Fairness

    ERIC Educational Resources Information Center

    Helms, Janet E.

    2009-01-01

    In defending tests of cognitive abilities, knowledge, or skills (CAKS) from the skepticism of their "family members, friends, and neighbors" and aiding psychologists forced to defend tests from "myth and hearsay" in their own skeptical social networks (p. 215), Sackett, Borneman, and Connelly focused on evaluating validity coefficients, racial or…

  6. Analyzing self-controlled case series data when case confirmation rates are estimated from an internal validation sample.

    PubMed

    Xu, Stanley; Clarke, Christina L; Newcomer, Sophia R; Daley, Matthew F; Glanz, Jason M

    2018-05-16

    Vaccine safety studies are often electronic health record (EHR)-based observational studies. These studies often face significant methodological challenges, including confounding and misclassification of adverse event. Vaccine safety researchers use self-controlled case series (SCCS) study design to handle confounding effect and employ medical chart review to ascertain cases that are identified using EHR data. However, for common adverse events, limited resources often make it impossible to adjudicate all adverse events observed in electronic data. In this paper, we considered four approaches for analyzing SCCS data with confirmation rates estimated from an internal validation sample: (1) observed cases, (2) confirmed cases only, (3) known confirmation rate, and (4) multiple imputation (MI). We conducted a simulation study to evaluate these four approaches using type I error rates, percent bias, and empirical power. Our simulation results suggest that when misclassification of adverse events is present, approaches such as observed cases, confirmed case only, and known confirmation rate may inflate the type I error, yield biased point estimates, and affect statistical power. The multiple imputation approach considers the uncertainty of estimated confirmation rates from an internal validation sample, yields a proper type I error rate, largely unbiased point estimate, proper variance estimate, and statistical power. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Exploring the Reliability and Validity of the Social-Moral Awareness Test

    ERIC Educational Resources Information Center

    Livesey, Alexandra; Dodd, Karen; Pote, Helen; Marlow, Elizabeth

    2012-01-01

    Background: The aim of the study was to explore the validity of the social-moral awareness test (SMAT) a measure designed for assessing socio-moral rule knowledge and reasoning in people with learning disabilities. Comparisons between Theory of Mind and socio-moral reasoning allowed the exploration of construct validity of the tool. Factor…

  8. Phase 1 Validation Testing and Simulation for the WEC-Sim Open Source Code

    NASA Astrophysics Data System (ADS)

    Ruehl, K.; Michelen, C.; Gunawan, B.; Bosma, B.; Simmons, A.; Lomonaco, P.

    2015-12-01

    WEC-Sim is an open source code to model wave energy converters performance in operational waves, developed by Sandia and NREL and funded by the US DOE. The code is a time-domain modeling tool developed in MATLAB/SIMULINK using the multibody dynamics solver SimMechanics, and solves the WEC's governing equations of motion using the Cummins time-domain impulse response formulation in 6 degrees of freedom. The WEC-Sim code has undergone verification through code-to-code comparisons; however validation of the code has been limited to publicly available experimental data sets. While these data sets provide preliminary code validation, the experimental tests were not explicitly designed for code validation, and as a result are limited in their ability to validate the full functionality of the WEC-Sim code. Therefore, dedicated physical model tests for WEC-Sim validation have been performed. This presentation provides an overview of the WEC-Sim validation experimental wave tank tests performed at the Oregon State University's Directional Wave Basin at Hinsdale Wave Research Laboratory. Phase 1 of experimental testing was focused on device characterization and completed in Fall 2015. Phase 2 is focused on WEC performance and scheduled for Winter 2015/2016. These experimental tests were designed explicitly to validate the performance of WEC-Sim code, and its new feature additions. Upon completion, the WEC-Sim validation data set will be made publicly available to the wave energy community. For the physical model test, a controllable model of a floating wave energy converter has been designed and constructed. The instrumentation includes state-of-the-art devices to measure pressure fields, motions in 6 DOF, multi-axial load cells, torque transducers, position transducers, and encoders. The model also incorporates a fully programmable Power-Take-Off system which can be used to generate or absorb wave energy. Numerical simulations of the experiments using WEC-Sim will be

  9. Traceability validation of a high speed short-pulse testing method used in LED production

    NASA Astrophysics Data System (ADS)

    Revtova, Elena; Vuelban, Edgar Moreno; Zhao, Dongsheng; Brenkman, Jacques; Ulden, Henk

    2017-12-01

    Industrial processes of LED (light-emitting diode) production include LED light output performance testing. Most of them are monitored and controlled by optically, electrically and thermally measuring LEDs by high speed short-pulse measurement methods. However, these are not standardized and a lot of information is proprietary that it is impossible for third parties, such as NMIs, to trace and validate. It is known, that these techniques have traceability issue and metrological inadequacies. Often due to these, the claimed performance specifications of LEDs are overstated, which consequently results to manufacturers experiencing customers' dissatisfaction and a large percentage of failures in daily use of LEDs. In this research a traceable setup is developed to validate one of the high speed testing techniques, investigate inadequacies and work out the traceability issues. A well-characterised short square pulse of 25 ms is applied to chip-on-board (CoB) LED modules to investigate the light output and colour content. We conclude that the short-pulse method is very efficient in case a well-defined electrical current pulse is applied and the stabilization time of the device is "a priori" accurately determined. No colour shift is observed. The largest contributors to the measurement uncertainty include badly-defined current pulse and inaccurate calibration factor.

  10. Validation and Simulation of Ares I Scale Model Acoustic Test - 3 - Modeling and Evaluating the Effect of Rainbird Water Deluge Inclusion

    NASA Technical Reports Server (NTRS)

    Strutzenberg, Louise L.; Putman, Gabriel C.

    2011-01-01

    The Ares I Scale Model Acoustics Test (ASMAT) is a series of live-fire tests of scaled rocket motors meant to simulate the conditions of the Ares I launch configuration. These tests have provided a well documented set of high fidelity measurements useful for validation including data taken over a range of test conditions and containing phenomena like Ignition Over-Pressure and water suppression of acoustics. Building on dry simulations of the ASMAT tests with the vehicle at 5 ft. elevation (100 ft. real vehicle elevation), wet simulations of the ASMAT test setup have been performed using the Loci/CHEM computational fluid dynamics software to explore the effect of rainbird water suppression inclusion on the launch platform deck. Two-phase water simulation has been performed using an energy and mass coupled lagrangian particle system module where liquid phase emissions are segregated into clouds of virtual particles and gas phase mass transfer is accomplished through simple Weber number controlled breakup and boiling models. Comparisons have been performed to the dry 5 ft. elevation cases, using configurations with and without launch mounts. These cases have been used to explore the interaction between rainbird spray patterns and launch mount geometry and evaluate the acoustic sound pressure level knockdown achieved through above-deck rainbird deluge inclusion. This comparison has been anchored with validation from live-fire test data which showed a reduction in rainbird effectiveness with the presence of a launch mount.

  11. A FIELD VALIDATION OF TWO SEDIMENT-AMPHIPOD TOXICITY TESTS

    EPA Science Inventory

    A field validation study of two sediment-amphipod toxicity tests was conducted using sediment samples collected subtidally in the vicinity of a polycyclic aromatic hydrocarbon (PAH)-contaminated Superfund site in Elliott Bay, WA, USA. Sediment samples were collected at 30 stati...

  12. Validity, Reliability, and Sensitivity of a Volleyball Intermittent Endurance Test.

    PubMed

    Rodríguez-Marroyo, Jose A; Medina-Carrillo, Javier; García-López, Juan; Morante, Juan C; Villa, José G; Foster, Carl

    2017-03-01

    To analyze the concurrent and construct validity of a volleyball intermittent endurance test (VIET). The VIET's test-retest reliability and sensitivity to assess seasonal changes was also studied. During the preseason, 71 volleyball players of different competitive levels took part in this study. All performed the VIET and a graded treadmill test with gas-exchange measurement (GXT). Thirty-one of the players performed an additional VIET to analyze the test-retest reliability. To test the VIET's sensitivity, 28 players repeated the VIET and GXT at the end of their season. Significant (P < .001) relationships between VIET distance and maximal oxygen uptake (r = .74) and GXT maximal speed (r = .78) were observed. There were no significant differences between the VIET performance test and retest (1542.1 ± 338.1 vs 1567.1 ± 358.2 m). Significant (P < .001) relationships and intraclass correlation coefficient (ICC) were found (r = .95, ICC = .96) for VIET performance. VIET performance increased significantly (P < .001) with player performance level and was sensitive to fitness changes across the season (1458.8 ± 343.5 vs 1581.1 ± 334.0 m, P < .01). The VIET may be considered a valid, reliable, and sensitive test to assess the aerobic endurance in volleyball players.

  13. Actual curriculum development practices instrument: Testing for factorial validity

    NASA Astrophysics Data System (ADS)

    Foi, Liew Yon; Bakar, Kamariah Abu; Hamzah, Mohd Sahandri Gani; Alwi, Nor Hayati

    2014-09-01

    The Actual Curriculum Development Practices Instrument (ACDP-I) was developed and the factorial validity of the ACDP-I was tested (n = 107) using exploratory factor analysis procedures in the earlier work of [1]. Despite the ACDP-I appears to be content and construct valid instrument with very high internal reliability qualities for using in Malaysia, the accumulated evidences are still needed to provide a sound scientific basis for the proposed score interpretations. Therefore, the present study addresses this concern by utilising the confirmatory factor analysis to further confirm the theoretical structure of the variable Actual Curriculum Development Practices (ACDP) and enrich the psychometrical properties of ACDP-I. Results of this study have practical implication to both researchers and educators whose concerns focus on teachers' classroom practices and the instrument development and validation process.

  14. Test of Creative Imagination: Validity and Reliability Study

    ERIC Educational Resources Information Center

    Gundogan, Aysun; Ari, Meziyet; Gonen, Mubeccel

    2013-01-01

    The purpose of this study was to investigate validity and reliability of the test of creative imagination. This study was conducted with the participation of 1000 children, aged between 9-14 and were studying in six primary schools in the city center of Denizli Province, chosen by cluster ratio sampling. In the study, it was revealed that the…

  15. Development and Validation of Economics Achievement Test for Secondary Schools

    ERIC Educational Resources Information Center

    Eleje, Lydia Ijeoma; Abanobi, Chidiebere Christopher; Obasi, Emma

    2017-01-01

    Economics achievement test (EAT) for assessing senior secondary two (SS2) achievement in economics was developed and validated in the study. Five research questions guided the study. Twenty and 100 mid-senior secondary (SS2) economics students was used for the pilot testing and reliability check respectively. A sample of 250 students randomly…

  16. Why Lessons Learned from the Past Require Haertel's Expanded Scope for Test Validation

    ERIC Educational Resources Information Center

    Shepard, Lorrie A.

    2013-01-01

    In his article, Haertel (this issue) asks a fundamental question about how use of a test is expected to cause improvements in the educational system and in learning. He also considers how test validity should be investigated and argues for a more expansive view of validity that does not stop with scoring or generalization (the more technical and…

  17. Voices from Test-Takers: Further Evidence for Language Assessment Validation and Use

    ERIC Educational Resources Information Center

    Cheng, Liying; DeLuca, Christopher

    2011-01-01

    Test-takers' interpretations of validity as related to test constructs and test use have been widely debated in large-scale language assessment. This study contributes further evidence to this debate by examining 59 test-takers' perspectives in writing large-scale English language tests. Participants wrote about their test-taking experiences in…

  18. Validation of 2D flood models with insurance claims

    NASA Astrophysics Data System (ADS)

    Zischg, Andreas Paul; Mosimann, Markus; Bernet, Daniel Benjamin; Röthlisberger, Veronika

    2018-02-01

    Flood impact modelling requires reliable models for the simulation of flood processes. In recent years, flood inundation models have been remarkably improved and widely used for flood hazard simulation, flood exposure and loss analyses. In this study, we validate a 2D inundation model for the purpose of flood exposure analysis at the river reach scale. We validate the BASEMENT simulation model with insurance claims using conventional validation metrics. The flood model is established on the basis of available topographic data in a high spatial resolution for four test cases. The validation metrics were calculated with two different datasets; a dataset of event documentations reporting flooded areas and a dataset of insurance claims. The model fit relating to insurance claims is in three out of four test cases slightly lower than the model fit computed on the basis of the observed inundation areas. This comparison between two independent validation data sets suggests that validation metrics using insurance claims can be compared to conventional validation data, such as the flooded area. However, a validation on the basis of insurance claims might be more conservative in cases where model errors are more pronounced in areas with a high density of values at risk.

  19. Assessing cultural validity in standardized tests in stem education

    NASA Astrophysics Data System (ADS)

    Gassant, Lunes

    This quantitative ex post facto study examined how race and gender, as elements of culture, influence the development of common misconceptions among STEM students. Primary data came from a standardized test: the Digital Logic Concept Inventory (DLCI) developed by Drs. Geoffrey L. Herman, Michael C. Louis, and Craig Zilles from the University of Illinois at Urbana-Champaign. The sample consisted of a cohort of 82 STEM students recruited from three universities in Northern Louisiana. Microsoft Excel and the Statistical Package for the Social Sciences (SPSS) were used for data computation. Two key concepts, several sub concepts, and 19 misconceptions were tested through 11 items in the DLCI. Statistical analyses based on both the Classical Test Theory (Spearman, 1904) and the Item Response Theory (Lord, 1952) yielded similar results: some misconceptions in the DLCI can reliably be predicted by the Race or the Gender of the test taker. The research is significant because it has shown that some misconceptions in a STEM discipline attracted students with similar ethnic backgrounds differently; thus, leading to the existence of some cultural bias in the standardized test. Therefore the study encourages further research in cultural validity in standardized tests. With culturally valid tests, it will be possible to increase the effectiveness of targeted teaching and learning strategies for STEM students from diverse ethnic backgrounds. To some extent, this dissertation has contributed to understanding, better, the gap between high enrollment rates and low graduation rates among African American students and also among other minority students in STEM disciplines.

  20. Translation and validation of the Malay version of the Stroke Knowledge Test.

    PubMed

    Sowtali, Siti Noorkhairina; Yusoff, Dariah Mohd; Harith, Sakinah; Mohamed, Monniaty

    2016-04-01

    To date, there is a lack of published studies on assessment tools to evaluate the effectiveness of stroke education programs. This study developed and validated the Malay language version of the Stroke Knowledge Test research instrument. This study involved translation, validity, and reliability phases. The instrument underwent backward and forward translation of the English version into the Malay language. Nine experts reviewed the content for consistency, clarity, difficulty, and suitability for inclusion. Perceived usefulness and utilization were obtained from experts' opinions. Later, face validity assessment was conducted with 10 stroke patients to determine appropriateness of sentences and grammar used. A pilot study was conducted with 41 stroke patients to determine the item analysis and reliability of the translated instrument using the Kuder Richardson 20 or Cronbach's alpha. The final Malay version Stroke Knowledge Test included 20 items with good content coverage, acceptable item properties, and positive expert review ratings. Psychometric investigations suggest that Malay version Stroke Knowledge Test had moderate reliability with Kuder Richardson 20 or Cronbach's alpha of 0.58. Improvement is required for Stroke Knowledge Test items with unacceptable difficulty indices. Overall, the average rating of perceived usefulness and perceived utility of the instruments were both 72.7%, suggesting that reviewers were likely to use the instruments in their facilities. Malay version Stroke Knowledge Test was a valid and reliable tool to assess educational needs and to evaluate stroke knowledge among participants of group-based stroke education programs in Malaysia.

  1. Victoria Symptom Validity Test performance in children and adolescents with neurological disorders.

    PubMed

    Brooks, Brian L

    2012-12-01

    It is becoming increasingly more important to study, use, and promote the utility of measures that are designed to detect non-compliance with testing (i.e., poor effort, symptom non-validity, response bias) as part of neuropsychological assessments with children and adolescents. Several measures have evidence for use in pediatrics, but there is a paucity of published support for the Victoria Symptom Validity Test (VSVT) in this population. The purpose of this study was to examine the performance on the VSVT in a sample of pediatric patients with known neurological disorders. The sample consisted of 100 consecutively referred children and adolescents between the ages of 6 and 19 years (mean = 14.0, SD = 3.1) with various neurological diagnoses. On the VSVT total items, 95% of the sample had performance in the "valid" range, with 5% being deemed "questionable" and 0% deemed "invalid". On easy items, 97% were "valid", 2% were "questionable", and 1% was "invalid." For difficult items, 84% were "valid," 16% were "questionable," and 0% was "invalid." For those patients given two effort measures (i.e., VSVT and Test of Memory Malingering; n = 65), none was identified as having poor test-taking compliance on both measures. VSVT scores were significantly correlated with age, intelligence, processing speed, and functional ratings of daily abilities (attention, executive functioning, and adaptive functioning), but not objective performance on the measure of sustained attention, verbal memory, or visual memory. The VSVT has potential to be used in neuropsychological assessments with pediatric patients.

  2. Social validity in single-case research: A systematic literature review of prevalence and application.

    PubMed

    Snodgrass, Melinda R; Chung, Moon Y; Meadan, Hedda; Halle, James W

    2018-03-01

    Single-case research (SCR) has been a valuable methodology in special education research. Montrose Wolf (1978), an early pioneer in single-case methodology, coined the term "social validity" to refer to the social importance of the goals selected, the acceptability of procedures employed, and the effectiveness of the outcomes produced in applied investigations. Since 1978, many contributors to SCR have included social validity as a feature of their articles and several authors have examined the prevalence and role of social validity in SCR. We systematically reviewed all SCR published in six highly-ranked special education journals from 2005 to 2016 to establish the prevalence of social validity assessments and to evaluate their scientific rigor. We found relatively low, but stable prevalence with only 28 publications addressing all three factors of the social validity construct (i.e., goals, procedures, outcomes). We conducted an in-depth analysis of the scientific rigor of these 28 publications. Social validity remains an understudied construct in SCR, and the scientific rigor of social validity assessments is often lacking. Implications and future directions are discussed. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Validating the Astronomy Diagnostics Test for Undergraduate Non-Science Majors

    NASA Astrophysics Data System (ADS)

    Slater, T. F.; Hufnagel, B.; Adams, J. P.

    1999-05-01

    The Astronomy Diagnostics Test (ADT) is a standard diagnostic test for undergraduate non-science majors taking introductory astronomy. Serving to compare the effectiveness of various instructional interventions, the ADT has been developed and field-tested over the last year by a multi-institutional team, known as the Collaboration for Astronomy Education Research (CAER). The team includes Jeff Adams, Rebecca Lindell Adrian, Christine Brick, Gina Brissenden, Grace Deming, Beth Hufnagel, Tim Slater, and Michael Zeilik, among others. The need for a nationally normed, valid, and reliable assessment instrument in astronomy has been articulated in a wide variety of forums. This need results from the simultaneous occurrence of several important phenomena over the last decade including: the inclusion of astronomy concepts in national science education standards; documentation of widespread astronomical misconceptions; the influence of the Force Concept Inventory guiding reform in physics; and the call for university faculty to document improvements in instruction. In a triangulated effort to validate the ADT for widespread use, the researchers used on a three-phase strategy. In this context, "validity" means that the ADT measures what it purports to measure. In other words, do students give the correct answer for the scientifically correct reasons or, alternatively, do students give the correct answer even though they have misunderstandings about the phenomena being tested? These three phases were: (1) conduct statistical item-analysis on each test question for a large and diverse student population (n=2000 from 21 institutions); (2) conduct 60 clinical student interviews using the test questions as the script; and (3) conduct an inductive analysis of 30 student supplied written responses to ADT questions posed without the multiple-choices provided. The ADT and its supporting comparative database is available at URL: http://solar.physics.montana.edu/aae/adt/. This research

  4. Validation of Sherouk's Critical Thinking Test (SH-CTT)

    ERIC Educational Resources Information Center

    Kadhm, Sherouk J.

    2017-01-01

    This study aimed to examine the psychometric properties (reliability and validity) of the Arabic version of Sherouk's Critical Thinking Test. This test has four parts, each of which provides a story that is divided into an introduction and a scene; each story is then followed by a list of sensitive questions featuring two response options…

  5. Psychometric Arabic Sino-Nasal Outcome Test-22: validation and translation in chronic rhinosinusitis patients.

    PubMed

    Alanazy, Fatma; Dousary, Surayie Al; Albosaily, Ahmed; Aldriweesh, Turki; Alsaleh, Saad; Aldrees, Turki

    2018-01-01

    The Sino-Nasal Outcome Test (SNOT)-22 has multiple items that reflect how nasal disease affects quality of life. Currently, no validated Arabic version of the SNOT-22 is available. . To develop an Arabic-validated version of SNOT-22. Prospective. Tertiary care center. This single-center validation study was conducted between 2015 and 2017 at King Abdul-Aziz University Hospital, Riyadh, Saudi Arabia. The SNOT-22 English version was translated into Arabic by the forward and backward method. The test and retest reliability, internal consistency, responsiveness to surgical treatment, discriminant validity, sensitivity and specificity all were tested. Validated Arabic version of the SNOT-22. Of 265 individuals, 171 were healthy volunteers and 94 were chronic rhinosinusitis patients. The Arabic version showed high internal consistency (Cronbach's of 0.94), and the ability to differentiate between diseased and healthy volunteers (P < .001). The translated versions demonstrated the ability to detect the change scores significantly in response to intervention (P < .001). This is the first validated Arabic version of SNOT-22. The instrument can be used among the Arabic population. No subjects from other Arab countries.

  6. The ad-libitum alcohol 'taste test': secondary analyses of potential confounds and construct validity.

    PubMed

    Jones, Andrew; Button, Emily; Rose, Abigail K; Robinson, Eric; Christiansen, Paul; Di Lemma, Lisa; Field, Matt

    2016-03-01

    Motivation to drink alcohol can be measured in the laboratory using an ad-libitum 'taste test', in which participants rate the taste of alcoholic drinks whilst their intake is covertly monitored. Little is known about the construct validity of this paradigm. The objective of this study was to investigate variables that may compromise the validity of this paradigm and its construct validity. We re-analysed data from 12 studies from our laboratory that incorporated an ad-libitum taste test. We considered time of day and participants' awareness of the purpose of the taste test as potential confounding variables. We examined whether gender, typical alcohol consumption, subjective craving, scores on the Alcohol Use Disorders Identification Test and perceived pleasantness of the drinks predicted ad-libitum consumption (construct validity). We included 762 participants (462 female). Participant awareness and time of day were not related to ad-libitum alcohol consumption. Males drank significantly more alcohol than females (p < 0.001), and individual differences in typical alcohol consumption (p = 0.04), craving (p < 0.001) and perceived pleasantness of the drinks (p = 0.04) were all significant predictors of ad-libitum consumption. We found little evidence that time of day or participant awareness influenced alcohol consumption. The construct validity of the taste test was supported by relationships between ad-libitum consumption and typical alcohol consumption, craving and pleasantness ratings of the drinks. The ad-libitum taste test is a valid method for the assessment of alcohol intake in the laboratory.

  7. Ecological validity of the five digit test and the oral trails test.

    PubMed

    Paiva, Gabrielle Chequer de Castro; Fialho, Mariana Braga; Costa, Danielle de Souza; Paula, Jonas Jardim de

    2016-01-01

    Tests evaluating the attentional-executive system are widely used in clinical practice. However, proximity of an objective cognitive test with real-world situations (ecological validity) is not frequently investigated. The present study evaluate the association between measures of the Five Digit Test (FDT) and the Oral Trails Test (OTT) with self-reported cognitive failures in everyday life as measured by the Cognitive Failures Questionnaire (CFQ). Brazilian adults from 18-to-65 years old voluntarily performed the FDT and OTT tests and reported the frequency of cognitive failures in their everyday life through the CFQ. After controlling for the age effect, the measures of controlled attentional processes were associated with cognitive failures, yet the cognitive flexibility of both FDT and OTT accounted for by the majority of variance in most aspects of the CFQ factors. The FDT and the OTT measures were predictive of real-world problems such as cognitive failures in everyday activities/situations.

  8. Issues in cross-cultural validity: example from the adaptation, reliability, and validity testing of a Turkish version of the Stanford Health Assessment Questionnaire.

    PubMed

    Küçükdeveci, Ayse A; Sahin, Hülya; Ataman, Sebnem; Griffiths, Bridget; Tennant, Alan

    2004-02-15

    Guidelines have been established for cross-cultural adaptation of outcome measures. However, invariance across cultures must also be demonstrated through analysis of Differential Item Functioning (DIF). This is tested in the context of a Turkish adaptation of the Health Assessment Questionnaire (HAQ). Internal construct validity of the adapted HAQ is assessed by Rasch analysis; reliability, by internal consistency and the intraclass correlation coefficient; external construct validity, by association with impairments and American College of Rheumatology functional stages. Cross-cultural validity is tested through DIF by comparison with data from the UK version of the HAQ. The adapted version of the HAQ demonstrated good internal construct validity through fit of the data to the Rasch model (mean item fit 0.205; SD 0.998). Reliability was excellent (alpha = 0.97) and external construct validity was confirmed by expected associations. DIF for culture was found in only 1 item. Cross-cultural validity was found to be sufficient for use in international studies between the UK and Turkey. Future adaptation of instruments should include analysis of DIF at the field testing stage in the adaptation process.

  9. Validation of Test Performance and Clinical Time Zero for an Electronic Health Record Embedded Severe Sepsis Alert.

    PubMed

    Rolnick, Joshua; Downing, N Lance; Shepard, John; Chu, Weihan; Tam, Julia; Wessels, Alexander; Li, Ron; Dietrich, Brian; Rudy, Michael; Castaneda, Leon; Shieh, Lisa

    2016-01-01

    Increasing use of EHRs has generated interest in the potential of computerized clinical decision support to improve treatment of sepsis. Electronic sepsis alerts have had mixed results due to poor test characteristics, the inability to detect sepsis in a timely fashion and the use of outside software limiting widespread adoption. We describe the development, evaluation and validation of an accurate and timely severe sepsis alert with the potential to impact sepsis management. To develop, evaluate, and validate an accurate and timely severe sepsis alert embedded in a commercial EHR. The sepsis alert was developed by identifying the most common severe sepsis criteria among a cohort of patients with ICD 9 codes indicating a diagnosis of sepsis. This alert requires criteria in three categories: indicators of a systemic inflammatory response, evidence of suspected infection from physician orders, and markers of organ dysfunction. Chart review was used to evaluate test performance and the ability to detect clinical time zero, the point in time when a patient develops severe sepsis. Two physicians reviewed 100 positive cases and 75 negative cases. Based on this review, sensitivity was 74.5%, specificity was 86.0%, the positive predictive value was 50.3%, and the negative predictive value was 94.7%. The most common source of end-organ dysfunction was MAP less than 70 mm/Hg (59%). The alert was triggered at clinical time zero in 41% of cases and within three hours in 53.6% of cases. 96% of alerts triggered before a manual nurse screen. We are the first to report the time between a sepsis alert and physician chart-review clinical time zero. Incorporating physician orders in the alert criteria improves specificity while maintaining sensitivity, which is important to reduce alert fatigue. By leveraging standard EHR functionality, this alert could be implemented by other healthcare systems.

  10. Validation of Test Performance and Clinical Time Zero for an Electronic Health Record Embedded Severe Sepsis Alert

    PubMed Central

    Downing, N. Lance; Shepard, John; Chu, Weihan; Tam, Julia; Wessels, Alexander; Li, Ron; Dietrich, Brian; Rudy, Michael; Castaneda, Leon; Shieh, Lisa

    2016-01-01

    Summary Bachground Increasing use of EHRs has generated interest in the potential of computerized clinical decision support to improve treatment of sepsis. Electronic sepsis alerts have had mixed results due to poor test characteristics, the inability to detect sepsis in a timely fashion and the use of outside software limiting widespread adoption. We describe the development, evaluation and validation of an accurate and timely severe sepsis alert with the potential to impact sepsis management. Objective To develop, evaluate, and validate an accurate and timely severe sepsis alert embedded in a commercial EHR. Methods The sepsis alert was developed by identifying the most common severe sepsis criteria among a cohort of patients with ICD 9 codes indicating a diagnosis of sepsis. This alert requires criteria in three categories: indicators of a systemic inflammatory response, evidence of suspected infection from physician orders, and markers of organ dysfunction. Chart review was used to evaluate test performance and the ability to detect clinical time zero, the point in time when a patient develops severe sepsis. Results Two physicians reviewed 100 positive cases and 75 negative cases. Based on this review, sensitivity was 74.5%, specificity was 86.0%, the positive predictive value was 50.3%, and the negative predictive value was 94.7%. The most common source of end-organ dysfunction was MAP less than 70 mm/Hg (59%). The alert was triggered at clinical time zero in 41% of cases and within three hours in 53.6% of cases. 96% of alerts triggered before a manual nurse screen. Conclusion We are the first to report the time between a sepsis alert and physician chart-review clinical time zero. Incorporating physician orders in the alert criteria improves specificity while maintaining sensitivity, which is important to reduce alert fatigue. By leveraging standard EHR functionality, this alert could be implemented by other healthcare systems. PMID:27437061

  11. Use of the color trails test as an embedded measure of performance validity.

    PubMed

    Henry, George K; Algina, James

    2013-01-01

    One hundred personal injury litigants and disability claimants referred for a forensic neuropsychological evaluation were administered both portions of the Color Trails Test (CTT) as part of a more comprehensive battery of standardized tests. Subjects who failed two or more free-standing tests of cognitive performance validity formed the Failed Performance Validity (FPV) group, while subjects who passed all free-standing performance validity measures were assigned to the Passed Performance Validity (PPV) group. A cutscore of ≥45 seconds to complete Color Trails 1 (CT1) was associated with a classification accuracy of 78%, good sensitivity (66%) and high specificity (90%), while a cutscore of ≥84 seconds to complete Color Trails 2 (CT2) was associated with a classification accuracy of 82%, good sensitivity (74%) and high specificity (90%). A CT1 cutscore of ≥58 seconds, and a CT2 cutscore ≥100 seconds was associated with 100% positive predictive power at base rates from 20 to 50%.

  12. Criterion Related Validity of Karate Specific Aerobic Test (KSAT).

    PubMed

    Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim

    2015-09-01

    Karate is one the most popular combat sports in the world. Physical fitness assessment on a regular manner is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. The aim of this research was to examine the criterion related to validity of the karate specific aerobic test (KSAT) as an indicator of aerobic level of karate practitioners. Cardiorespiratory responses, aerobic performance level through both treadmill laboratory test and YoYo intermittent recovery test level 1 (YoYoIRTL1) as well as time to exhaustion in the KSAT test (TE'KSAT) were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT's TE and relative (mL/min kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT's TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, significant relationship was found between TE's KSAT and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. The KSAT has not proved to have indirect criterion related validity as no significant correlations have been found between TE's KSAT and treadmill VO2max. Nevertheless, as correlated to other aerobic fitness variables, KSAT can be considered as an indicator of karate specific endurance. The establishment of the criterion related validity of the KSAT requires further investigation.

  13. Implementation and Initial Validation of the MDTP Tests at Golden West College.

    ERIC Educational Resources Information Center

    Isonio, Steven

    In 1992, a study was conducted at Golden West College (California) to determine the predictive validity of the Math Diagnostic Testing Project (MDTP) tests. A total of 1,137 students were tested in-class; 601 took the Algebra Readiness test, 376 took the Elementary Algebra test, and 160 took the Intermediate Algebra test. Two correlation…

  14. Review of the Reported Measures of Clinical Validity and Clinical Utility as Arguments for the Implementation of Pharmacogenetic Testing: A Case Study of Statin-Induced Muscle Toxicity.

    PubMed

    Jansen, Marleen E; Rigter, T; Rodenburg, W; Fleur, T M C; Houwink, E J F; Weda, M; Cornel, Martina C

    2017-01-01

    Advances from pharmacogenetics (PGx) have not been implemented into health care to the expected extent. One gap that will be addressed in this study is a lack of reporting on clinical validity and clinical utility of PGx-tests. A systematic review of current reporting in scientific literature was conducted on publications addressing PGx in the context of statins and muscle toxicity. Eighty-nine publications were included and information was selected on reported measures of effect, arguments, and accompanying conclusions. Most authors report associations to quantify the relationship between a genetic variation an outcome, such as adverse drug responses. Conclusions on the implementation of a PGx-test are generally based on these associations, without explicit mention of other measures relevant to evaluate the test's clinical validity and clinical utility. To gain insight in the clinical impact and select useful tests, additional outcomes are needed to estimate the clinical validity and utility, such as cost-effectiveness.

  15. Validation of Cardiovascular Parameters during NASA's Functional Task Test

    NASA Technical Reports Server (NTRS)

    Arzeno, N. M.; Stenger, M. B.; Bloomberg, J. J.; Platts, S. H.

    2009-01-01

    Microgravity exposure causes physiological deconditioning and impairs crewmember task performance. The Functional Task Test (FTT) is designed to correlate these physiological changes to performance in a series of operationally-relevant tasks. One of these, the Recovery from Fall/Stand Test (RFST), tests both the ability to recover from a prone position and cardiovascular responses to orthostasis. PURPOSE: Three minutes were chosen for the duration of this test, yet it is unknown if this is long enough to induce cardiovascular responses similar to the operational 5 min stand test. The purpose of this study was to determine the validity and reliability of heart rate variability (HRV) analysis of a 3 min stand and to examine the effect of spaceflight on these measures. METHODS: To determine the validity of using 3 vs. 5 min of standing to assess HRV, ECG was collected from 7 healthy subjects who participated in a 6 min RFST. Mean R-R interval (RR) and spectral HRV were measured in minutes 0-3 and 0-5 following the heart rate transient due to standing. Significant differences between the segments were determined by a paired t-test. To determine the reliability of the 3-min stand test, 13 healthy subjects completed 3 trials of the FTT on separate days, including the RFST with a 3 min stand. Analysis of variance (ANOVA) was performed on the HRV measures. One crewmember completed the FTT before a 14-day mission, on landing day (R+0) and one (R+1) day after returning to Earth. RESULTS VALIDITY: HRV measures reflecting autonomic activity were not significantly different during the 0-3 and 0-5 min segments. RELIABILITY: The average coefficient of variation for RR, systolic (SBP) and diastolic blood pressures during the RFST were less than 8% for the 3 sessions. ANOVA results yielded a greater inter-subject variability (p<0.006) than inter-session variability (p>0.05) for HRV in the RFST. SPACEFLIGHT: Lower RR and higher SBP were observed on R+0 in rest and stand. On R+1

  16. Vertical jumping tests in volleyball: reliability, validity, and playing-position specifics.

    PubMed

    Sattler, Tine; Sekulic, Damir; Hadzic, Vedran; Uljevic, Ognjen; Dervisevic, Edvin

    2012-06-01

    Vertical jumping is known to be important in volleyball, and jumping performance tests are frequently studied for their reliability and validity. However, most studies concerning jumping in volleyball have dealt with standard rather than sport-specific jumping procedures and tests. The aims of this study, therefore, were (a) to determine the reliability and factorial validity of 2 volleyball-specific jumping tests, the block jump (BJ) test and the attack jump (AJ) test, relative to 2 frequently used and systematically validated jumping tests, the countermovement jump test and the squat jump test and (b) to establish volleyball position-specific differences in the jumping tests and simple anthropometric indices (body height [BH], body weight, and body mass index [BMI]). The BJ was performed from a defensive volleyball position, with the hands positioned in front of the chest. During an AJ, the players used a 2- to 3-step approach and performed a drop jump with an arm swing followed by a quick vertical jump. A total of 95 high-level volleyball players (all men) participated in this study. The reliability of the jumping tests ranged from 0.97 to 0.99 for Cronbach's alpha coefficients, from 0.93 to 0.97 for interitem correlation coefficients and from 2.1 to 2.8 for coefficients of variation. The highest reliability was found for the specific jumping tests. The factor analysis extracted one significant component, and all of the tests were highly intercorrelated. The analysis of variance with post hoc analysis showed significant differences between 5 playing positions in some of the jumping tests. In general, receivers had a greater jumping capacity, followed by libero players. The differences in jumping capacities should be emphasized vis-a-vis differences in the anthropometric measures of players, where middle hitters had higher BH and body weight, followed by opposite hitters and receivers, with no differences in the BMI between positions.

  17. Perspectives on Validation of High-Throughput Assays Supporting 21st Century Toxicity Testing1

    PubMed Central

    Judson, Richard; Kavlock, Robert; Martin, Matt; Reif, David; Houck, Keith; Knudsen, Thomas; Richard, Ann; Tice, Raymond R.; Whelan, Maurice; Xia, Menghang; Huang, Ruili; Austin, Christopher; Daston, George; Hartung, Thomas; Fowle, John R.; Wooge, William; Tong, Weida; Dix, David

    2014-01-01

    Summary In vitro, high-throughput screening (HTS) assays are seeing increasing use in toxicity testing. HTS assays can simultaneously test many chemicals, but have seen limited use in the regulatory arena, in part because of the need to undergo rigorous, time-consuming formal validation. Here we discuss streamlining the validation process, specifically for prioritization applications in which HTS assays are used to identify a high-concern subset of a collection of chemicals. The high-concern chemicals could then be tested sooner rather than later in standard guideline bioassays. The streamlined validation process would continue to ensure the reliability and relevance of assays for this application. We discuss the following practical guidelines: (1) follow current validation practice to the extent possible and practical; (2) make increased use of reference compounds to better demonstrate assay reliability and relevance; (3) deemphasize the need for cross-laboratory testing, and; (4) implement a web-based, transparent and expedited peer review process. PMID:23338806

  18. Parameterization of Model Validating Sets for Uncertainty Bound Optimizations. Revised

    NASA Technical Reports Server (NTRS)

    Lim, K. B.; Giesy, D. P.

    2000-01-01

    Given measurement data, a nominal model and a linear fractional transformation uncertainty structure with an allowance on unknown but bounded exogenous disturbances, easily computable tests for the existence of a model validating uncertainty set are given. Under mild conditions, these tests are necessary and sufficient for the case of complex, nonrepeated, block-diagonal structure. For the more general case which includes repeated and/or real scalar uncertainties, the tests are only necessary but become sufficient if a collinearity condition is also satisfied. With the satisfaction of these tests, it is shown that a parameterization of all model validating sets of plant models is possible. The new parameterization is used as a basis for a systematic way to construct or perform uncertainty tradeoff with model validating uncertainty sets which have specific linear fractional transformation structure for use in robust control design and analysis. An illustrative example which includes a comparison of candidate model validating sets is given.

  19. Reliability and criterion-related validity testing (construct) of the Endotracheal Suction Assessment Tool (ESAT©).

    PubMed

    Davies, Kylie; Bulsara, Max K; Ramelet, Anne-Sylvie; Monterosso, Leanne

    2018-05-01

    To establish criterion-related construct validity and test-retest reliability for the Endotracheal Suction Assessment Tool© (ESAT©). Endotracheal tube suction performed in children can significantly affect clinical stability. Previously identified clinical indicators for endotracheal tube suction were used as criteria when designing the ESAT©. Content validity was reported previously. The final stages of psychometric testing are presented. Observational testing was used to measure construct validity and determine whether the ESAT© could guide "inexperienced" paediatric intensive care nurses' decision-making regarding endotracheal tube suction. Test-retest reliability of the ESAT© was performed at two time points. The researchers and paediatric intensive care nurse "experts" developed 10 hypothetical clinical scenarios with predetermined endotracheal tube suction outcomes. "Experienced" (n = 12) and "inexperienced" (n = 14) paediatric intensive care nurses were presented with the scenarios and the ESAT© guiding decision-making about whether to perform endotracheal tube suction for each scenario. Outcomes were compared with those predetermined by the "experts" (n = 9). Test-retest reliability of the ESAT© was measured at two consecutive time points (4 weeks apart) with "experienced" and "inexperienced" paediatric intensive care nurses using the same scenarios and tool to guide decision-making. No differences were observed between endotracheal tube suction decisions made by "experts" (n = 9), "inexperienced" (n = 14) and "experienced" (n = 12) nurses confirming the tool's construct validity. No differences were observed between groups for endotracheal tube suction decisions at T1 and T2. Criterion-related construct validity and test-retest reliability of the ESAT© were demonstrated. Further testing is recommended to confirm reliability in the clinical setting with the "inexperienced" nurse to guide decision-making related to endotracheal tube

  20. Ultrasonic linear array validation via concrete test blocks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hoegh, Kyle, E-mail: hoeg0021@umn.edu; Khazanovich, Lev, E-mail: hoeg0021@umn.edu; Ferraro, Chris

    2015-03-31

    Oak Ridge National Laboratory (ORNL) comparatively evaluated the ability of a number of NDE techniques to generate an image of the volume of 6.5′ X 5.0′ X 10″ concrete specimens fabricated at the Florida Department of Transportation (FDOT) NDE Validation Facility in Gainesville, Florida. These test blocks were fabricated to test the ability of various NDE methods to characterize various placements and sizes of rebar as well as simulated cracking and non-consolidation flaws. The first version of the ultrasonic linear array device, MIRA [version 1], was one of 7 different NDE equipment used to characterize the specimens. This paper dealsmore » with the ability of this equipment to determine subsurface characterizations such as reinforcing steel relative size, concrete thickness, irregularities, and inclusions using Kirchhoff-based migration techniques. The ability of individual synthetic aperture focusing technique (SAFT) B-scan cross sections resulting from self-contained scans are compared with various processing, analysis, and interpretation methods using the various features fabricated in the specimens for validation. The performance is detailed, especially with respect to the limitations and implications for evaluation of a thicker, more heavily reinforced concrete structures.« less

  1. Recommendations for elaboration, transcultural adaptation and validation process of tests in Speech, Hearing and Language Pathology.

    PubMed

    Pernambuco, Leandro; Espelt, Albert; Magalhães, Hipólito Virgílio; Lima, Kenio Costa de

    2017-06-08

    to present a guide with recommendations for translation, adaptation, elaboration and process of validation of tests in Speech and Language Pathology. the recommendations were based on international guidelines with a focus on the elaboration, translation, cross-cultural adaptation and validation process of tests. the recommendations were grouped into two Charts, one of them with procedures for translation and transcultural adaptation and the other for obtaining evidence of validity, reliability and measures of accuracy of the tests. a guide with norms for the organization and systematization of the process of elaboration, translation, cross-cultural adaptation and validation process of tests in Speech and Language Pathology was created.

  2. Development of Modal Test Techniques for Validation of a Solar Sail Design

    NASA Technical Reports Server (NTRS)

    Gaspar, James L.; Mann, Troy; Behun, Vaughn; Wilkie, W. Keats; Pappa, Richard

    2004-01-01

    This paper focuses on the development of modal test techniques for validation of a solar sail gossamer space structure design. The major focus is on validating and comparing the capabilities of various excitation techniques for modal testing solar sail components. One triangular shaped quadrant of a solar sail membrane was tested in a 1 Torr vacuum environment using various excitation techniques including, magnetic excitation, and surface-bonded piezoelectric patch actuators. Results from modal tests performed on the sail using piezoelectric patches at different positions are discussed. The excitation methods were evaluated for their applicability to in-vacuum ground testing and to the development of on orbit flight test techniques. The solar sail membrane was tested in the horizontal configuration at various tension levels to assess the variation in frequency with tension in a vacuum environment. A segment of a solar sail mast prototype was also tested in ambient atmospheric conditions using various excitation techniques, and these methods are also assessed for their ground test capabilities and on-orbit flight testing.

  3. Truth and Evidence in Validity Theory

    ERIC Educational Resources Information Center

    Borsboom, Denny; Markus, Keith A.

    2013-01-01

    According to Kane (this issue), "the validity of a proposed interpretation or use depends on how well the evidence supports" the claims being made. Because truth and evidence are distinct, this means that the validity of a test score interpretation could be high even though the interpretation is false. As an illustration, we discuss the case of…

  4. Quantification of construction waste prevented by BIM-based design validation: Case studies in South Korea.

    PubMed

    Won, Jongsung; Cheng, Jack C P; Lee, Ghang

    2016-03-01

    Waste generated in construction and demolition processes comprised around 50% of the solid waste in South Korea in 2013. Many cases show that design validation based on building information modeling (BIM) is an effective means to reduce the amount of construction waste since construction waste is mainly generated due to improper design and unexpected changes in the design and construction phases. However, the amount of construction waste that could be avoided by adopting BIM-based design validation has been unknown. This paper aims to estimate the amount of construction waste prevented by a BIM-based design validation process based on the amount of construction waste that might be generated due to design errors. Two project cases in South Korea were studied in this paper, with 381 and 136 design errors detected, respectively during the BIM-based design validation. Each design error was categorized according to its cause and the likelihood of detection before construction. The case studies show that BIM-based design validation could prevent 4.3-15.2% of construction waste that might have been generated without using BIM. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Proposal and validation of a clinical trunk control test in individuals with spinal cord injury.

    PubMed

    Quinzaños, J; Villa, A R; Flores, A A; Pérez, R

    2014-06-01

    One of the problems that arise in spinal cord injury (SCI) is alteration in trunk control. Despite the need for standardized scales, these do not exist for evaluating trunk control in SCI. To propose and validate a trunk control test in individuals with SCI. National Institute of Rehabilitation, Mexico. The test was developed and later evaluated for reliability and criteria, content, and construct validity. We carried out 531 tests on 177 patients and found high inter- and intra-rater reliability. In terms of criterion validity, analysis of variance demonstrated a statistically significant difference in the test score of patients with adequate or inadequate trunk control according to the assessment of a group of experts. A receiver operating characteristic curve was plotted for optimizing the instrument's cutoff point, which was determined at 13 points, with a sensitivity of 98% and a specificity of 92.2%. With regard to construct validity, the correlation between the proposed test and the spinal cord independence measure (SCIM) was 0.873 (P=0.001) and that with the evolution time was 0.437 (P=0.001). For testing the hypothesis with qualitative variables, the Kruskal-Wallis test was performed, which resulted in a statistically significant difference between the scores in the proposed scale of each group defined by these variables. It was proven experimentally that the proposed trunk control test is valid and reliable. Furthermore, the test can be used for all patients with SCI despite the type and level of injury.

  6. Reliability and Validity of the Inline Skating Skill Test.

    PubMed

    Radman, Ivan; Ruzic, Lana; Padovan, Viktoria; Cigrovski, Vjekoslav; Podnar, Hrvoje

    2016-09-01

    This study aimed to examine the reliability and validity of the inline skating skill test. Based on previous skating experience forty-two skaters (26 female and 16 male) were randomized into two groups (competitive level vs. recreational level). They performed the test four times, with a recovery time of 45 minutes between sessions. Prior to testing, the participants rated their skating skill using a scale from 1 to 10. The protocol included performance time measurement through a course, combining different skating techniques. Trivial changes in performance time between the repeated sessions were determined in both competitive females/males and recreational females/males (-1.7% [95% CI: -5.8-2.6%] - 2.2% [95% CI: 0.0-4.5%]). In all four subgroups, the skill test had a low mean within-individual variation (1.6% [95% CI: 1.2-2.4%] - 2.7% [95% CI: 2.1-4.0%]) and high mean inter-session correlation (ICC = 0.97 [95% CI: 0.92-0.99] - 0.99 [95% CI: 0.98-1.00]). The comparison of detected typical errors and smallest worthwhile changes (calculated as standard deviations × 0.2) revealed that the skill test was able to track changes in skaters' performances. Competitive-level skaters needed shorter time (24.4-26.4%, all p < 0.01) to complete the test in comparison to recreational-level skaters. Moreover, moderate correlation (ρ = 0.80-0.82; all p < 0.01) was observed between the participant's self-rating and achieved performance times. In conclusion, the proposed test is a reliable and valid method to evaluate inline skating skills in amateur competitive and recreational level skaters. Further studies are needed to evaluate the reproducibility of this skill test in different populations including elite inline skaters.

  7. Converting Hangar High Expansion Foam Systems to Prevent Cockpit Damage: Full-Scale Validation Tests

    DTIC Science & Technology

    2017-09-01

    AFCEC-CO-TY-TR-2018-0001 CONVERTING HANGAR HIGH EXPANSION FOAM SYSTEMS TO PREVENT COCKPIT DAMAGE: FULL-SCALE VALIDATION TESTS Gerard G...REPORT NUMBER(S) 12. DISTRIBUTION/ AVAILABILITY STATEMENT 13. SUPPLEMENTARY NOTES 14. ABSTRACT 15. SUBJECT TERMS 16. SECURITY CLASSIFICATION OF: a. REPORT b...09-2017 Final Test Report May 2017 Converting Hangar High Expansion Foam Systems to Prevent Cockpit Damage: Full-Scale Validation Tests N00173-15-D

  8. The bogus taste test: Validity as a measure of laboratory food intake.

    PubMed

    Robinson, Eric; Haynes, Ashleigh; Hardman, Charlotte A; Kemps, Eva; Higgs, Suzanne; Jones, Andrew

    2017-09-01

    Because overconsumption of food contributes to ill health, understanding what affects how much people eat is of importance. The 'bogus' taste test is a measure widely used in eating behaviour research to identify factors that may have a causal effect on food intake. However, there has been no examination of the validity of the bogus taste test as a measure of food intake. We conducted a participant level analysis of 31 published laboratory studies that used the taste test to measure food intake. We assessed whether the taste test was sensitive to experimental manipulations hypothesized to increase or decrease food intake. We examined construct validity by testing whether participant sex, hunger and liking of taste test food were associated with the amount of food consumed in the taste test. In addition, we also examined whether BMI (body mass index), trait measures of dietary restraint and over-eating in response to palatable food cues were associated with food consumption. Results indicated that the taste test was sensitive to experimental manipulations hypothesized to increase or decrease food intake. Factors that were reliably associated with increased consumption during the taste test were being male, have a higher baseline hunger, liking of the taste test food and a greater tendency to overeat in response to palatable food cues, whereas trait dietary restraint and BMI were not. These results indicate that the bogus taste test is likely to be a valid measure of food intake and can be used to identify factors that have a causal effect on food intake. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.

  9. Reliability and criterion-related validity of a new repeated agility test

    PubMed Central

    Makni, E; Jemni, M; Elloumi, M; Chamari, K; Nabli, MA; Padulo, J; Moalla, W

    2016-01-01

    The study aimed to assess the reliability and the criterion-related validity of a new repeated sprint T-test (RSTT) that includes intense multidirectional intermittent efforts. The RSTT consisted of 7 maximal repeated executions of the agility T-test with 25 s of passive recovery rest in between. Forty-five team sports players performed two RSTTs separated by 3 days to assess the reliability of best time (BT) and total time (TT) of the RSTT. The intra-class correlation coefficient analysis revealed a high relative reliability between test and retest for BT and TT (>0.90). The standard error of measurement (<0.50) showed that the RSTT has a good absolute reliability. The minimal detectable change values for BT and TT related to the RSTT were 0.09 s and 0.58 s, respectively. To check the criterion-related validity of the RSTT, players performed a repeated linear sprint (RLS) and a repeated sprint with changes of direction (RSCD). Significant correlations between the BT and TT of the RLS, RSCD and RSTT were observed (p<0.001). The RSTT is, therefore, a reliable and valid measure of the intermittent repeated sprint agility performance. As this ability is required in all team sports, it is suggested that team sports coaches, fitness coaches and sports scientists consider this test in their training follow-up. PMID:27274109

  10. The Thinking-about-Derivative Test for Undergraduate Students: Development and Validation

    ERIC Educational Resources Information Center

    Aydin, Utkun; Ubuz, Behiye

    2015-01-01

    Two studies were conducted for the development and validation of a multidimensional test to assess undergraduate students' mathematical thinking about derivative. The first study involved two phases: question generation and refinement of the Thinking-about-Derivative Test (TDT). The second study included four phases as follows: test…

  11. Incompleteness of Bluetooth protocol conformance test cases

    NASA Astrophysics Data System (ADS)

    Wu, Peng; Gao, Qiang

    2001-10-01

    This paper describes a formal method to verify the completeness of conformance testing, in which not only Implementation Under Test (IUT) is formalized in SDL, but also conformance tester is described in SDL so that conformance testing can be performed in simulator provided with CASE tool. The protocol set considered is Bluetooth, an open wireless communication technology. Our research results show that Bluetooth conformance test specification is not complete in that it has only limited coverage and many important capabilities defined in Bluetooth core specification are not tested. We also give a detail report on the missing test cases against Bluetooth core specification, and provide a guide on further test case generation in the future.

  12. Development and validation of a new instrument for testing functional health literacy in Japanese adults.

    PubMed

    Nakagami, Katsuyuki; Yamauchi, Toyoaki; Noguchi, Hiroyuki; Maeda, Tohru; Nakagami, Tomoko

    2014-06-01

    This study aimed to develop a reliable and valid measure of functional health literacy in a Japanese clinical setting. Test development consisted of three phases: generation of an item pool, consultation with experts to assess content validity, and comparison with external criteria (the Japanese Health Knowledge Test) to assess criterion validity. A trial version of the test was administered to 535 Japanese outpatients. Internal consistency reliability, calculated by Cronbach's alpha, was 0.81, and concurrent validity was moderate. Receiver Operating Characteristics and Item Response Theory were used to classify patients as having adequate, marginal, or inadequate functional health literacy. Both inadequate and marginal functional health literacy were associated with older age, lower income, lower educational attainment, and poor health knowledge. The time required to complete the test was 10-15 min. This test should enable health workers to better identify patients with inadequate health literacy. © 2013 Wiley Publishing Asia Pty Ltd.

  13. Testing the Predictive Validity of the Hendrich II Fall Risk Model.

    PubMed

    Jung, Hyesil; Park, Hyeoun-Ae

    2018-03-01

    Cumulative data on patient fall risk have been compiled in electronic medical records systems, and it is possible to test the validity of fall-risk assessment tools using these data between the times of admission and occurrence of a fall. The Hendrich II Fall Risk Model scores assessed during three time points of hospital stays were extracted and used for testing the predictive validity: (a) upon admission, (b) when the maximum fall-risk score from admission to falling or discharge, and (c) immediately before falling or discharge. Predictive validity was examined using seven predictive indicators. In addition, logistic regression analysis was used to identify factors that significantly affect the occurrence of a fall. Among the different time points, the maximum fall-risk score assessed between admission and falling or discharge showed the best predictive performance. Confusion or disorientation and having a poor ability to rise from a sitting position were significant risk factors for a fall.

  14. Groundwater Model Validation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ahmed E. Hassan

    2006-01-24

    Models have an inherent uncertainty. The difficulty in fully characterizing the subsurface environment makes uncertainty an integral component of groundwater flow and transport models, which dictates the need for continuous monitoring and improvement. Building and sustaining confidence in closure decisions and monitoring networks based on models of subsurface conditions require developing confidence in the models through an iterative process. The definition of model validation is postulated as a confidence building and long-term iterative process (Hassan, 2004a). Model validation should be viewed as a process not an end result. Following Hassan (2004b), an approach is proposed for the validation process ofmore » stochastic groundwater models. The approach is briefly summarized herein and detailed analyses of acceptance criteria for stochastic realizations and of using validation data to reduce input parameter uncertainty are presented and applied to two case studies. During the validation process for stochastic models, a question arises as to the sufficiency of the number of acceptable model realizations (in terms of conformity with validation data). Using a hierarchical approach to make this determination is proposed. This approach is based on computing five measures or metrics and following a decision tree to determine if a sufficient number of realizations attain satisfactory scores regarding how they represent the field data used for calibration (old) and used for validation (new). The first two of these measures are applied to hypothetical scenarios using the first case study and assuming field data consistent with the model or significantly different from the model results. In both cases it is shown how the two measures would lead to the appropriate decision about the model performance. Standard statistical tests are used to evaluate these measures with the results indicating they are appropriate measures for evaluating model realizations. The use of

  15. Reliability and Validity Testing of the Physical Resilience Measure

    ERIC Educational Resources Information Center

    Resnick, Barbara; Galik, Elizabeth; Dorsey, Susan; Scheve, Ann; Gutkin, Susan

    2011-01-01

    Objective: The purpose of this study was to test reliability and validity of the Physical Resilience Scale. Methods: A single-group repeated measure design was used and 130 older adults from three different housing sites participated. Participants completed the Physical Resilience Scale, Hardy-Gill Resilience Scale, 14-item Resilience Scale,…

  16. Contemporary Test Validity in Theory and Practice: A Primer for Discipline-Based Education Researchers.

    PubMed

    Reeves, Todd D; Marbach-Ad, Gili

    2016-01-01

    Most discipline-based education researchers (DBERs) were formally trained in the methods of scientific disciplines such as biology, chemistry, and physics, rather than social science disciplines such as psychology and education. As a result, DBERs may have never taken specific courses in the social science research methodology--either quantitative or qualitative--on which their scholarship often relies so heavily. One particular aspect of (quantitative) social science research that differs markedly from disciplines such as biology and chemistry is the instrumentation used to quantify phenomena. In response, this Research Methods essay offers a contemporary social science perspective on test validity and the validation process. The instructional piece explores the concepts of test validity, the validation process, validity evidence, and key threats to validity. The essay also includes an in-depth example of a validity argument and validation approach for a test of student argument analysis. In addition to DBERs, this essay should benefit practitioners (e.g., lab directors, faculty members) in the development, evaluation, and/or selection of instruments for their work assessing students or evaluating pedagogical innovations. © 2016 T. D. Reeves and G. Marbach-Ad. CBE—Life Sciences Education © 2016 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  17. Validation of a Video-based Game-Understanding Test Procedure in Badminton.

    ERIC Educational Resources Information Center

    Blomqvist, Minna T.; Luhtanen, Pekka; Laakso, Lauri; Keskinen, Esko

    2000-01-01

    Reports the development and validation of video-based game-understanding tests in badminton for elementary and secondary students. The tests included different sequences that simulated actual game situations. Players had to solve tactical problems by selecting appropriate solutions and arguments for their decisions. Results suggest that the test…

  18. Validity Inferences under High-Stakes Conditions: A Response from Language Testing

    ERIC Educational Resources Information Center

    Hill, Kathryn; McNamara, Tim

    2015-01-01

    Those who work in second- and foreign-language testing often find Koretz's concern for validity inferences under high-stakes (VIHS) conditions both welcome and familiar. While the focus of the article is more narrowly on the potential for two instructional responses to test-based accountability, "reallocation" and "coaching,"…

  19. WEC-SIM Phase 1 Validation Testing -- Numerical Modeling of Experiments: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ruehl, Kelley; Michelen, Carlos; Bosma, Bret

    2016-08-01

    The Wave Energy Converter Simulator (WEC-Sim) is an open-source code jointly developed by Sandia National Laboratories and the National Renewable Energy Laboratory. It is used to model wave energy converters subjected to operational and extreme waves. In order for the WEC-Sim code to be beneficial to the wave energy community, code verification and physical model validation is necessary. This paper describes numerical modeling of the wave tank testing for the 1:33-scale experimental testing of the floating oscillating surge wave energy converter. The comparison between WEC-Sim and the Phase 1 experimental data set serves as code validation. This paper is amore » follow-up to the WEC-Sim paper on experimental testing, and describes the WEC-Sim numerical simulations for the floating oscillating surge wave energy converter.« less

  20. Validation of laboratory-scale recycling test method of paper PSA label products

    Treesearch

    Carl Houtman; Karen Scallon; Richard Oldack

    2008-01-01

    Starting with test methods and a specification developed by the U.S. Postal Service (USPS) Environmentally Benign Pressure Sensitive Adhesive Postage Stamp Program, a laboratory-scale test method and a specification were developed and validated for pressure-sensitive adhesive labels, By comparing results from this new test method and pilot-scale tests, which have been...

  1. CAPTIONALS: A computer aided testing environment for the verification and validation of communication protocols

    NASA Technical Reports Server (NTRS)

    Feng, C.; Sun, X.; Shen, Y. N.; Lombardi, Fabrizio

    1992-01-01

    This paper covers the verification and protocol validation for distributed computer and communication systems using a computer aided testing approach. Validation and verification make up the so-called process of conformance testing. Protocol applications which pass conformance testing are then checked to see whether they can operate together. This is referred to as interoperability testing. A new comprehensive approach to protocol testing is presented which address: (1) modeling for inter-layer representation for compatibility between conformance and interoperability testing; (2) computational improvement to current testing methods by using the proposed model inclusive of formulation of new qualitative and quantitative measures and time-dependent behavior; (3) analysis and evaluation of protocol behavior for interactive testing without extensive simulation.

  2. The Validity of Value-Added Estimates from Low-Stakes Testing Contexts: The Impact of Change in Test-Taking Motivation and Test Consequences

    ERIC Educational Resources Information Center

    Finney, Sara J.; Sundre, Donna L.; Swain, Matthew S.; Williams, Laura M.

    2016-01-01

    Accountability mandates often prompt assessment of student learning gains (e.g., value-added estimates) via achievement tests. The validity of these estimates have been questioned when performance on tests is low stakes for students. To assess the effects of motivation on value-added estimates, we assigned students to one of three test consequence…

  3. Validation of software for calculating the likelihood ratio for parentage and kinship.

    PubMed

    Drábek, J

    2009-03-01

    Although the likelihood ratio is a well-known statistical technique, commercial off-the-shelf (COTS) software products for its calculation are not sufficiently validated to suit general requirements for the competence of testing and calibration laboratories (EN/ISO/IEC 17025:2005 norm) per se. The software in question can be considered critical as it directly weighs the forensic evidence allowing judges to decide on guilt or innocence or to identify person or kin (i.e.: in mass fatalities). For these reasons, accredited laboratories shall validate likelihood ratio software in accordance with the above norm. To validate software for calculating the likelihood ratio in parentage/kinship scenarios I assessed available vendors, chose two programs (Paternity Index and familias) for testing, and finally validated them using tests derived from elaboration of the available guidelines for the field of forensics, biomedicine, and software engineering. MS Excel calculation using known likelihood ratio formulas or peer-reviewed results of difficult paternity cases were used as a reference. Using seven testing cases, it was found that both programs satisfied the requirements for basic paternity cases. However, only a combination of two software programs fulfills the criteria needed for our purpose in the whole spectrum of functions under validation with the exceptions of providing algebraic formulas in cases of mutation and/or silent allele.

  4. Rigging Test Bed Development for Validation of Multi-Stage Decelerator Extractions

    NASA Technical Reports Server (NTRS)

    Kenig, Sivan J.; Gallon, John C.; Adams, Douglas S.; Rivellini, Tommaso P.

    2013-01-01

    The Low Density Supersonic Decelerator project is developing new decelerator systems for Mars entry which would include testing with a Supersonic Flight Dynamics Test Vehicle. One of the decelerator systems being developed is a large supersonic ringsail parachute. Due to the configuration of the vehicle it is not possible to deploy the parachute with a mortar which would be the preferred method for a spacecraft in a supersonic flow. Alternatively, a multi-stage extraction process using a ballute as a pilot is being developed for the test vehicle. The Rigging Test Bed is a test venue being constructed to perform verification and validation of this extraction process. The test bed consists of a long pneumatic piston device capable of providing a constant force simulating the ballute drag force during the extraction events. The extraction tests will take place both inside a high-bay for frequent tests of individual extraction stages and outdoors using a mobile hydraulic crane for complete deployment tests from initial pack pull out to canopy extraction. These tests will measure line tensions and use photogrammetry to track motion of the elements involved. The resulting data will be used to verify packing and rigging as well, as validate models and identify potential failure modes in order to finalize the design of the extraction system.

  5. Validity of a basketball-specific complex test in female professional players.

    PubMed

    Schwesig, René; Hermassi, Souhail; Lauenroth, Andreas; Laudner, Kevin; Koke, Alexander; Bartels, Thomas; Delank, Stefan; Schulze, Stephan

    2018-06-01

    The purpose of this study was to assess the validity of a new basketball-specific complex test (BBCT) based on the ascertained match performance.Fourteen female professional basketball players (ages: 23.4 ± 1.8 years) performed the BBCT and a treadmill test (TT) at the beginning of pre-season training. Lactate, heart rate (HR), time, shooting precision and number of errors were measured during the four test sequences of the BBCT (short distance sprinting with direction changes, with and without a ball; fast break; lay-up parcours; sprint endurance test). In addition, lactate threshold (LT) and HR were assessed at selected times throughout the TT and the BBCT and over 6 (TT) or 10 (BBCT) minutes after the tests. The match performance score (mps) was calculated on specific parameters (e. g. points) collected during all matches during the subsequent season (22 matches). The mps served as the "gold standard" within the validation process for the BBCT and the TT.TT parameters demonstrated an explained variance (EV) between 0 % (HR recovery) and 11 % (running speed at 6 mmol/l LT). The EV from the BBCT was higher and ranged from 0 % (HR recovery 6 minutes after end of exercise) to 28 % (sprint endurance test after 8 of 10 sprints). Ten out of 21 BBCT parameters (48 %) and 2 out of 5 TT parameters (40 %) demonstrated an EV higher than 10 %. Average EV for all parameters was 12 % (BBCT) and 6 % (TT), respectively. The BBCT had a higher validity than the TT for predicting match performance. These findings suggest that coaches and scientists should consider using the BBCT testing protocol to estimate the match performance abilities of elite female players. © Georg Thieme Verlag KG Stuttgart · New York.

  6. Simple shoulder test and Oxford Shoulder Score: Persian translation and cross-cultural validation.

    PubMed

    Naghdi, Soofia; Nakhostin Ansari, Noureddin; Rustaie, Nilufar; Akbari, Mohammad; Ebadi, Safoora; Senobari, Maryam; Hasson, Scott

    2015-12-01

    To translate, culturally adapt, and validate the simple shoulder test (SST) and Oxford Shoulder Score (OSS) into Persian language using a cross-sectional and prospective cohort design. A standard forward and backward translation was followed to culturally adapt the SST and the OSS into Persian language. Psychometric properties of floor and ceiling effects, construct convergent validity, discriminant validity, internal consistency reliability, test-retest reliability, standard error of the measurement (SEM), smallest detectable change (SDC), and factor structure were determined. One hundred patients with shoulder disorders and 50 healthy subjects participated in the study. The PSST and the POSS showed no missing responses. No floor or ceiling effects were observed. Both the PSST and POSS detected differences between patients and healthy subjects supporting their discriminant validity. Construct convergent validity was confirmed by a very good correlation between the PSST and POSS (r = 0.68). There was high internal consistency for both the PSST (α = 0.73) and the POSS (α = 0.91 and 0.92). Test-retest reliability with 1-week interval was excellent (ICCagreement = 0.94 for PSST and 0.90 for POSS). Factor analyses demonstrated a three-factor solution for the PSST (49.7 % of variance) and a two-factor solution for the POSS (61.6 % of variance). The SEM/SDC was satisfactory for PSST (5.5/15.3) and POSS (6.8/18.8). The PSST and POSS are valid and reliable outcome measures for assessing functional limitations in Persian-speaking patients with shoulder disorders.

  7. Exploring the reliability and validity of the social-moral awareness test.

    PubMed

    Livesey, Alexandra; Dodd, Karen; Pote, Helen; Marlow, Elizabeth

    2012-11-01

    The aim of the study was to explore the validity of the social-moral awareness test (SMAT) a measure designed for assessing socio-moral rule knowledge and reasoning in people with learning disabilities. Comparisons between Theory of Mind and socio-moral reasoning allowed the exploration of construct validity of the tool. Factor structure, reliability and discriminant validity were also assessed. Seventy-one participants with mild-moderate learning disabilities completed the two scales of the SMAT and two False Belief Tasks for Theory of Mind. Reliability of the SMAT was very good, and the scales were shown to be uni-dimensional in factor structure. There was a significant positive relationship between Theory of Mind and both SMAT scales. There is early evidence of the construct validity and reliability of the SMAT. Further assessment of the validity of the SMAT will be required. © 2012 Blackwell Publishing Ltd.

  8. Publishing nutrition research: validity, reliability, and diagnostic test assessment in nutrition-related research.

    PubMed

    Gleason, Philip M; Harris, Jeffrey; Sheean, Patricia M; Boushey, Carol J; Bruemmer, Barbara

    2010-03-01

    This is the sixth in a series of monographs on research design and analysis. The purpose of this article is to describe and discuss several concepts related to the measurement of nutrition-related characteristics and outcomes, including validity, reliability, and diagnostic tests. The article reviews the methodologic issues related to capturing the various aspects of a given nutrition measure's reliability, including test-retest, inter-item, and interobserver or inter-rater reliability. Similarly, it covers content validity, indicators of absolute vs relative validity, and internal vs external validity. With respect to diagnostic assessment, the article summarizes the concepts of sensitivity and specificity. The hope is that dietetics practitioners will be able to both use high-quality measures of nutrition concepts in their research and recognize these measures in research completed by others. Copyright 2010 American Dietetic Association. Published by Elsevier Inc. All rights reserved.

  9. Performance Validity Testing in Neuropsychology: Methods for Measurement Development and Maximizing Diagnostic Accuracy.

    PubMed

    Wodushek, Thomas R; Greher, Michael R

    2017-05-01

    In the first column in this 2-part series, Performance Validity Testing in Neuropsychology: Scientific Basis and Clinical Application-A Brief Review, the authors introduced performance validity tests (PVTs) and their function, provided a justification for why they are necessary, traced their ongoing endorsement by neuropsychological organizations, and described how they are used and interpreted by ever increasing numbers of clinical neuropsychologists. To enhance readers' understanding of these measures, this second column briefly describes common detection strategies used in PVTs as well as the typical methods used to validate new PVTs and determine cut scores for valid/invalid determinations. We provide a discussion of the latest research demonstrating how neuropsychologists can combine multiple PVTs in a single battery to improve sensitivity/specificity to invalid responding. Finally, we discuss future directions for the research and application of PVTs.

  10. Development of a framework for international certification by OIE of diagnostic tests validated as fit for purpose.

    PubMed

    Wright, P; Edwards, S; Diallo, A; Jacobson, R

    2006-01-01

    Historically, the OIE has focused on test methods applicable to trade and the international movement of animals and animal products. With its expanding role as the World Organisation for Animal Health, the OIE has recognised the need to evaluate test methods relative to specific diagnostic applications other than trade. In collaboration with its international partners, the OIE solicited input from experts through consultants' meetings on the development of guidelines for validation and certification of diagnostic assays for infectious animal diseases. Recommendations from the first meeting were formally adopted and have subsequently been acted upon by the OIE. A validation template has been developed that specifically requires a test to be fit or suited for its intended purpose (e.g. as a screening or a confirmatory test). This is a key criterion for validation. The template incorporates four distinct stages of validation, each of which has bearing on the evaluation of fitness for purpose. The OIE has just recently created a registry for diagnostic tests that fulfil these validation requirements. Assay developers are invited to submit validation dossiers to the OIE for evaluation by a panel of experts. Recognising that validation is an incremental process, tests methods achieving at least the first stages of validation may be provisionally accepted. To provide additional confidence in assay performance, the OIE, through its network of Reference Laboratories, has embarked on the development of evaluation panels. These panels would contain specially selected test samples that would assist in verifying fitness for purpose.

  11. Criterion Related Validity of Karate Specific Aerobic Test (KSAT)

    PubMed Central

    Chaabene, Helmi; Hachana, Younes; Franchini, Emerson; Tabben, Montassar; Mkaouer, Bessem; Negra, Yassine; Hammami, Mehrez; Chamari, Karim

    2015-01-01

    Background: Karate is one the most popular combat sports in the world. Physical fitness assessment on a regular manner is important for monitoring the effectiveness of the training program and the readiness of karatekas to compete. Objectives: The aim of this research was to examine the criterion related to validity of the karate specific aerobic test (KSAT) as an indicator of aerobic level of karate practitioners. Patients and Methods: Cardiorespiratory responses, aerobic performance level through both treadmill laboratory test and YoYo intermittent recovery test level 1 (YoYoIRTL1) as well as time to exhaustion in the KSAT test (TE’KSAT) were determined in a total of fifteen healthy international karatekas (i.e. karate practitioners) (means ± SD: age: 22.2 ± 4.3 years; height: 176.4 ± 7.5 cm; body mass: 70.3 ± 9.7 kg and body fat: 13.2 ± 6%). Results: Peak heart rate obtained from KSAT represented ~99% of maximal heart rate registered during the treadmill test showing that KSAT imposes high physiological demands. There was no significant correlation between KSAT’s TE and relative (mL/min kg) treadmill maximal oxygen uptake (r = 0.14; P = 0.69; [small]). On the other hand, there was a significant relationship between KSAT’s TE and the velocity associated with VO2max (vVO2max) (r = 0.67; P = 0.03; [large]) as well as the velocity at VO2 corresponding to the second ventilatory threshold (vVO2 VAT) (r = 0.64; P = 0.04; [large]). Moreover, significant relationship was found between TE’s KSAT and both the total distance covered and parameters of intermittent endurance measured through YoYoIRTL1. Conclusions: The KSAT has not proved to have indirect criterion related validity as no significant correlations have been found between TE’s KSAT and treadmill VO2max. Nevertheless, as correlated to other aerobic fitness variables, KSAT can be considered as an indicator of karate specific endurance. The establishment of the criterion related validity of the KSAT

  12. Item Development and Validity Testing for a Self- and Proxy Report: The Safe Driving Behavior Measure

    PubMed Central

    Classen, Sherrilene; Winter, Sandra M.; Velozo, Craig A.; Bédard, Michel; Lanford, Desiree N.; Brumback, Babette; Lutz, Barbara J.

    2010-01-01

    OBJECTIVE We report on item development and validity testing of a self-report older adult safe driving behaviors measure (SDBM). METHOD On the basis of theoretical frameworks (Precede–Proceed Model of Health Promotion, Haddon’s matrix, and Michon’s model), existing driving measures, and previous research and guided by measurement theory, we developed items capturing safe driving behavior. Item development was further informed by focus groups. We established face validity using peer reviewers and content validity using expert raters. RESULTS Peer review indicated acceptable face validity. Initial expert rater review yielded a scale content validity index (CVI) rating of 0.78, with 44 of 60 items rated ≥0.75. Sixteen unacceptable items (≤0.5) required major revision or deletion. The next CVI scale average was 0.84, indicating acceptable content validity. CONCLUSION The SDBM has relevance as a self-report to rate older drivers. Future pilot testing of the SDBM comparing results with on-road testing will define criterion validity. PMID:20437917

  13. Comprehensive validation scheme for in situ fiber optics dissolution method for pharmaceutical drug product testing.

    PubMed

    Mirza, Tahseen; Liu, Qian Julie; Vivilecchia, Richard; Joshi, Yatindra

    2009-03-01

    There has been a growing interest during the past decade in the use of fiber optics dissolution testing. Use of this novel technology is mainly confined to research and development laboratories. It has not yet emerged as a tool for end product release testing despite its ability to generate in situ results and efficiency improvement. One potential reason may be the lack of clear validation guidelines that can be applied for the assessment of suitability of fiber optics. This article describes a comprehensive validation scheme and development of a reliable, robust, reproducible and cost-effective dissolution test using fiber optics technology. The test was successfully applied for characterizing the dissolution behavior of a 40-mg immediate-release tablet dosage form that is under development at Novartis Pharmaceuticals, East Hanover, New Jersey. The method was validated for the following parameters: linearity, precision, accuracy, specificity, and robustness. In particular, robustness was evaluated in terms of probe sampling depth and probe orientation. The in situ fiber optic method was found to be comparable to the existing manual sampling dissolution method. Finally, the fiber optic dissolution test was successfully performed by different operators on different days, to further enhance the validity of the method. The results demonstrate that the fiber optics technology can be successfully validated for end product dissolution/release testing. (c) 2008 Wiley-Liss, Inc. and the American Pharmacists Association

  14. Fundamentals of endoscopic surgery: creation and validation of the hands-on test.

    PubMed

    Vassiliou, Melina C; Dunkin, Brian J; Fried, Gerald M; Mellinger, John D; Trus, Thadeus; Kaneva, Pepa; Lyons, Calvin; Korndorffer, James R; Ujiki, Michael; Velanovich, Vic; Kochman, Michael L; Tsuda, Shawn; Martinez, Jose; Scott, Daniel J; Korus, Gary; Park, Adrian; Marks, Jeffrey M

    2014-03-01

    The Fundamentals of Endoscopic Surgery™ (FES) program consists of online materials and didactic and skills-based tests. All components were designed to measure the skills and knowledge required to perform safe flexible endoscopy. The purpose of this multicenter study was to evaluate the reliability and validity of the hands-on component of the FES examination, and to establish the pass score. Expert endoscopists identified the critical skill set required for flexible endoscopy. They were then modeled in a virtual reality simulator (GI Mentor™ II, Simbionix™ Ltd., Airport City, Israel) to create five tasks and metrics. Scores were designed to measure both speed and precision. Validity evidence was assessed by correlating performance with self-reported endoscopic experience (surgeons and gastroenterologists [GIs]). Internal consistency of each test task was assessed using Cronbach's alpha. Test-retest reliability was determined by having the same participant perform the test a second time and comparing their scores. Passing scores were determined by a contrasting groups methodology and use of receiver operating characteristic curves. A total of 160 participants (17 % GIs) performed the simulator test. Scores on the five tasks showed good internal consistency reliability and all had significant correlations with endoscopic experience. Total FES scores correlated 0.73, with participants' level of endoscopic experience providing evidence of their validity, and their internal consistency reliability (Cronbach's alpha) was 0.82. Test-retest reliability was assessed in 11 participants, and the intraclass correlation was 0.85. The passing score was determined and is estimated to have a sensitivity (true positive rate) of 0.81 and a 1-specificity (false positive rate) of 0.21. The FES hands-on skills test examines the basic procedural components required to perform safe flexible endoscopy. It meets rigorous standards of reliability and validity required for high

  15. Construct validity of the Free and Cued Selective Reminding Test in older adults with memory complaints.

    PubMed

    Clerici, Francesca; Ghiretti, Roberta; Di Pucchio, Alessandra; Pomati, Simone; Cucumo, Valentina; Marcone, Alessandra; Vanacore, Nicola; Mariani, Claudio; Cappa, Stefano Francesco

    2017-06-01

    The Free and Cued Selective Reminding Test (FCSRT) is the memory test recommended by the International Working Group on Alzheimer's disease (AD) for the detection of amnestic syndrome of the medial temporal type in prodromal AD. Assessing the construct validity and internal consistency of the Italian version of the FCSRT is thus crucial. The FCSRT was administered to 338 community-dwelling participants with memory complaints (57% females, age 74.5 ± 7.7 years), including 34 with AD, 203 with Mild Cognitive Impairment, and 101 with Subjective Memory Impairment. Internal Consistency was estimated using Cronbach's alpha coefficient. To assess convergent validity, five FCSRT scores (Immediate Free Recall, Immediate Total Recall, Delayed Free Recall, Delayed Total Recall, and Index of Sensitivity of Cueing) were correlated with three well-validated memory tests: Story Recall, Rey Auditory Verbal Learning test, and Rey Complex Figure (RCF) recall (partial correlation analysis). To assess divergent validity, a principal component analysis (an exploratory factor analysis) was performed including, in addition to the above-mentioned memory tasks, the following tests: Word Fluencies, RCF copy, Clock Drawing Test, Trail Making Test, Frontal Assessment Battery, Raven Coloured Progressive Matrices, and Stroop Colour-Word Test. Cronbach's alpha coefficients for immediate recalls (IFR and ITR) and delayed recalls (DFR and DTR) were, respectively, .84 and .81. All FCSRT scores were highly correlated with those of the three well-validated memory tests. The factor analysis showed that the FCSRT does not load on the factors saturated by non-memory tests. These findings indicate that the FCSRT has a good internal consistency and has an excellent construct validity as an episodic memory measure. © 2015 The British Psychological Society.

  16. Pretest information for a test to validate plume simulation procedures (FA-17)

    NASA Technical Reports Server (NTRS)

    Hair, L. M.

    1978-01-01

    The results of an effort to plan a final verification wind tunnel test to validate the recommended correlation parameters and application techniques were presented. The test planning effort was complete except for test site finalization and the associated coordination. Two suitable test sites were identified. Desired test conditions were shown. Subsequent sections of this report present the selected model and test site, instrumentation of this model, planned test operations, and some concluding remarks.

  17. Developing and Validating Proof Comprehension Tests in Undergraduate Mathematics

    ERIC Educational Resources Information Center

    Mejía-Ramos, Juan Pablo; Lew, Kristen; de la Torre, Jimmy; Weber, Keith

    2017-01-01

    In this article, we describe and illustrate the process by which we developed and validated short, multiple-choice, reliable tests to assess undergraduate students' comprehension of three mathematical proofs. We discuss the purpose for each stage and how it benefited the design of our instruments. We also suggest ways in which this process could…

  18. Item difficulty and item validity for the Children's Group Embedded Figures Test.

    PubMed

    Rusch, R R; Trigg, C L; Brogan, R; Petriquin, S

    1994-02-01

    The validity and reliability of the Children's Group Embedded Figures Test was reported for students in Grade 2 by Cromack and Stone in 1980; however, a search of the literature indicates no evidence for internal consistency or item analysis. Hence the purpose of this study was to examine the item difficulty and item validity of the test with children in Grades 1 and 2. Confusion in the literature over development and use of this test was seemingly resolved through analysis of these descriptions and through an interview with the test developer. One early-appearing item was unreasonably difficult. Two or three other items were quite difficult and made little contribution to the total score. Caution is recommended, however, in any reordering or elimination of items based on these findings, given the limited number of subjects (n = 84).

  19. The prone bridge test: Performance, validity, and reliability among older and younger adults.

    PubMed

    Bohannon, Richard W; Steffl, Michal; Glenney, Susan S; Green, Michelle; Cashwell, Leah; Prajerova, Kveta; Bunn, Jennifer

    2018-04-01

    The prone bridge maneuver, or plank, has been viewed as a potential alternative to curl-ups for assessing trunk muscle performance. The purpose of this study was to assess prone bridge test performance, validity, and reliability among younger and older adults. Sixty younger (20-35 years old) and 60 older (60-79 years old) participants completed this study. Groups were evenly divided by sex. Participants completed surveys regarding physical activity and abdominal exercise participation. Height, weight, body mass index (BMI), and waist circumference were measured. On two occasions, 5-9 days apart, participants held a prone bridge until volitional exhaustion or until repeated technique failure. Validity was examined using data from the first session: convergent validity by calculating correlations between survey responses, anthropometrics, and prone bridge time, known groups validity by using an ANOVA comparing bridge times of younger and older adults and of men and women. Test-retest reliability was examined by using a paired t-test to compare prone bridge times for Session1 and Session 2. Furthermore, an intraclass correlation coefficient (ICC) was used to characterize relative reliability and minimal detectable change (MDC 95% ) was used to describe absolute reliability. The mean prone bridge time was 145.3 ± 71.5 s, and was positively correlated with physical activity participation (p ≤ 0.001) and negatively correlated with BMI and waist circumference (p ≤ 0.003). Younger participants had significantly longer plank times than older participants (p = 0.003). The ICC between testing sessions was 0.915. The prone bridge test is a valid and reliable measure for evaluating abdominal performance in both younger and older adults. Copyright © 2017 Elsevier Ltd. All rights reserved.

  20. Reliability and Validity of Information about Student Achievement: Comparing Large-Scale and Classroom Testing Contexts

    ERIC Educational Resources Information Center

    Cizek, Gregory J.

    2009-01-01

    Reliability and validity are two characteristics that must be considered whenever information about student achievement is collected. However, those characteristics--and the methods for evaluating them--differ in large-scale testing and classroom testing contexts. This article presents the distinctions between reliability and validity in the two…

  1. Evaluation and Validation of Case 2 Algorithms in Chesapeake Bay

    NASA Technical Reports Server (NTRS)

    Harding, Lawrence W., Jr.; Magnuson, Adrea

    2004-01-01

    The high temporal and spatial resolution of satellite ocean color observations will prove invaluable for monitoring the health of coastal ecosystems where physical and biological variability demands sampling scales beyond that possible by ship. However, ocean color remote sensing of Case 2 waters is a challenging undertaking due to the optical complexity of the water. The focus of this SIMBIOS support has been to provide in situ optical measurements form Chesapeake Bay (CB) and adjacent mid-Atlantic bight (MAB) waters for use in algorithm development and validation efforts to improve the satellite retrieval of chlorophyll (chl a) in Case 2 waters. CB provides a valuable site for validation of data from ocean color sensors for a number of reasons. First, the physical dimensions of the Bay (greater than 6,500 square kilometers) make retrievals from satellites with a spatial resolution of approximately 1 kilometer (i.e., SeaWiFS) or less (i.e., MODIS) reasonable for most of the ecosystem. Second, CB is highly influenced by freshwater flow from major rivers, making it a classic Case 2 water body with significant concentrations of chlorophyll, particulates and chromophoric dissolved organic matter (CDOM) that highly impact the shape of reflectance spectra. Finally, past and ongoing research efforts provided an expensive data set of optical observations that support the goal of this project.

  2. Inter-Rater Reliability and Validity of the Australian Football League’s Kicking and Handball Tests

    PubMed Central

    Cripps, Ashley J.; Hopper, Luke S.; Joyce, Christopher

    2015-01-01

    Talent identification tests used at the Australian Football League’s National Draft Combine assess the capacities of athletes to compete at a professional level. Tests created for the National Draft Combine are also commonly used for talent identification and athlete development in development pathways. The skills tests created by the Australian Football League required players to either handball (striking the ball with the hand) or kick to a series of 6 randomly generated targets. Assessors subjectively rate each skill execution giving a 0-5 score for each disposal. This study aimed to investigate the inter-rater reliability and validity of the skills tests at an adolescent sub-elite level. Male Australian footballers were recruited from sub-elite adolescent teams (n = 121, age = 15.7 ± 0.3 years, height = 1.77 ± 0.07 m, mass = 69.17 ± 8.08 kg). The coaches (n = 7) of each team were also recruited. Inter-rater reliability was assessed using Inter-class correlations (ICC) and Limits of Agreement statistics. Both the kicking (ICC = 0.96, p < .01) and handball tests (ICC = 0.89, p < .01) demonstrated strong reliability and acceptable levels of absolute agreement. Content validity was determined by examining the test scores sensitivity to laterality and distance. Concurrent validity was assessed by comparing coaches’ perceptions of skill to actual test outcomes. Multivariate analysis of variance (MANOVA) examined the main effect of laterality, with scores on the dominant hand (p = .04) and foot (p < .01) significantly higher compared to the non-dominant side. Follow-up univariate analysis reported significant differences at every distance in the kicking test. A poor correlation was found between coaches’ perceptions of skill and testing outcomes. The results of this study demonstrate both skill tests demonstrate acceptable inter-rater reliable. Partial content validity was confirmed for the kicking test, however further research is required to confirm

  3. A Rasch-Based Validation of the Vocabulary Size Test

    ERIC Educational Resources Information Center

    Beglar, David

    2010-01-01

    The primary purpose of this study was to provide preliminary validity evidence for a 140-item form of the Vocabulary Size Test, which is designed to measure written receptive knowledge of the first 14,000 words of English. Nineteen native speakers of English and 178 native speakers of Japanese participated in the study. Analyses based on the Rasch…

  4. Validity and test-retest reliability of an at-work production loss instrument.

    PubMed

    Aboagye, E; Jensen, I; Bergström, G; Hagberg, J; Axén, I; Lohela-Karlsson, M

    2016-07-01

    Besides causing ill health, a poor work environment may contribute to production loss. Production loss assessment instruments emphasize health-related consequences but there is no instrument to measure reduced work performance related to the work environment. To examine convergent validity and test-retest reliability of health-related production loss (HRPL) and work environment-related production loss (WRPL) against a valid comparable instrument, the Health and Work Performance Questionnaire (HPQ). Cross-sectional study of employees, not on sick leave, who were asked to self-rate their work performance and production losses. Using the Pearson correlation and Bland and Altman's Test of Agreement, convergent validity was examined. Subgroup analyses were performed for employees recording problem-specific reduced work performance. Consistency of pairs of HRPL and WRPL for samples responding to both assessments was expressed using Intraclass Correlation Coefficient (ICC) and tests of repeatability. A total of 88 employees participated and 44 responded to both assessments. Test of agreement between measurements estimates a mean difference of 0.34 for HRPL and -0.03 for WRPL compared with work performance. This indicates that the production loss questions are valid and moderately associated with work performance for the total sample and subgroups. ICC for paired HRPL assessments was 0.90 and 0.91 for WRPL, i.e. the test-retest reliability was good and suggests stability in the instrument. HRPL and WRPL can be used to measure production loss due to health-related and work environment-related problems. These results may have implications for advancing methods of assessing production loss, which represents an important cost to employers. © The Author 2016. Published by Oxford University Press on behalf of the Society of Occupational Medicine. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  5. An investigation of new toxicity test method performance in validation studies: 1. Toxicity test methods that have predictive capacity no greater than chance.

    PubMed

    Bruner, L H; Carr, G J; Harbell, J W; Curren, R D

    2002-06-01

    An approach commonly used to measure new toxicity test method (NTM) performance in validation studies is to divide toxicity results into positive and negative classifications, and the identify true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results. After this step is completed, the contingent probability statistics (CPS), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are calculated. Although these statistics are widely used and often the only statistics used to assess the performance of toxicity test methods, there is little specific guidance in the validation literature on what values for these statistics indicate adequate performance. The purpose of this study was to begin developing data-based answers to this question by characterizing the CPS obtained from an NTM whose data have a completely random association with a reference test method (RTM). Determining the CPS of this worst-case scenario is useful because it provides a lower baseline from which the performance of an NTM can be judged in future validation studies. It also provides an indication of relationships in the CPS that help identify random or near-random relationships in the data. The results from this study of randomly associated tests show that the values obtained for the statistics vary significantly depending on the cut-offs chosen, that high values can be obtained for individual statistics, and that the different measures cannot be considered independently when evaluating the performance of an NTM. When the association between results of an NTM and RTM is random the sum of the complementary pairs of statistics (sensitivity + specificity, NPV + PPV) is approximately 1, and the prevalence (i.e., the proportion of toxic chemicals in the population of chemicals) and PPV are equal. Given that combinations of high sensitivity-low specificity or low specificity-high sensitivity (i.e., the sum of the sensitivity and

  6. Spanish Transcultural Adaptation and Validity of the Behavioral Inattention Test

    PubMed Central

    Sánchez-Cabeza, Ángel; Huertas-Hoyas, Elisabet; Máximo-Bocanegra, Nuria; Rosa María Martínez-Piédrola; Pérez-de-Heredia-Torres, Marta

    2017-01-01

    Objective To adapt, validate, and translate the Behavioral Inattention Test as an assessment tool for Spanish individuals with unilateral spatial neglect. Design A cross-sectional descriptive study. Setting University laboratories. Participants A sample of 75 Spanish stroke patients and 18 healthy control subjects. Interventions Not applicable. Main Outcome Measures The Behavioral Inattention Test. Results The Spanish version of the Behavioral Inattention Test shows a high degree of reliability both in the complete test (α = .90) and in the conventional (α = .93) and behavioral subtests (α = .75). The concurrent validity between the total conventional and behavioral scores was high (r = −.80; p < 0.001). Significant differences were found between patients with and without unilateral spatial neglect (p < 0.001). In the comparison between right and left damaged sides, differences were found in all items, except for article reading (p = 0.156) and card sorting (p = 0.117). Conclusions This measure is a useful tool for evaluating unilateral spatial neglect as it provides information on everyday problems. The BIT discriminates between stroke patients with and without unilateral spatial neglect. This measure constitutes a reliable tool for the diagnosis, planning, performance, and design of specific treatment programs intended to improve the functionality and quality of life of people with unilateral spatial neglect. PMID:29097959

  7. Examining the Validity of GED[R] Tests Scores with Scheduling and Setting Accommodations. GED Testing Service Research Studies, 2004-1

    ERIC Educational Resources Information Center

    George-Ezzelle, Carol E.; Skaggs, Gary

    2004-01-01

    Current testing standards call for test developers to provide evidence that testing procedures and test scores, and the inferences made based on the test scores, show evidence of validity and are comparable across subpopulations (American Educational Research Association [AERA], American Psychological Association [APA], & National Council on…

  8. Test of Achievement in Quantitative Economics for Secondary Schools: Construction and Validation Using Item Response Theory

    ERIC Educational Resources Information Center

    Eleje, Lydia I.; Esomonu, Nkechi P. M.

    2018-01-01

    A Test to measure achievement in quantitative economics among secondary school students was developed and validated in this study. The test is made up 20 multiple choice test items constructed based on quantitative economics sub-skills. Six research questions guided the study. Preliminary validation was done by two experienced teachers in…

  9. Validity and Reliability of Published Comprehensive Theory of Mind Tests for Normal Preschool Children: A Systematic Review.

    PubMed

    Ziatabar Ahmadi, Seyyede Zohreh; Jalaie, Shohreh; Ashayeri, Hassan

    2015-09-01

    Theory of mind (ToM) or mindreading is an aspect of social cognition that evaluates mental states and beliefs of oneself and others. Validity and reliability are very important criteria when evaluating standard tests; and without them, these tests are not usable. The aim of this study was to systematically review the validity and reliability of published English comprehensive ToM tests developed for normal preschool children. We searched MEDLINE (PubMed interface), Web of Science, Science direct, PsycINFO, and also evidence base Medicine (The Cochrane Library) databases from 1990 to June 2015. Search strategy was Latin transcription of 'Theory of Mind' AND test AND children. Also, we manually studied the reference lists of all final searched articles and carried out a search of their references. Inclusion criteria were as follows: Valid and reliable diagnostic ToM tests published from 1990 to June 2015 for normal preschool children; and exclusion criteria were as follows: the studies that only used ToM tests and single tasks (false belief tasks) for ToM assessment and/or had no description about structure, validity or reliability of their tests. METHODological quality of the selected articles was assessed using the Critical Appraisal Skills Programme (CASP). In primary searching, we found 1237 articles in total databases. After removing duplicates and applying all inclusion and exclusion criteria, we selected 11 tests for this systematic review. There were a few valid, reliable and comprehensive ToM tests for normal preschool children. However, we had limitations concerning the included articles. The defined ToM tests were different in populations, tasks, mode of presentations, scoring, mode of responses, times and other variables. Also, they had various validities and reliabilities. Therefore, it is recommended that the researchers and clinicians select the ToM tests according to their psychometric characteristics, validity and reliability.

  10. Validity and Reliability of Published Comprehensive Theory of Mind Tests for Normal Preschool Children: A Systematic Review

    PubMed Central

    Ziatabar Ahmadi, Seyyede Zohreh; Jalaie, Shohreh; Ashayeri, Hassan

    2015-01-01

    Objective: Theory of mind (ToM) or mindreading is an aspect of social cognition that evaluates mental states and beliefs of oneself and others. Validity and reliability are very important criteria when evaluating standard tests; and without them, these tests are not usable. The aim of this study was to systematically review the validity and reliability of published English comprehensive ToM tests developed for normal preschool children. Method: We searched MEDLINE (PubMed interface), Web of Science, Science direct, PsycINFO, and also evidence base Medicine (The Cochrane Library) databases from 1990 to June 2015. Search strategy was Latin transcription of ‘Theory of Mind’ AND test AND children. Also, we manually studied the reference lists of all final searched articles and carried out a search of their references. Inclusion criteria were as follows: Valid and reliable diagnostic ToM tests published from 1990 to June 2015 for normal preschool children; and exclusion criteria were as follows: the studies that only used ToM tests and single tasks (false belief tasks) for ToM assessment and/or had no description about structure, validity or reliability of their tests. Methodological quality of the selected articles was assessed using the Critical Appraisal Skills Programme (CASP). Result: In primary searching, we found 1237 articles in total databases. After removing duplicates and applying all inclusion and exclusion criteria, we selected 11 tests for this systematic review. Conclusion: There were a few valid, reliable and comprehensive ToM tests for normal preschool children. However, we had limitations concerning the included articles. The defined ToM tests were different in populations, tasks, mode of presentations, scoring, mode of responses, times and other variables. Also, they had various validities and reliabilities. Therefore, it is recommended that the researchers and clinicians select the ToM tests according to their psychometric characteristics

  11. Tournament Validity: Testing Golfer Competence

    ERIC Educational Resources Information Center

    Sachau, Daniel; Andrews, Lance; Gibson, Bryan; DeNeui, Daniel

    2009-01-01

    The concept of tournament validity was explored in three studies. In the first study, measures of tournament validity, difficulty, and discrimination were introduced. These measures were illustrated with data from the 2003 Professional Golf Association (PGA) Tour. In the second study, the relationship between difficulty and discrimination was…

  12. Development and clinical validation of the Genedrive point-of-care test for qualitative detection of hepatitis C virus.

    PubMed

    Llibre, Alba; Shimakawa, Yusuke; Mottez, Estelle; Ainsworth, Shaun; Buivan, Tan-Phuc; Firth, Rick; Harrison, Elliott; Rosenberg, Arielle R; Meritet, Jean-François; Fontanet, Arnaud; Castan, Pablo; Madejón, Antonio; Laverick, Mark; Glass, Allison; Viana, Raquel; Pol, Stanislas; McClure, C Patrick; Irving, William Lucien; Miele, Gino; Albert, Matthew L; Duffy, Darragh

    2018-04-03

    Recently approved direct acting antivirals provide transformative therapies for chronic hepatitis C virus (HCV) infection. The major clinical challenge remains to identify the undiagnosed patients worldwide, many of whom live in low-income and middle-income countries, where access to nucleic acid testing remains limited. The aim of this study was to develop and validate a point-of-care (PoC) assay for the qualitative detection of HCV RNA. We developed a PoC assay for the qualitative detection of HCV RNA on the PCR Genedrive instrument. We validated the Genedrive HCV assay through a case-control study comparing results with those obtained with the Abbott RealTi m e HCV test. The PoC assay identified all major HCV genotypes, with a limit of detection of 2362 IU/mL (95% CI 1966 to 2788). Using 422 patients chronically infected with HCV and 503 controls negative for anti-HCV and HCV RNA, the Genedrive HCV assay showed 98.6% sensitivity (95% CI 96.9% to 99.5%) and 100% specificity (95% CI 99.3% to 100%) to detect HCV. In addition, melting peak ratiometric analysis demonstrated proof-of-principle for semiquantification of HCV. The test was further validated in a real clinical setting in a resource-limited country. We report a rapid, simple, portable and accurate PoC molecular test for HCV, with sensitivity and specificity that fulfils the recent FIND/WHO Target Product Profile for HCV decentralised testing in low-income and middle-income countries. This Genedrive HCV assay may positively impact the continuum of HCV care from screening to cure by supporting real-time treatment decisions. NCT02992184. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  13. The Validity and Responsiveness of Isometric Lower Body Multi-Joint Tests of Muscular Strength: a Systematic Review.

    PubMed

    Drake, David; Kennedy, Rodney; Wallace, Eric

    2017-12-01

    Researchers and practitioners working in sports medicine and science require valid tests to determine the effectiveness of interventions and enhance understanding of mechanisms underpinning adaptation. Such decision making is influenced by the supportive evidence describing the validity of tests within current research. The objective of this study is to review the validity of lower body isometric multi-joint tests ability to assess muscular strength and determine the current level of supporting evidence. Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines were followed in a systematic fashion to search, assess and synthesize existing literature on this topic. Electronic databases such as Web of Science, CINAHL and PubMed were searched up to 18 March 2015. Potential inclusions were screened against eligibility criteria relating to types of test, measurement instrument, properties of validity assessed and population group and were required to be published in English. The Consensus-based Standards for the Selection of health Measurement Instruments (COSMIN) checklist was used to assess methodological quality and measurement property rating of included studies. Studies rated as fair or better in methodological quality were included in the best evidence synthesis. Fifty-nine studies met the eligibility criteria for quality appraisal. The ten studies that rated fair or better in methodological quality were included in the best evidence synthesis. The most frequently investigated lower body isometric multi-joint tests for validity were the isometric mid-thigh pull and isometric squat. The validity of each of these tests was strong in terms of reliability and construct validity. The evidence for responsiveness of tests was found to be moderate for the isometric squat test and unknown for the isometric mid-thigh pull. No tests using the isometric leg press met the criteria for inclusion in the best evidence synthesis. Researchers and

  14. Understanding Student Teachers' Behavioural Intention to Use Technology: Technology Acceptance Model (TAM) Validation and Testing

    ERIC Educational Resources Information Center

    Wong, Kung-Teck; Osman, Rosma bt; Goh, Pauline Swee Choo; Rahmat, Mohd Khairezan

    2013-01-01

    This study sets out to validate and test the Technology Acceptance Model (TAM) in the context of Malaysian student teachers' integration of their technology in teaching and learning. To establish factorial validity, data collected from 302 respondents were tested against the TAM using confirmatory factor analysis (CFA), and structural equation…

  15. Ride qualities criteria validation/pilot performance study: Flight test results

    NASA Technical Reports Server (NTRS)

    Nardi, L. U.; Kawana, H. Y.; Greek, D. C.

    1979-01-01

    Pilot performance during a terrain following flight was studied for ride quality criteria validation. Data from manual and automatic terrain following operations conducted during low level penetrations were analyzed to determine the effect of ride qualities on crew performance. The conditions analyzed included varying levels of turbulence, terrain roughness, and mission duration with a ride smoothing system on and off. Limited validation of the B-1 ride quality criteria and some of the first order interactions between ride qualities and pilot/vehicle performance are highlighted. An earlier B-1 flight simulation program correlated well with the flight test results.

  16. Laboratory compliance with the American Society of Clinical Oncology/College of American Pathologists human epidermal growth factor receptor 2 testing guidelines: a 3-year comparison of validation procedures.

    PubMed

    Dyhdalo, Kathryn S; Fitzgibbons, Patrick L; Goldsmith, Jeffery D; Souers, Rhona J; Nakhleh, Raouf E

    2014-07-01

    The American Society of Clinical Oncology/College of American Pathologists (ASCO/CAP) published guidelines in 2007 regarding testing accuracy, interpretation, and reporting of results for HER2 studies. A 2008 survey identified areas needing improved compliance. To reassess laboratory response to those guidelines following a full accreditation cycle for an updated snapshot of laboratory practices regarding ASCO/CAP guidelines. In 2011, a survey was distributed with the HER2 immunohistochemistry (IHC) proficiency testing program identical to the 2008 survey. Of the 1150 surveys sent, 977 (85.0%) were returned, comparable to the original survey response in 2008 (757 of 907; 83.5%). New participants submitted 124 of 977 (12.7%) surveys. The median laboratory accession rate was 14,788 cases with 211 HER2 tests performed annually. Testing was validated with fluorescence in situ hybridization in 49.1% (443 of 902) of the laboratories; 26.3% (224 of 853) of the laboratories used another IHC assay. The median number of cases to validate fluorescence in situ hybridization (n = 40) and IHC (n = 27) was similar to those in 2008. Ninety-five percent concordance with fluorescence in situ hybridization was achieved by 76.5% (254 of 332) of laboratories for IHC(-) findings and 70.4% (233 of 331) for IHC(+) cases. Ninety-five percent concordance with another IHC assay was achieved by 71.1% (118 of 168) of the laboratories for negative findings and 69.6% (112 of 161) of the laboratories for positive cases. The proportion of laboratories interpreting HER2 IHC using ASCO/CAP guidelines (86.6% [798 of 921] in 2011; 83.8% [605 of 722] in 2008) remains similar. Although fixation time improvements have been made, assay validation deficiencies still exist. The results of this survey were shared within the CAP, including the Laboratory Accreditation Program and the ASCO/CAP panel revising the HER2 guidelines published in October 2013. The Laboratory Accreditation Program checklist was

  17. Naturalistic validation of an on-road driving test of older drivers.

    PubMed

    Ott, Brian R; Papandonatos, George D; Davis, Jennifer D; Barco, Peggy P

    2012-08-01

    The objective was to compare a standardized road test to naturalistic driving by older people who may have cognitive impairment to define improvements that could potentially enhance the validity of road testing in this population. Road testing has been widely adapted as a tool to assess driving competence of older people who may be at risk for unsafe driving because of dementia; however, the validity of this approach has not been rigorously evaluated. For 2 weeks, 80 older drivers (38 healthy elders and 42 with cognitive impairment) who passed a standardized road test were video recorded in their own vehicles. Using a standardized rating scale, 4 hr of video was rated by a driving instructor. The authors examine weighting of individual road test items to form global impressions and to compare road test and naturalistic driving using factor analyses of these two assessments. The road test score was unidimensional, reflecting a major factor related to awareness of signage and traffic behavior. Naturalistic driving reflected two factors related to lane keeping as well as traffic behavior. Maintenance of proper lane is an important dimension of driving safety that appears to be relatively underemphasized during the highly supervised procedures of the standardized road test. Road testing in this population could be improved by standardized designs that emphasize lane keeping and that include self-directed driving. Additional information should be sought from observers in the community as well as crash evidence when advising older drivers who may be cognitively impaired.

  18. A Test Case for the Source Inversion Validation: The 2014 ML 5.5 Orkney, South Africa Earthquake

    NASA Astrophysics Data System (ADS)

    Ellsworth, W. L.; Ogasawara, H.; Boettcher, M. S.

    2017-12-01

    The ML5.5 earthquake of August 5, 2014 occurred on a near-vertical strike slip fault below abandoned and active gold mines near Orkney, South Africa. A dense network of surface and in-mine seismometers recorded the earthquake and its aftershock sequence. In-situ stress measurements and rock samples through the damage zone and rupture surface are anticipated to be available from the "Drilling into Seismogenic Zones of M2.0-M5.5 Earthquakes in South African gold mines" project (DSeis) that is currently progressing toward the rupture zone (Science, doi: 10.1126/science.aan6905). As of 24 July, 95% of drilled core has been recovered from a 427m-section of the 1st hole from 2.9 km depth with minimal core discing and borehole breakouts. A 2nd hole is planned to intersect the fault at greater depth. Absolute differential stress will be measured along the holes and frictional characteristics of the recovered core will be determined in the lab. Surface seismic reflection data and exploration drilling from the surface down to the mining horizon at 3km depth is also available to calibrate the velocity structure above the mining horizon and image reflective geological boundaries and major faults below the mining horizon. The remarkable quality and range of geophysical data available for the Orkney earthquake makes this event an ideal test case for the Source Inversion Validation community using actual seismic data to determine the spatial and temporal evolution of earthquake rupture. We invite anyone with an interest in kinematic modeling to develop a rupture model for the Orkney earthquake. Seismic recordings of the earthquake and information on the faulting geometry can be found in Moyer et al. (2017, doi: 10.1785/0220160218). A workshop supported by the Southern California Earthquake Center will be held in the spring of 2018 to compare kinematic models. Those interested in participating in the modeling exercise and the workshop should contact the authors for additional

  19. Formal methods for test case generation

    NASA Technical Reports Server (NTRS)

    Rushby, John (Inventor); De Moura, Leonardo Mendonga (Inventor); Hamon, Gregoire (Inventor)

    2011-01-01

    The invention relates to the use of model checkers to generate efficient test sets for hardware and software systems. The method provides for extending existing tests to reach new coverage targets; searching *to* some or all of the uncovered targets in parallel; searching in parallel *from* some or all of the states reached in previous tests; and slicing the model relative to the current set of coverage targets. The invention provides efficient test case generation and test set formation. Deep regions of the state space can be reached within allotted time and memory. The approach has been applied to use of the model checkers of SRI's SAL system and to model-based designs developed in Stateflow. Stateflow models achieving complete state and transition coverage in a single test case are reported.

  20. Flight Testing an Iced Business Jet for Flight Simulation Model Validation

    NASA Technical Reports Server (NTRS)

    Ratvasky, Thomas P.; Barnhart, Billy P.; Lee, Sam; Cooper, Jon

    2007-01-01

    A flight test of a business jet aircraft with various ice accretions was performed to obtain data to validate flight simulation models developed through wind tunnel tests. Three types of ice accretions were tested: pre-activation roughness, runback shapes that form downstream of the thermal wing ice protection system, and a wing ice protection system failure shape. The high fidelity flight simulation models of this business jet aircraft were validated using a software tool called "Overdrive." Through comparisons of flight-extracted aerodynamic forces and moments to simulation-predicted forces and moments, the simulation models were successfully validated. Only minor adjustments in the simulation database were required to obtain adequate match, signifying the process used to develop the simulation models was successful. The simulation models were implemented in the NASA Ice Contamination Effects Flight Training Device (ICEFTD) to enable company pilots to evaluate flight characteristics of the simulation models. By and large, the pilots confirmed good similarities in the flight characteristics when compared to the real airplane. However, pilots noted pitch up tendencies at stall with the flaps extended that were not representative of the airplane and identified some differences in pilot forces. The elevator hinge moment model and implementation of the control forces on the ICEFTD were identified as a driver in the pitch ups and control force issues, and will be an area for future work.

  1. Validity threats: overcoming interference with proposed interpretations of assessment data.

    PubMed

    Downing, Steven M; Haladyna, Thomas M

    2004-03-01

    Factors that interfere with the ability to interpret assessment scores or ratings in the proposed manner threaten validity. To be interpreted in a meaningful manner, all assessments in medical education require sound, scientific evidence of validity. The purpose of this essay is to discuss 2 major threats to validity: construct under-representation (CU) and construct-irrelevant variance (CIV). Examples of each type of threat for written, performance and clinical performance examinations are provided. The CU threat to validity refers to undersampling the content domain. Using too few items, cases or clinical performance observations to adequately generalise to the domain represents CU. Variables that systematically (rather than randomly) interfere with the ability to meaningfully interpret scores or ratings represent CIV. Issues such as flawed test items written at inappropriate reading levels or statistically biased questions represent CIV in written tests. For performance examinations, such as standardised patient examinations, flawed cases or cases that are too difficult for student ability contribute CIV to the assessment. For clinical performance data, systematic rater error, such as halo or central tendency error, represents CIV. The term face validity is rejected as representative of any type of legitimate validity evidence, although the fact that the appearance of the assessment may be an important characteristic other than validity is acknowledged. There are multiple threats to validity in all types of assessment in medical education. Methods to eliminate or control validity threats are suggested.

  2. Test-retest reliability and validity of the Sniffin' TOM odor memory test.

    PubMed

    Croy, Ilona; Zehner, Cora; Larsson, Maria; Zucco, Gesualdo M; Hummel, Thomas

    2015-03-01

    Few attempts have been made to develop an olfactory test that captures episodic retention of olfactory information. Assessment of episodic odor memory is of particular interest in aging and in the cognitively impaired as both episodic memory deficits and olfactory loss have been targeted as reliable hallmarks of cognitive decline and impending dementia. Here, 96 healthy participants (18-92 years) and an additional 19 older people with mild cognitive impairment were tested (73-82 years). Participants were presented with 8 common odors with intentional encoding instructions that were followed by a yes-no recognition test. After recognition completion, participants were asked to identify all odors by means of free or cued identification. A retest of the odor memory test (Sniffin' TOM = test of odor memory) took place 17 days later. The results revealed satisfactory test-retest reliability (0.70) of odor recognition memory. Both recognition and identification performance were negatively affected by age and more pronounced among the cognitively impaired. In conclusion, the present work presents a reliable, valid, and simple test of episodic odor recognition memory that may be used in clinical groups where both episodic memory deficits and olfactory loss are prevalent preclinically such as Alzheimer's disease. © The Author 2014. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  3. Testing and Validation of the Dynamic Inertia Measurement Method

    NASA Technical Reports Server (NTRS)

    Chin, Alexander W.; Herrera, Claudia Y.; Spivey, Natalie D.; Fladung, William A.; Cloutier, David

    2015-01-01

    The Dynamic Inertia Measurement (DIM) method uses a ground vibration test setup to determine the mass properties of an object using information from frequency response functions. Most conventional mass properties testing involves using spin tables or pendulum-based swing tests, which for large aerospace vehicles becomes increasingly difficult and time-consuming, and therefore expensive, to perform. The DIM method has been validated on small test articles but has not been successfully proven on large aerospace vehicles. In response, the National Aeronautics and Space Administration Armstrong Flight Research Center (Edwards, California) conducted mass properties testing on an "iron bird" test article that is comparable in mass and scale to a fighter-type aircraft. The simple two-I-beam design of the "iron bird" was selected to ensure accurate analytical mass properties. Traditional swing testing was also performed to compare the level of effort, amount of resources, and quality of data with the DIM method. The DIM test showed favorable results for the center of gravity and moments of inertia; however, the products of inertia showed disagreement with analytical predictions.

  4. Reliability and Validity of the Korean Version of the Internet Addiction Test among College Students

    PubMed Central

    Lee, Kounseok; Lee, Hye-Kyung; Gyeong, Hyunsu; Yu, Byeongkwan; Song, Yul-Mai

    2013-01-01

    We developed a Korean translation of the Internet Addiction Test (KIAT), widely used self-report for internet addiction and tested its reliability and validity in a sample of college students. Two hundred seventy-nine college students at a national university completed the KIAT. Internal consistency and two week test-retest reliability were calculated from the data, and principal component factor analysis was conducted. Participants also completed the Internet Addiction Diagnostic Questionnaire (IADQ), the Korea Internet addiction scale (K-scale), and the Patient Health Questionnaire-9 for the criterion validity. Cronbach's alpha of the whole scale was 0.91, and test-retest reliability was also good (r = 0.73). The IADQ, the K-scale, and depressive symptoms were significantly correlated with the KIAT scores, demonstrating concurrent and convergent validity. The factor analysis extracted four factors (Excessive use, Dependence, Withdrawal, and Avoidance of reality) that accounted for 59% of total variance. The KIAT has outstanding internal consistency and high test-retest reliability. Also, the factor structure and validity data show that the KIAT is comparable to the original version. Thus, the KIAT is a psychometrically sound tool for assessing internet addiction in the Korean-speaking population. PMID:23678270

  5. Development and validation of an energy-balance knowledge test for fourth- and fifth-grade students.

    PubMed

    Chen, Senlin; Zhu, Xihe; Kang, Minsoo

    2017-05-01

    A valid test measuring children's energy-balance (EB) knowledge is lacking in research. This study developed and validated the energy-balance knowledge test (EBKT) for fourth and fifth grade students. The original EBKT contained 25 items but was reduced to 23 items based on pilot result and intensive expert panel discussion. De-identified data were collected from 468 fourth and fifth grade students enrolled in four schools to examine the psychometric properties of the EBKT items. The Rasch model analysis was conducted using the Winstep 3.65.0 software. Differential item functioning (DIF) analysis flagged 1 item (item #4) functioning differently between boys and girls, which was deleted. The final 22-item EBKT showed desirable model-data fit indices. The items had large variability ranging from -3.58 logit (item #10, the easiest) to 1.70 logit (item #3, the hardest). The average person ability on the test was 0.28 logit (SD = .78). Additional analyses supported known-group difference validity of the EBKT scores in capturing gender- and grade-based ability differences. The test was overall valid but could be further improved by expanding test items to discern various ability levels. For lack of a better test, researchers and practitioners may use the EBKT to assess fourth- and fifth-grade students' EB knowledge.

  6. Test-retest reliability and cross validation of the functioning everyday with a wheelchair instrument.

    PubMed

    Mills, Tamara L; Holm, Margo B; Schmeler, Mark

    2007-01-01

    The purpose of this study was to establish the test-retest reliability and content validity of an outcomes tool designed to measure the effectiveness of seating-mobility interventions on the functional performance of individuals who use wheelchairs or scooters as their primary seating-mobility device. The instrument, Functioning Everyday With a Wheelchair (FEW), is a questionnaire designed to measure perceived user function related to wheelchair/scooter use. Using consumer-generated items, FEW Beta Version 1.0 was developed and test-retest reliability was established. Cross-validation of FEW Beta Version 1.0 was then carried out with five samples of seating-mobility users to establish content validity. Based on the content validity study, FEW Version 2.0 was developed and administered to seating-mobility consumers to examine its test-retest reliability. FEW Beta Version 1.0 yielded an intraclass correlation coefficient (ICC) Model (3,k) of .92, p < .001, and the content validity results revealed that FEW Beta Version 1.0 captured 55% of seating-mobility goals reported by consumers across five samples. FEW Version 2.0 yielded ICC(3,k) = .86, p < .001, and captured 98.5% of consumers' seating-mobility goals. The cross-validation study identified new categories of seating-mobility goals for inclusion in FEW Version 2.0, and the content validity of FEW Version 2.0 was confirmed. FEW Beta Version 1.0 and FEW Version 2.0 were highly stable in their measurement of participants' seating-mobility goals over a 1-week interval.

  7. Validation testing of shallow notched round-bar screening test specimens. [for the space shuttle main engine

    NASA Technical Reports Server (NTRS)

    Vroman, G. A.

    1975-01-01

    The capability of shallow-notched, round-bar, tensile specimens for screening critical environments as they affect the material fracture properties of the space shuttle main engine was tested and analyzed. Specimens containing a 0.050-inch-deep circumferential sharp notch were cyclically loaded in a 5000-psi hydrogen environment at temperatures of +70 and -15 F. Replication of test results and a marked change in cyclic life because of temperature variation demonstrated the validity of the specimen type to be utilized for screening tests.

  8. [Reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test].

    PubMed

    Zhang, C; Yang, G P; Li, Z; Li, X N; Li, Y; Hu, J; Zhang, F Y; Zhang, X J

    2017-08-10

    Objective: To assess the reliability and validity of the Chinese version on Alcohol Use Disorders Identification Test (AUDIT) among medical students in China and to provide correct way of application on the recommended scales. Methods: An E-questionnaire was developed and sent to medical students in five different colleges. Students were all active volunteers to accept the testings. Cronbach's α and split-half reliability were calculated to evaluate the reliability of AUDIT while content, contract, discriminant and convergent validity were performed to measure the validity of the scales. Results: The overall Cronbach's α of AUDIT was 0.782 and the split-half reliability was 0.711. Data showed that the domain Cronbach's α and split-half reliability were 0.796 and 0.794 for hazardous alcohol use, 0.561 and 0.623 for dependence symptoms, and 0.647 and 0.640 for harmful alcohol use. Results also showed that the content validity index on the levels of items I-CVI) were from 0.83 to 1.00, the content validity index of scale level (S-CVI/UA) was 0.90, content validity index of average scale level (S-CVI/Ave) was 0.99 and the content validity ratios (CVR) were from 0.80 to 1.00. The simplified version of AUDIT supported a presupposed three-factor structure which could explain 61.175% of the total variance revealed through exploratory factor analysis. AUDIT semed to have good convergent and discriminant validity, with the success rate of calibration experiment as 100%. Conclusion: AUDIT showed good reliability and validity among medical students in China thus worth for promotion on its use.

  9. Development of a framework for international certification by the OIE of diagnostic tests validated as fit for purpose.

    PubMed

    Wright, P; Edwards, S; Diallo, A; Jacobson, R

    2007-01-01

    Historically, the OIE has focussed on test methods applicable to trade and the international movement of animals and animal products. With its expanding role as the World Organisation for Animal Health, the OIE has recognised the need to evaluate test methods relative to specific diagnostic applications other than trade. In collaboration with its international partners, the OIE solicited input from experts through consultants meetings on the development of guidelines for validation and certification of diagnostic assays for infectious animal diseases. Recommendations from the first meeting were formally adopted and have subsequently been acted upon by the OIE. A validation template has been developed that specifically requires a test to be fit or suited for its intended purpose (e.g. as a screening or a confirmatory test). This is a key criterion for validation. The template incorporates four distinct stages of validation, each of which has bearing on the evaluation of fitness for purpose. The OIE has just recently created a registry for diagnostic tests that fulfil these validation requirements. Assay developers are invited to submit validation dossiers to the OIE for evaluation by a panel of experts. Recognising that validation is an incremental process, tests methods achieving at least the first stages of validation may be provisionally accepted. To provide additional confidence in assay performance, the OIE, through its network of Reference Laboratories, has embarked on the development of evaluation panels. These panels would contain specially selected test samples that would assist in verifying fitness for purpose.

  10. A Metric-Based Validation Process to Assess the Realism of Synthetic Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Birchfield, Adam; Schweitzer, Eran; Athari, Mir

    Public power system test cases that are of high quality benefit the power systems research community with expanded resources for testing, demonstrating, and cross-validating new innovations. Building synthetic grid models for this purpose is a relatively new problem, for which a challenge is to show that created cases are sufficiently realistic. This paper puts forth a validation process based on a set of metrics observed from actual power system cases. These metrics follow the structure, proportions, and parameters of key power system elements, which can be used in assessing and validating the quality of synthetic power grids. Though wide diversitymore » exists in the characteristics of power systems, the paper focuses on an initial set of common quantitative metrics to capture the distribution of typical values from real power systems. The process is applied to two new public test cases, which are shown to meet the criteria specified in the metrics of this paper.« less

  11. A Metric-Based Validation Process to Assess the Realism of Synthetic Power Grids

    DOE PAGES

    Birchfield, Adam; Schweitzer, Eran; Athari, Mir; ...

    2017-08-19

    Public power system test cases that are of high quality benefit the power systems research community with expanded resources for testing, demonstrating, and cross-validating new innovations. Building synthetic grid models for this purpose is a relatively new problem, for which a challenge is to show that created cases are sufficiently realistic. This paper puts forth a validation process based on a set of metrics observed from actual power system cases. These metrics follow the structure, proportions, and parameters of key power system elements, which can be used in assessing and validating the quality of synthetic power grids. Though wide diversitymore » exists in the characteristics of power systems, the paper focuses on an initial set of common quantitative metrics to capture the distribution of typical values from real power systems. The process is applied to two new public test cases, which are shown to meet the criteria specified in the metrics of this paper.« less

  12. 40 CFR 1045.501 - How do I run a valid emission test?

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 34 2013-07-01 2013-07-01 false How do I run a valid emission test? 1045.501 Section 1045.501 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM SPARK-IGNITION PROPULSION MARINE ENGINES AND VESSELS Test...

  13. 40 CFR 1045.501 - How do I run a valid emission test?

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 33 2014-07-01 2014-07-01 false How do I run a valid emission test? 1045.501 Section 1045.501 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM SPARK-IGNITION PROPULSION MARINE ENGINES AND VESSELS Test...

  14. 40 CFR 1045.501 - How do I run a valid emission test?

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 40 Protection of Environment 32 2010-07-01 2010-07-01 false How do I run a valid emission test? 1045.501 Section 1045.501 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM SPARK-IGNITION PROPULSION MARINE ENGINES AND VESSELS Test...

  15. 40 CFR 1045.501 - How do I run a valid emission test?

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 40 Protection of Environment 33 2011-07-01 2011-07-01 false How do I run a valid emission test? 1045.501 Section 1045.501 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM SPARK-IGNITION PROPULSION MARINE ENGINES AND VESSELS Test...

  16. 40 CFR 1045.501 - How do I run a valid emission test?

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 34 2012-07-01 2012-07-01 false How do I run a valid emission test? 1045.501 Section 1045.501 Protection of Environment ENVIRONMENTAL PROTECTION AGENCY (CONTINUED) AIR POLLUTION CONTROLS CONTROL OF EMISSIONS FROM SPARK-IGNITION PROPULSION MARINE ENGINES AND VESSELS Test...

  17. Validity of the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Edition

    ERIC Educational Resources Information Center

    Peters, Christine; Kranzler, John H.; Rossen, Eric

    2009-01-01

    This study examines the criterion-related validity evidence of scores on the Mayer-Salovey-Caruso Emotional Intelligence Test: Youth Version-Research Version. The authors also investigate the relationship between scores on the MSCEIT-YV and chronological age. Results provide initial support for the construct validity of the MSCEIT-YV but also…

  18. Testing the validity and acceptability of the diagnostic criteria for Hoarding Disorder: a DSM-5 survey.

    PubMed

    Mataix-Cols, D; Fernández de la Cruz, L; Nakao, T; Pertusa, A

    2011-12-01

    The DSM-5 Obsessive-Compulsive Spectrum Sub-Workgroup is recommending the creation of a new diagnostic category named Hoarding Disorder (HD). The validity and acceptability of the proposed diagnostic criteria have yet to be formally tested. Obsessive-compulsive disorder/hoarding experts and random members of the American Psychiatric Association (APA) were shown eight brief clinical vignettes (four cases meeting criteria for HD, three with hoarding behaviour secondary to other mental disorders, and one with subclinical hoarding behaviour) and asked to decide the most appropriate diagnosis in each case. Participants were also asked about the perceived acceptability of the criteria and whether they supported the inclusion of HD in the main manual. Altogether, 211 experts and 48 APA members completed the survey (30% and 10% response rates, respectively). The sensitivity and specificity of the HD diagnosis and the individual criteria were high (80-90%) across various types of professionals, irrespective of their experience with hoarding cases. About 90% of participants in both samples thought the criteria would be very/somewhat acceptable for professionals and sufferers. Most experts (70%) supported the inclusion of HD in the main manual, whereas only 50% of the APA members did. The proposed criteria for HD have high sensitivity and specificity. The criteria are also deemed acceptable for professionals and sufferers alike. Training of professionals and the development and validation of semi-structured diagnostic instruments should improve diagnostic accuracy even further. A field trial is now needed to confirm these encouraging findings with real patients in real clinical settings.

  19. Flight test techniques for validating simulated nuclear electromagnetic pulse aircraft responses

    NASA Technical Reports Server (NTRS)

    Winebarger, R. M.; Neely, W. R., Jr.

    1984-01-01

    An attempt has been made to determine the effects of nuclear EM pulses (NEMPs) on aircraft systems, using a highly instrumented NASA F-106B to document the simulated NEMP environment at the Kirtland Air Force Base's Vertically Polarized Dipole test facility. Several test positions were selected so that aircraft orientation relative to the test facility would be the same in flight as when on the stationary dielectric stand, in order to validate the dielectric stand's use in flight configuration simulations. Attention is given to the flight test portions of the documentation program.

  20. Convergent and diagnostic validity of STAVUX, a word and pseudoword spelling test for adults.

    PubMed

    Östberg, Per; Backlund, Charlotte; Lindström, Emma

    2016-10-01

    Few comprehensive spelling tests are available in Swedish, and none have been validated in adults with reading and writing disorders. The recently developed STAVUX test includes word and pseudoword spelling subtests with high internal consistency and adult norms stratified by education. This study evaluated the convergent and diagnostic validity of STAVUX in adults with dyslexia. Forty-six adults, 23 with dyslexia and 23 controls, took STAVUX together with a standard word-decoding test and a self-rated measure of spelling skills. STAVUX subtest scores showed moderate to strong correlations with word-decoding scores and predicted self-rated spelling skills. Word and pseudoword subtest scores both predicted dyslexia status. Receiver-operating characteristic (ROC) analysis showed excellent diagnostic discriminability. Sensitivity was 91% and specificity 96%. In conclusion, the results of this study support the convergent and diagnostic validity of STAVUX.

  1. [Dichotic digit test. Case].

    PubMed

    Zenker Castro, Franz; Fernández Belda, Rafael; Barajas de Prat, José Juan

    2008-12-01

    In this study we present a case of a 71-year-old female patient with sensorineural hearing loss and fitted with bilateral hearing aids. The patient complained of scant benefit from the hearing aid fitting with difficulties in understanding speech with background noise. The otolaryngology examination was normal. Audiological tests revealed bilateral sensorineural hearing loss with threshold values of 51 and 50 dB HL in the right and left ear. The Dichotic Digit Test was administered in a divided attention mode and focalizing the attention to each ear. Results in this test are consistent with a Central Auditory Processing Disorder.

  2. The Validity and Clinical Uses of the Pepper Visual Skills for Reading Test.

    ERIC Educational Resources Information Center

    Watson, G.; And Others

    1990-01-01

    The Pepper Visual Skills for Reading Test was assessed as a measure of reading ability with meaningful text in 38 adults with macular degeneration; scores were compared with assessment made using the Gray Oral Reading Test, a previously standardized assessment. The test's validity was confirmed. (Author/JDD)

  3. Test-retest reliability, smallest real difference and concurrent validity of six different balance tests on young people with mild to moderate intellectual disability.

    PubMed

    Blomqvist, Sven; Wester, Anita; Sundelin, Gunnevi; Rehn, Börje

    2012-12-01

    Some studies have reported that people with intellectual disability may have reduced balance ability compared with the population in general. However, none of these studies involved adolescents, and the reliability and validity of balance tests in this population are not known. The purpose of this study was to examine the reliability of six different balance tests and to investigate their concurrent validity. Test-retest reliability assessment. All subjects were recruited from a special school for people with intellectual disability in Bollnäs, Sweden. Eighty-nine adolescents (35 females and 54 males) with mild to moderate intellectual disability with a mean age of 18 years (range 16 to 20 years). All subjects followed the same test protocol on two occasions within an 11-day period. Balance test performances. Intraclass correlation coefficients greater than 0.80 were achieved for four of the balance tests: Extended Timed Up and Go Test, Modified Functional Reach Test, One-leg Stance Test and Force Platform Test. The smallest real differences ranged from 12% to 40%; less than 20% is considered to be low. Concurrent validity among these balance tests varied between no and low correlation. The results indicate that these tests could be used to evaluate changes in balance ability over time in people with mild to moderate intellectual disability. The low concurrent validity illustrates the importance of knowing more about the influence of various sensory subsystems that are significant for balance among adolescents with intellectual disability. Copyright © 2011 Chartered Society of Physiotherapy. Published by Elsevier Ltd. All rights reserved.

  4. Failures in Hybrid Microcircuits During Environmental Testing. History Cases

    NASA Technical Reports Server (NTRS)

    Teverovsky, Alexander

    2008-01-01

    This purpose of this viewgraph presentation is to discuss failures in hermetic hybrids observed at the GSFC PA Lab during environmental stress testing. The cases discussed are: Case I. Substrate metallization failures during Thermal cycling (TC). Case II. Flex lid-induced failure. Case Ill. Hermeticity failures during TC. Case IV. Die metallization cracking during TC. and how many test cycles and parts is necessary? Case V. Wire Bond failures after life test. Case VI. Failures caused by Au/In IMC growth.

  5. Glossary of reference terms for alternative test methods and their validation.

    PubMed

    Ferrario, Daniele; Brustio, Roberta; Hartung, Thomas

    2014-01-01

    This glossary was developed to provide technical references to support work in the field of the alternatives to animal testing. It was compiled from various existing reference documents coming from different sources and is meant to be a point of reference on alternatives to animal testing. Giving the ever-increasing number of alternative test methods and approaches being developed over the last decades, a combination, revision, and harmonization of earlier published collections of terms used in the validation of such methods is required. The need to update previous glossary efforts came from the acknowledgement that new words have emerged with the development of new approaches, while others have become obsolete, and the meaning of some terms has partially changed over time. With this glossary we intend to provide guidance on issues related to the validation of new or updated testing methods consistent with current approaches. Moreover, because of new developments and technologies, a glossary needs to be a living, constantly updated document. An Internet-based version based on this compilation may be found at http://altweb.jhsph.edu/, allowing the addition of new material.

  6. Development, validity, and reliability of a ballet-specific aerobic fitness test.

    PubMed

    Twitchett, Emily; Nevill, Alan; Angioi, Manuela; Koutedakis, Yiannis; Wyon, Matthew

    2011-09-01

    The aim of this study was to develop and assess the reliability and validity of a multi-stage, ballet-specific aerobic fitness test to be used in a dance studio setting. The test consists of five stages, each four minutes long, that increase in intensity. It uses classical ballet movement of an intermediate-level of difficulty, thus emphasizing physiological demand rather than skill. The demand of each stage was determined by calculating the mean oxygen uptake during its final minute using a portable gas analyser. After an initial familiarization period, eight female subjects performed the test twice within seven days. The results showed significant differences in oxygen consumption between stages (p < 0.001), but not between trials. Pearson correlation co-efficients produced a very good linear relationship between trials (r = 0.998, p < 0.001). Bland-Altman reliability analysis revealed the 95% limits of agreement to be ± 6.2 ml·kg(-1)·min(-1), showing good agreement between trials. The oxygen uptake in our subjects equated positively to previous estimates for class and performance, confirming validity. It was concluded that the test is suitable for use among classical ballet dancers, with many possible applications.

  7. Testing the Construct Validity of a Virtual Reality Hip Arthroscopy Simulator.

    PubMed

    Khanduja, Vikas; Lawrence, John E; Audenaert, Emmanuel

    2017-03-01

    To test the construct validity of the hip diagnostics module of a virtual reality hip arthroscopy simulator. Nineteen orthopaedic surgeons performed a simulated arthroscopic examination of a healthy hip joint using a 70° arthroscope in the supine position. Surgeons were categorized as either expert (those who had performed 250 hip arthroscopies or more) or novice (those who had performed fewer than this). Twenty-one specific targets were visualized within the central and peripheral compartments; 9 via the anterior portal, 9 via the anterolateral portal, and 3 via the posterolateral portal. This was immediately followed by a task testing basic probe examination of the joint in which a series of 8 targets were probed via the anterolateral portal. During the tasks, the surgeon's performance was evaluated by the simulator using a set of predefined metrics including task duration, number of soft tissue and bone collisions, and distance travelled by instruments. No repeat attempts at the tasks were permitted. Construct validity was then evaluated by comparing novice and expert group performance metrics over the 2 tasks using the Mann-Whitney test, with a P value of less than .05 considered significant. On the visualization task, the expert group outperformed the novice group on time taken (P = .0003), number of collisions with soft tissue (P = .001), number of collisions with bone (P = .002), and distance travelled by the arthroscope (P = .02). On the probe examination, the 2 groups differed only in the time taken to complete the task (P = .025) with no significant difference in other metrics. Increased experience in hip arthroscopy was reflected by significantly better performance on the virtual reality simulator across 2 tasks, supporting its construct validity. This study validates a virtual reality hip arthroscopy simulator and supports its potential for developing basic arthroscopic skills. Level III. Copyright © 2016 Arthroscopy Association of North America

  8. Finding Kids with Special Needs: the Background, Development, Field Test and Validation.

    ERIC Educational Resources Information Center

    Resource Management Systems, Inc., Carmel, CA.

    Described are the development of "Findings Kids with Special Needs" (FKSN), a instrument to identify children's learning problems and gifted students; results of field testing with 24,825 children, kindergarten through grade 8, in 110 schools; and validation procedures. Discussed is test construction, including incorporation of 12…

  9. Naturalistic Validation of an On-Road Driving Test of Older Drivers

    PubMed Central

    Ott, Brian R.; Papandonatos, George D.; Davis, Jennifer D.; Barco, Peggy P.

    2013-01-01

    Objective The objective was to compare a standardized road test to naturalistic driving by older people who may have cognitive impairment to define improvements that could potentially enhance the validity of road testing in this population. Background Road testing has been widely adapted as a tool to assess driving competence of older people who may be at risk for unsafe driving because of dementia; however, the validity of this approach has not been rigorously evaluated. Method For 2 weeks, 80 older drivers (38 healthy elders and 42 with cognitive impairment) who passed a standardized road test were video recorded in their own vehicles. Using a standardized rating scale, 4 hr of video was rated by a driving instructor. The authors examine weighting of individual road test items to form global impressions and to compare road test and naturalistic driving using factor analyses of these two assessments. Results The road test score was unidimensional, reflecting a major factor related to awareness of signage and traffic behavior. Naturalistic driving reflected two factors related to lane keeping as well as traffic behavior. Conclusion Maintenance of proper lane is an important dimension of driving safety that appears to be relatively underemphasized during the highly supervised procedures of the standardized road test. Application Road testing in this population could be improved by standardized designs that emphasize lane keeping and that include self-directed driving. Additional information should be sought from observers in the community as well as crash evidence when advising older drivers who may be cognitively impaired. PMID:22908688

  10. Integrated Application of Active Controls (IAAC) technology to an advanced subsonic transport project: Test act system validation

    NASA Technical Reports Server (NTRS)

    1985-01-01

    The primary objective of the Test Active Control Technology (ACT) System laboratory tests was to verify and validate the system concept, hardware, and software. The initial lab tests were open loop hardware tests of the Test ACT System as designed and built. During the course of the testing, minor problems were uncovered and corrected. Major software tests were run. The initial software testing was also open loop. These tests examined pitch control laws, wing load alleviation, signal selection/fault detection (SSFD), and output management. The Test ACT System was modified to interface with the direct drive valve (DDV) modules. The initial testing identified problem areas with DDV nonlinearities, valve friction induced limit cycling, DDV control loop instability, and channel command mismatch. The other DDV issue investigated was the ability to detect and isolate failures. Some simple schemes for failure detection were tested but were not completely satisfactory. The Test ACT System architecture continues to appear promising for ACT/FBW applications in systems that must be immune to worst case generic digital faults, and be able to tolerate two sequential nongeneric faults with no reduction in performance. The challenge in such an implementation would be to keep the analog element sufficiently simple to achieve the necessary reliability.

  11. Empirical Derivation and Validation of a Clinical Case Definition for Neuropsychological Impairment in Children and Adolescents.

    PubMed

    Beauchamp, Miriam H; Brooks, Brian L; Barrowman, Nick; Aglipay, Mary; Keightley, Michelle; Anderson, Peter; Yeates, Keith O; Osmond, Martin H; Zemek, Roger

    2015-09-01

    Neuropsychological assessment aims to identify individual performance profiles in multiple domains of cognitive functioning; however, substantial variation exists in how deficits are defined and what cutoffs are used, and there is no universally accepted definition of neuropsychological impairment. The aim of this study was to derive and validate a clinical case definition rule to identify neuropsychological impairment in children and adolescents. An existing normative pediatric sample was used to calculate base rates of abnormal functioning on eight measures covering six domains of neuropsychological functioning. The dataset was analyzed by varying the range of cutoff levels [1, 1.5, and 2 standard deviations (SDs) below the mean] and number of indicators of impairment. The derived rule was evaluated by bootstrap, internal and external clinical validation (orthopedic and traumatic brain injury). Our neuropsychological impairment (NPI) rule was defined as "two or more test scores that fall 1.5 SDs below the mean." The rule identifies 5.1% of the total sample as impaired in the assessment battery and consistently targets between 3 and 7% of the population as impaired even when age, domains, and number of tests are varied. The NPI rate increases in groups known to exhibit cognitive deficits. The NPI rule provides a psychometrically derived method for interpreting performance across multiple tests and may be used in children 6-18 years. The rule may be useful to clinicians and scientists who wish to establish whether specific individuals or clinical populations present within expected norms versus impaired function across a battery of neuropsychological tests.

  12. Aptitude Tests and Successful College Students: The Predictive Validity of the General Aptitude Test (GAT) in Saudi Arabia

    ERIC Educational Resources Information Center

    Alnahdi, Ghaleb Hamad

    2015-01-01

    Aptitude tests should predict student success at the university level. This study examined the predictive validity of the General Aptitude Test (GAT) in Saudi Arabia. Data for 27420 students enrolled at Prince Sattam bin Abdulaziz University were analyzed. Of these students, 17565 were male students, and 9855 were female students. Multiple…

  13. Preliminary Validation of a New Measure of Negative Response Bias: The Temporal Memory Sequence Test.

    PubMed

    Hegedish, Omer; Kivilis, Naama; Hoofien, Dan

    2015-01-01

    The Temporal Memory Sequence Test (TMST) is a new measure of negative response bias (NRB) that was developed to enrich the forced-choice paradigm. The TMST does not resemble the common structure of forced-choice tests and is presented as a temporal recall memory test. The validation sample consisted of 81 participants: 21 healthy control participants, 20 coached simulators, and 40 patients with acquired brain injury (ABI). The TMST had high reliability and significantly high positive correlations with the Test of Memory Malingering and Word Memory Test effort scales. Moreover, the TMST effort scales exhibited high negative correlations with the Glasgow Coma Scale, thus validating the previously reported association between probable malingering and mild traumatic brain injury. A suggested cutoff score yielded acceptable classification rates in the ABI group as well as in the simulator and control groups. The TMST appears to be a promising measure of NRB detection, with respectable rates of reliability and construct and criterion validity.

  14. Validity of clinical color vision tests for air traffic control specialists.

    DOT National Transportation Integrated Search

    1992-10-01

    An experiment on the relationship between aeromedical color vision screening test performance and performance on color-dependent tasks of Air Traffic Control Specialists was replicated to expand the data base supporting the job-related validity of th...

  15. Validating Grammaticality Judgment Tests: Evidence from Two New Psycholinguistic Measures

    ERIC Educational Resources Information Center

    Vafaee, Payman; Suzuki, Yuichi; Kachisnke, Ilina

    2017-01-01

    Several previous factor-analytic studies on the construct validity of grammaticality judgment tests (GJTs) concluded that untimed GJTs measure explicit knowledge (EK) and timed GJTs measure implicit knowledge (IK) (Bowles, 2011; R. Ellis, 2005; R. Ellis & Loewen, 2007). It has also been shown that, irrespective of the time condition chosen,…

  16. Analytical Validation of the ReEBOV Antigen Rapid Test for Point-of-Care Diagnosis of Ebola Virus Infection

    PubMed Central

    Cross, Robert W.; Boisen, Matthew L.; Millett, Molly M.; Nelson, Diana S.; Oottamasathien, Darin; Hartnett, Jessica N.; Jones, Abigal B.; Goba, Augustine; Momoh, Mambu; Fullah, Mohamed; Bornholdt, Zachary A.; Fusco, Marnie L.; Abelson, Dafna M.; Oda, Shunichiro; Brown, Bethany L.; Pham, Ha; Rowland, Megan M.; Agans, Krystle N.; Geisbert, Joan B.; Heinrich, Megan L.; Kulakosky, Peter C.; Shaffer, Jeffrey G.; Schieffelin, John S.; Kargbo, Brima; Gbetuwa, Momoh; Gevao, Sahr M.; Wilson, Russell B.; Saphire, Erica Ollmann; Pitts, Kelly R.; Khan, Sheik Humarr; Grant, Donald S.; Geisbert, Thomas W.; Branco, Luis M.; Garry, Robert F.

    2016-01-01

    Background. Ebola virus disease (EVD) is a severe viral illness caused by Ebola virus (EBOV). The 2013–2016 EVD outbreak in West Africa is the largest recorded, with >11 000 deaths. Development of the ReEBOV Antigen Rapid Test (ReEBOV RDT) was expedited to provide a point-of-care test for suspected EVD cases. Methods. Recombinant EBOV viral protein 40 antigen was used to derive polyclonal antibodies for RDT and enzyme-linked immunosorbent assay development. ReEBOV RDT limits of detection (LOD), specificity, and interference were analytically validated on the basis of Food and Drug Administration (FDA) guidance. Results. The ReEBOV RDT specificity estimate was 95% for donor serum panels and 97% for donor whole-blood specimens. The RDT demonstrated sensitivity to 3 species of Ebolavirus (Zaire ebolavirus, Sudan ebolavirus, and Bundibugyo ebolavirus) associated with human disease, with no cross-reactivity by pathogens associated with non-EBOV febrile illness, including malaria parasites. Interference testing exhibited no reactivity by medications in common use. The LOD for antigen was 4.7 ng/test in serum and 9.4 ng/test in whole blood. Quantitative reverse transcription–polymerase chain reaction testing of nonhuman primate samples determined the range to be equivalent to 3.0 × 105–9.0 × 108 genomes/mL. Conclusions. The analytical validation presented here contributed to the ReEBOV RDT being the first antigen-based assay to receive FDA and World Health Organization emergency use authorization for this EVD outbreak, in February 2015. PMID:27587634

  17. Concurrent validity of the Harris Infant Neuromotor Test and the Alberta Infant Motor Scale.

    PubMed

    Tse, Lillian; Mayson, Tanja A; Leo, Sara; Lee, Leanna L S; Harris, Susan R; Hayes, Virginia E; Backman, Catherine L; Cameron, Dianne; Tardif, Megan

    2008-02-01

    We examined concurrent validity of scores for two infant motor screening tools, the Harris Infant Neuromotor Test (HINT) and the Alberta Infant Motor Scale, in 121 Canadian infants. Relationships between the two tests for the overall sample were as follows: r = -.83 at 4 to 6.5 months (n = 121; p < .01) and r = -.85 at 10 to 12.5 months (n = 109; p < .01), suggesting that the HINT, the newer of the two measures, is valid in determining motor delays. Each test has advantages and disadvantages, and practitioners should determine which one best meets their infant assessment needs.

  18. Validation of the Military Entrance Physical Strength Capacity Test. Technical Report 610.

    ERIC Educational Resources Information Center

    Myers, David C.; And Others

    A battery of physical ability tests was validated using a predictive, criterion-related strategy. The battery was given to 1,003 female soldiers and 980 male soldiers before they had begun Army Basic Training. Criterion measures which represented physical competency in Basic Training (physical proficiency tests, sick call, profiles, and separation…

  19. Development and Validation of Scores from an Instrument Measuring Student Test-Taking Motivation

    ERIC Educational Resources Information Center

    Eklof, Hanna

    2006-01-01

    Using the expectancy-value model of achievement motivation as a basis, this study's purpose is to develop, apply, and validate scores from a self-report instrument measuring student test-taking motivation. Sampled evidence of construct validity for the present sample indicates that a number of the items in the instrument could be used as an…

  20. Validation of an Instrument to Measure High School Students' Attitudes toward Fitness Testing

    ERIC Educational Resources Information Center

    Mercier, Kevin; Silverman, Stephen

    2014-01-01

    Purpose: The purpose of this investigation was to develop an instrument that has scores that are valid and reliable for measuring students' attitudes toward fitness testing. Method: The method involved the following steps: (a) an elicitation study, (b) item development, (c) a pilot study, and (d) a validation study. The pilot study included 427…

  1. Vacuum decay container closure integrity leak test method development and validation for a lyophilized product-package system.

    PubMed

    Patel, Jayshree; Mulhall, Brian; Wolf, Heinz; Klohr, Steven; Guazzo, Dana Morton

    2011-01-01

    A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated for container-closure integrity verification of a lyophilized product in a parenteral vial package system. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity check, and for testing product stored on stability in lieu of sterility tests. Method development and optimization challenge studies incorporated artificially defective packages representing a range of glass vial wall and sealing surface defects, as well as various elastomeric stopper defects. Method validation required 3 days of random-order replicate testing of a test sample population of negative-control, no-defect packages and positive-control, with-defect packages. Positive-control packages were prepared using vials each with a single hole laser-drilled through the glass vial wall. Hole creation and hole size certification was performed by Lenox Laser. Validation study results successfully demonstrated the vacuum decay leak test method's ability to accurately and reliably detect those packages with laser-drilled holes greater than or equal to approximately 5 μm in nominal diameter. All development and validation studies were performed at Whitehouse Analytical Laboratories in Whitehouse, NJ, under the direction of consultant Dana Guazzo of RxPax, LLC, using a VeriPac 455 Micro Leak Test System by Packaging Technologies & Inspection (Tuckahoe, NY). Bristol Myers Squibb (New Brunswick, NJ) fully subsidized all work. A leak test performed according to ASTM F2338-09 Standard Test Method for Nondestructive Detection of Leaks in Packages by Vacuum Decay Method was developed and validated to detect defects in stoppered vial packages containing lyophilized product for injection. This nondestructive leak test method is intended for use in manufacturing as an in-process package integrity

  2. Readability Level of Standardized Test Items and Student Performance: The Forgotten Validity Variable

    ERIC Educational Resources Information Center

    Hewitt, Margaret A.; Homan, Susan P.

    2004-01-01

    Test validity issues considered by test developers and school districts rarely include individual item readability levels. In this study, items from a major standardized test were examined for individual item readability level and item difficulty. The Homan-Hewitt Readability Formula was applied to items across three grade levels. Results of…

  3. Adaptation and validation of Common Object Token (COT) test into the Sinhalese language.

    PubMed

    Jeyaraman, Janani; Kumarasinghe, Chameera; Mohamed Rafi, Shabnam Fathima; Mendis, Thirimadura Lakna Amalie; Abdul Rasheed, Fathima Shameema

    2016-04-01

    This manuscript presents a translation and adaptation of the Common Object Token (COT) test, which assesses speech perception, into the Sinhalese language and an attempt to validate it for use on children with normal hearing (NH) and children with a cochlear implant (CI). Ninety-five children (70 with NH, 25 with a CI) participated in the study. The COT test was translated, back-translated, and evaluated by a team of experts until the Sinhalese translation was deemed acceptable. Data of Sinhalese children with NH and values of children with a CI were analysed. Internal reliability and consistency of the COT total score were determined. Lastly, a quick version of the COT test was created. The total mean scores and subtest mean scores improved with age for children with NH. For children with a CI, a strong relationship between the COT total score and device experience, i.e. hearing age, was found. A Quick Sinhalese COT test version, suitable for children with a CI, could be created from Subtests 2, 3, and 4. The Sinhalese COT test is valid for assessing the age-related development of speech perception and identification skills of children with NH. Results suggest that the COT is valid for use in children with a CI. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  4. Measuring verbal and non-verbal communication in aphasia: reliability, validity, and sensitivity to change of the Scenario Test.

    PubMed

    van der Meulen, Ineke; van de Sandt-Koenderman, W Mieke E; Duivenvoorden, Hugo J; Ribbers, Gerard M

    2010-01-01

    This study explores the psychometric qualities of the Scenario Test, a new test to assess daily-life communication in severe aphasia. The test is innovative in that it: (1) examines the effectiveness of verbal and non-verbal communication; and (2) assesses patients' communication in an interactive setting, with a supportive communication partner. To determine the reliability, validity, and sensitivity to change of the Scenario Test and discuss its clinical value. The Scenario Test was administered to 122 persons with aphasia after stroke and to 25 non-aphasic controls. Analyses were performed for the entire group of persons with aphasia, as well as for a subgroup of persons unable to communicate verbally (n = 43). Reliability (internal consistency, test-retest reliability, inter-judge, and intra-judge reliability) and validity (internal validity, convergent validity, known-groups validity) and sensitivity to change were examined using standard psychometric methods. The Scenario Test showed high levels of reliability. Internal consistency (Cronbach's alpha = 0.96; item-rest correlations = 0.58-0.82) and test-retest reliability (ICC = 0.98) were high. Agreement between judges in total scores was good, as indicated by the high inter- and intra-judge reliability (ICC = 0.86-1.00). Agreement in scores on the individual items was also good (square-weighted kappa values 0.61-0.92). The test demonstrated good levels of validity. A principal component analysis for categorical data identified two dimensions, interpreted as general communication and communicative creativity. Correlations with three other instruments measuring communication in aphasia, that is, Spontaneous Speech interview from the Aachen Aphasia Test (AAT), Amsterdam-Nijmegen Everyday Language Test (ANELT), and Communicative Effectiveness Index (CETI), were moderate to strong (0.50-0.85) suggesting good convergent validity. Group differences were observed between persons with aphasia and non-aphasic controls

  5. The development and validation of a test of science critical thinking for fifth graders.

    PubMed

    Mapeala, Ruslan; Siew, Nyet Moi

    2015-01-01

    The paper described the development and validation of the Test of Science Critical Thinking (TSCT) to measure the three critical thinking skill constructs: comparing and contrasting, sequencing, and identifying cause and effect. The initial TSCT consisted of 55 multiple choice test items, each of which required participants to select a correct response and a correct choice of critical thinking used for their response. Data were obtained from a purposive sampling of 30 fifth graders in a pilot study carried out in a primary school in Sabah, Malaysia. Students underwent the sessions of teaching and learning activities for 9 weeks using the Thinking Maps-aided Problem-Based Learning Module before they answered the TSCT test. Analyses were conducted to check on difficulty index (p) and discrimination index (d), internal consistency reliability, content validity, and face validity. Analysis of the test-retest reliability data was conducted separately for a group of fifth graders with similar ability. Findings of the pilot study showed that out of initial 55 administered items, only 30 items with relatively good difficulty index (p) ranged from 0.40 to 0.60 and with good discrimination index (d) ranged within 0.20-1.00 were selected. The Kuder-Richardson reliability value was found to be appropriate and relatively high with 0.70, 0.73 and 0.92 for identifying cause and effect, sequencing, and comparing and contrasting respectively. The content validity index obtained from three expert judgments equalled or exceeded 0.95. In addition, test-retest reliability showed good, statistically significant correlations ([Formula: see text]). From the above results, the selected 30-item TSCT was found to have sufficient reliability and validity and would therefore represent a useful tool for measuring critical thinking ability among fifth graders in primary science.

  6. POLYGON - A New Fundamental Movement Skills Test for 8 Year Old Children: Construction and Validation.

    PubMed

    Zuvela, Frane; Bozanic, Ana; Miletic, Durdica

    2011-01-01

    Inadequately adopted fundamental movement skills (FMS) in early childhood may have a negative impact on the motor performance in later life (Gallahue and Ozmun, 2005). The need for an efficient FMS testing in Physical Education was recognized. The aim of this paper was to construct and validate a new FMS test for 8 year old children. Ninety-five 8 year old children were used for the testing. A total of 24 new FMS tasks were constructed and only the best representatives of movement areas entered into the final test product - FMS-POLYGON. The ICC showed high values for all 24 tasks (0.83-0.97) and the factorial analysis revealed the best representatives of each movement area that entered the FMS-POLYGON: tossing and catching the volleyball against a wall, running across obstacles, carrying the medicine balls, and straight running. The ICC for the FMS-POLYGON showed a very high result (0.98) and, therefore, confirmed the test's intra-rater reliability. Concurrent validity was tested with the use of the "Test of Gross Motor Development" (TGMD-2). Correlation analysis between the newly constructed FMS-POLYGON and the TGMD-2 revealed the coefficient of -0.82 which indicates a high correlation. In conclusion, the new test for FMS assessment proved to be a reliable and valid instrument for 8 year old children. Application of this test in schools is justified and could play an important factor in physical education and sport practice. Key pointsAll 21 newly constructed tasks demonstrated high intra-rater reliability (0.83-0.97) in FMS assessment. High reliability was also noted in the FMS-POLYGON test (0.98).A high correlation was found between the FMS-POLYGON and TGMD-2 which is a confirmation of the new test's concurrent validity.The research resolved the problem of long and detailed FMS assessment by adding a new dimension using quick and effective norm-referenced approach but also covering all the most important movement areas.New and validated test can be of great use

  7. Validation of two case definitions to identify pressure ulcers using hospital administrative data

    PubMed Central

    Ho, Chester; Jiang, Jason; Eastwood, Cathy A; Wong, Holly; Weaver, Brittany; Quan, Hude

    2017-01-01

    Objective Pressure ulcer development is a quality of care indicator, as pressure ulcers are potentially preventable. Yet pressure ulcer is a leading cause of morbidity, discomfort and additional healthcare costs for inpatients. Methods are lacking for accurate surveillance of pressure ulcer in hospitals to track occurrences and evaluate care improvement strategies. The main study aim was to validate hospital discharge abstract database (DAD) in recording pressure ulcers against nursing consult reports, and to calculate prevalence of pressure ulcers in Alberta, Canada in DAD. We hypothesised that a more inclusive case definition for pressure ulcers would enhance validity of cases identified in administrative data for research and quality improvement purposes. Setting A cohort of patients with pressure ulcers were identified from enterostomal (ET) nursing consult documents at a large university hospital in 2011. Participants There were 1217 patients with pressure ulcers in ET nursing documentation that were linked to a corresponding record in DAD to validate DAD for correct and accurate identification of pressure ulcer occurrence, using two case definitions for pressure ulcer. Results Using pressure ulcer definition 1 (7 codes), prevalence was 1.4%, and using definition 2 (29 codes), prevalence was 4.2% after adjusting for misclassifications. The results were lower than expected. Definition 1 sensitivity was 27.7% and specificity was 98.8%, while definition 2 sensitivity was 32.8% and specificity was 95.9%. Pressure ulcer in both DAD and ET consultation increased with age, number of comorbidities and length of stay. Conclusion DAD underestimate pressure ulcer prevalence. Since various codes are used to record pressure ulcers in DAD, the case definition with more codes captures more pressure ulcer cases, and may be useful for monitoring facility trends. However, low sensitivity suggests that this data source may not be accurate for determining overall prevalence, and

  8. Modal testing for model validation of structures with discrete nonlinearities.

    PubMed

    Ewins, D J; Weekes, B; delli Carri, A

    2015-09-28

    Model validation using data from modal tests is now widely practiced in many industries for advanced structural dynamic design analysis, especially where structural integrity is a primary requirement. These industries tend to demand highly efficient designs for their critical structures which, as a result, are increasingly operating in regimes where traditional linearity assumptions are no longer adequate. In particular, many modern structures are found to contain localized areas, often around joints or boundaries, where the actual mechanical behaviour is far from linear. Such structures need to have appropriate representation of these nonlinear features incorporated into the otherwise largely linear models that are used for design and operation. This paper proposes an approach to this task which is an extension of existing linear techniques, especially in the testing phase, involving only just as much nonlinear analysis as is necessary to construct a model which is good enough, or 'valid': i.e. capable of predicting the nonlinear response behaviour of the structure under all in-service operating and test conditions with a prescribed accuracy. A short-list of methods described in the recent literature categorized using our framework is given, which identifies those areas in which further development is most urgently required. © 2015 The Authors.

  9. Validation of EncephalApp, Smartphone-Based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy.

    PubMed

    Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

    2015-10-01

    Detection of covert hepatic encephalopathy (CHE) is difficult, but point-of-care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test-retest reliability, and external validity. Patients with cirrhosis (n = 167; 38% with overt HE [OHE]; mean age, 55 years; mean Model for End-Stage Liver Disease score, 12) and controls (n = 114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test-retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intrahepatic portosystemic shunt placement, and before and after correction for hyponatremia, to determine external validity. All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cutoffs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic value of 0.91; the area under the receiver operator characteristic value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test-retest reliability was high (intraclass coefficient, 0.83) among 30 patients retested 1-3 months apart. OffTime+OnTime increased significantly (206 vs 255 seconds, P = .007) among 10 patients retested 33 ± 7 days after transjugular intrahepatic portosystemic shunt placement. OffTime+OnTime decreased significantly (242 vs

  10. Blood tests showing nonpaternity-conclusive or rebuttable evidence? The Chaplin case revisited.

    PubMed

    Benson, F

    1981-09-01

    A defendant accused of being the father of an illegitimate child denies responsibility. Blood samples from the child, mother, and alleged father are studied and the results reveal that the alleged father is excluded. What weight, if any, should the court (if a trial is held) or the jury give to the evidence of nonpaternity? Should the evidence be treated as conclusive proof of nonpaternity or should other evidence be admitted in the trial to overcome the nonpaternity evidence? A medical expert might conclude that a controversy exists because of the court's questioned trustworthiness of the paternity blood testing, while a legal expert might conclude that the controversy arises because of burdens of proof. Both conclusions are valid. The Berry v. Chaplin case held in California in 1946 illustrates this circumstance. In refreshing our memories on this case, we can review the problem in light of today's knowledge.

  11. Validity and test-retest reliability in assessing current body size with figure drawings in Chinese adolescents.

    PubMed

    Lo, Wing-Sze; Ho, Sai-Yin; Wong, Bonny Yee-Man; Mak, Kwok-Kei; Lam, Tai-Hing

    2011-06-01

    The reliability and validity of Stunkard's Figure Rating Scale (FRS) as a measure of current body size (CBS) was established in Western adolescent girls but not in non-Western population. We examined the validity and test-retest reliability of Stunkard's FRS in assessing CBS among Chinese adolescents. Methods. In a school-based survey in Hong Kong, 5666 adolescents (boys: 45.1%; mean age 14.7 years) provided data on self-reported height and weight, CBS, perceived weight status, and health-related quality of life using the Medical Outcomes Study Short-Form version 2 (SF-12v2). Height and weight were also objectively measured. Spearman's correlation was used to assess construct validity, concurrent validity and test-retest reliability. Convergent and discriminant validity were good: CBS correlated strongly with weight and self-reported/measured BMI, but only weakly with SF-12v2. CBS correlated strongly with perceived weight status, showing concurrent validity. Spearman's correlation (r) for CBS was 0.78 for girls and 0.72 for boys indicating good test-retest reliability. Validity and reliability results did not differ significantly between senior and junior grade adolescents. Our findings support the use of Stunkard's FRS to measure body size among Chinese adolescents.

  12. A Case for Transforming the Criterion of a Predictive Validity Study

    ERIC Educational Resources Information Center

    Patterson, Brian F.; Kobrin, Jennifer L.

    2011-01-01

    This study presents a case for applying a transformation (Box and Cox, 1964) of the criterion used in predictive validity studies. The goals of the transformation were to better meet the assumptions of the linear regression model and to reduce the residual variance of fitted (i.e., predicted) values. Using data for the 2008 cohort of first-time,…

  13. Testing the Predictive Validity and Construct of Pathological Video Game Use

    PubMed Central

    Groves, Christopher L.; Gentile, Douglas; Tapscott, Ryan L.; Lynch, Paul J.

    2015-01-01

    Three studies assessed the construct of pathological video game use and tested its predictive validity. Replicating previous research, Study 1 produced evidence of convergent validity in 8th and 9th graders (N = 607) classified as pathological gamers. Study 2 replicated and extended the findings of Study 1 with college undergraduates (N = 504). Predictive validity was established in Study 3 by measuring cue reactivity to video games in college undergraduates (N = 254), such that pathological gamers were more emotionally reactive to and provided higher subjective appraisals of video games than non-pathological gamers and non-gamers. The three studies converged to show that pathological video game use seems similar to other addictions in its patterns of correlations with other constructs. Conceptual and definitional aspects of Internet Gaming Disorder are discussed. PMID:26694472

  14. [Attempt for development of rapid word reading test for children--evaluation of reliability and validity].

    PubMed

    Hashimoto, Ryusaku; Kashiwagi, Mitsuru; Suzuki, Shuhei

    2008-09-01

    We developed a rapid word reading test for examining the phonological processing ability of Japanese children. We prepared two versions of the test, version A and B. Each test has word and non-word tasks. Twenty-two healthy boys of third grade in primary schools participated in this validation study. For criterion related validity, we performed the serial Hiragana reading test, the sentence reading test, Raven's coloured progressive matrices (RCPM), the Token test for children, the Kana word dictation test, the standardized comprehension test of abstract words (SCTAW), and Trail Circle test. The reading times of the newly developed test correlated moderately or highly with those of the serial Hiragana reading test and the sentence reading test. However, the scores of the other tests (RCPM, Token test for children, Kana word dictation test, SCTAW, Trail Circle test) did not correlated with the reading time of the rapid word reading test. Test-retest reliabilities in the word tasks were more than moderate: 0.52 and 0.76 in versions A and B, while those in the non-word tasks were high: 0.91 and 0.88 in versions A and B. The correlation coefficient between versions A and B was 0.7 for the word tasks and 0.92 for the non-word tasks. This study showed that the rapid word reading test has substantial validity and reliability for testing the phonological processing ability of Japanese children. In addition, the non-word tasks were more suitable for selectively examining the speed of the grapheme to phoneme conversion process.

  15. Development, Sensibility, and Validity of a Systemic Autoimmune Rheumatic Disease Case Ascertainment Tool.

    PubMed

    Armstrong, Susan M; Wither, Joan E; Borowoy, Alan M; Landolt-Marticorena, Carolina; Davis, Aileen M; Johnson, Sindhu R

    2017-01-01

    Case ascertainment through self-report is a convenient but often inaccurate method to collect information. The purposes of this study were to develop, assess the sensibility, and validate a tool to identify cases of systemic autoimmune rheumatic diseases (SARD) in the outpatient setting. The SARD tool was administered to subjects sampled from specialty clinics. Determinants of sensibility - comprehensibility, feasibility, validity, and acceptability - were evaluated using a numeric rating scale from 1-7. Comprehensibility was evaluated using the Flesch Reading Ease and the Flesch-Kincaid Grade Level. Self-reported diagnoses were validated against medical records using Cohen's κ statistic. There were 141 participants [systemic lupus erythematosus (SLE), systemic sclerosis (SSc), rheumatoid arthritis, Sjögren syndrome (SS), inflammatory myositis (polymyositis/dermatomyositis; PM/DM), and controls] who completed the questionnaire. The Flesch Reading Ease score was 77.1 and the Flesch-Kincaid Grade Level was 4.4. Respondents endorsed (mean ± SD) comprehensibility (6.12 ± 0.92), feasibility (5.94 ± 0.81), validity (5.35 ± 1.10), and acceptability (3.10 ± 2.03). The SARD tool had a sensitivity of 0.91 (95% CI 0.88-0.94) and a specificity of 0.99 (95% CI 0.96-1.00). The agreement between the SARD tool and medical record was κ = 0.82 (95% CI 0.77-0.88). Subgroup analysis by SARD found κ coefficients for SLE to be κ = 0.88 (95% CI 0.79-0.97), SSc κ = 1.0 (95% CI 1.0-1.0), PM/DM κ = 0.72 (95% CI 0.49-0.95), and SS κ = 0.85 (95% CI 0.71-0.99). The screening questions had sensitivity ranging from 0.96 to 1.0 and specificity ranging from 0.88 to 1.0. This SARD case ascertainment tool has demonstrable sensibility and validity. The use of both screening and confirmatory questions confers added accuracy.

  16. A Comparison of Validity Rates between Paper-and-Pencil and Computerized Testing with the MMPI-2

    ERIC Educational Resources Information Center

    Blazek, Nicole L.; Forbey, Johnathan D.

    2011-01-01

    Although the use of computerized testing in psychopathology assessment has increased in recent years, limited research has examined the impact of this format in terms of potential differences in test validity rates. The current study explores potential differences in the rates of valid and invalid Minnesota Multiphasic Personality Inventory--2…

  17. Prevalence of Invalid Performance on Baseline Testing for Sport-Related Concussion by Age and Validity Indicator.

    PubMed

    Abeare, Christopher A; Messa, Isabelle; Zuccato, Brandon G; Merker, Bradley; Erdodi, Laszlo

    2018-03-12

    Estimated base rates of invalid performance on baseline testing (base rates of failure) for the management of sport-related concussion range from 6.1% to 40.0%, depending on the validity indicator used. The instability of this key measure represents a challenge in the clinical interpretation of test results that could undermine the utility of baseline testing. To determine the prevalence of invalid performance on baseline testing and to assess whether the prevalence varies as a function of age and validity indicator. This retrospective, cross-sectional study included data collected between January 1, 2012, and December 31, 2016, from a clinical referral center in the Midwestern United States. Participants included 7897 consecutively tested, equivalently proportioned male and female athletes aged 10 to 21 years, who completed baseline neurocognitive testing for the purpose of concussion management. Baseline assessment was conducted with the Immediate Postconcussion Assessment and Cognitive Testing (ImPACT), a computerized neurocognitive test designed for assessment of concussion. Base rates of failure on published ImPACT validity indicators were compared within and across age groups. Hypotheses were developed after data collection but prior to analyses. Of the 7897 study participants, 4086 (51.7%) were male, mean (SD) age was 14.71 (1.78) years, 7820 (99.0%) were primarily English speaking, and the mean (SD) educational level was 8.79 (1.68) years. The base rate of failure ranged from 6.4% to 47.6% across individual indicators. Most of the sample (55.7%) failed at least 1 of 4 validity indicators. The base rate of failure varied considerably across age groups (117 of 140 [83.6%] for those aged 10 years to 14 of 48 [29.2%] for those aged 21 years), representing a risk ratio of 2.86 (95% CI, 2.60-3.16; P < .001). The results for base rate of failure were surprisingly high overall and varied widely depending on the specific validity indicator and the age of the

  18. Author Response to Sabour (2018), "Comment on Hall et al. (2017), 'How to Choose Between Measures of Tinnitus Loudness for Clinical Research? A Report on the Reliability and Validity of an Investigator-Administered Test and a Patient-Reported Measure Using Baseline Data Collected in a Phase IIa Drug Trial'".

    PubMed

    Hall, Deborah A; Mehta, Rajnikant L; Fackrell, Kathryn

    2018-03-08

    The authors respond to a letter to the editor (Sabour, 2018) concerning the interpretation of validity in the context of evaluating treatment-related change in tinnitus loudness over time. The authors refer to several landmark methodological publications and an international standard concerning the validity of patient-reported outcome measurement instruments. The tinnitus loudness rating performed better against our reported acceptability criteria for (face and convergent) validity than did the tinnitus loudness matching test. It is important to distinguish between tests that evaluate the validity of measuring treatment-related change over time and tests that quantify the accuracy of diagnosing tinnitus as a case and non-case.

  19. A Method of Q-Matrix Validation for the Linear Logistic Test Model

    PubMed Central

    Baghaei, Purya; Hohensinn, Christine

    2017-01-01

    The linear logistic test model (LLTM) is a well-recognized psychometric model for examining the components of difficulty in cognitive tests and validating construct theories. The plausibility of the construct model, summarized in a matrix of weights, known as the Q-matrix or weight matrix, is tested by (1) comparing the fit of LLTM with the fit of the Rasch model (RM) using the likelihood ratio (LR) test and (2) by examining the correlation between the Rasch model item parameters and LLTM reconstructed item parameters. The problem with the LR test is that it is almost always significant and, consequently, LLTM is rejected. The drawback of examining the correlation coefficient is that there is no cut-off value or lower bound for the magnitude of the correlation coefficient. In this article we suggest a simulation method to set a minimum benchmark for the correlation between item parameters from the Rasch model and those reconstructed by the LLTM. If the cognitive model is valid then the correlation coefficient between the RM-based item parameters and the LLTM-reconstructed item parameters derived from the theoretical weight matrix should be greater than those derived from the simulated matrices. PMID:28611721

  20. Reliability and Concurrent Validity of Dynamic Rotator Stability Test-A Cross Sectional study.

    PubMed

    Binoy Mathew, K V; Eapen, Charu; Kumar, P Senthil

    2012-01-01

    To find intra rater and inter rater reliability of Dynamic Rotator Stability Test (DRST) and to find concurrent validity of Dynamic Rotator Stability Test (DRST) with University of Pennsylvania Shoulder Score (PENN) Scale. 40 subjects of either gender between the age group of 18-70 with painful shoulder conditions of musculoskeletal origin was selected through convenient sampling. Tester 1 and tester 2 administered DRST and PENN scale randomly. In a subgroup of 20 subjects DRST was administered by both the testers to find the inter rater reliability. 180° Standard Universal Goniometer was used to take measurements. For intra-rater reliability, all the test variables were showing highly significant correlation (p=.94 - 1). For inter -rater, with tester 2, test variables like position, ROM, force, direction of abnormal translation, pain during the test, compensatory movement during test were found to be significant (p=.71-1).only some variables of DRST showed significant correlation with PENN scale (P=.320-.450). Dynamic Rotator Stability Test has good intra rater and moderate inter rater reliability. Concurrent validity of Dynamic Rotator Stability Test was found to be poor when compared to PENN Shoulder Score.

  1. Validation of Cardiovascular Parameters During NASA's Functional Task Test

    NASA Technical Reports Server (NTRS)

    Arzeno, N. M.; Stenger, M. B.; Bloomberg, J. J.; Platts, Steven H.

    2008-01-01

    Microgravity-induced physiological changes, including cardiovascular deconditioning may impair crewmembers f capabilities during exploration missions on the Moon and Mars. The Functional Task Test (FTT), which will be used to assess task performance in short and long duration astronauts, consists of 7 functional tests to evaluate crewmembers f ability to perform activities to be conducted in a partial-gravity environment or following an emergency landing on Earth. The Recovery from Fall/Stand Test (RFST) tests both the subject fs ability to get up from a prone position and orthostatic intolerance. PURPOSE: Crewmembers have never become presyncopal in the first 3 min of quiet stand, yet it is unknown whether 3 min is long enough to cause similar heart rate fluctuations to a 5-min stand. The purpose of this study was to validate and test the reliability of heart rate variability (HRV) analysis of a 3-min quiet stand. METHODS: To determine the validity of using 3 vs. 5-min of standing to assess HRV, 7 healthy subjects remained in a prone position for 2 min, stood up quickly and stood quietly for 6 min. ECG and continuous blood pressure data were recorded. Mean R-R interval and spectral HRV were measured in minutes 0-3 and 0-5 following the heart rate transient due to standing. Significant differences between the segments were determined by a paired t-test. To determine the reliability of the 3-min stand test, 13 healthy subjects completed 3 trials of the complete FTT on separate days, including the RFST with a 3-min stand test. Analysis of variance (ANOVA) was performed on the HRV measures. RESULTS: Spectral HRV measures reflecting autonomic activity were not different (p>0.05) during the 0-3 and 0-5 min segment (mean R-R interval: 738+/-74 ms, 728+/-69 ms; low frequency to high frequency ratio: 6.5+/-2.2, 7.7+/-2.7; normalized high frequency: 0.19+/-0.03, 0.18+/-0.04). The average coefficient of variation for mean R-R interval, systolic and diastolic blood pressures

  2. Diagnostic validity of physical examination tests for common knee disorders: An overview of systematic reviews and meta-analysis.

    PubMed

    Décary, Simon; Ouellet, Philippe; Vendittoli, Pascal-André; Roy, Jean-Sébastien; Desmeules, François

    2017-01-01

    More evidence on diagnostic validity of physical examination tests for knee disorders is needed to lower frequently used and costly imaging tests. To conduct a systematic review of systematic reviews (SR) and meta-analyses (MA) evaluating the diagnostic validity of physical examination tests for knee disorders. A structured literature search was conducted in five databases until January 2016. Methodological quality was assessed using the AMSTAR. Seventeen reviews were included with mean AMSTAR score of 5.5 ± 2.3. Based on six SR, only the Lachman test for ACL injuries is diagnostically valid when individually performed (Likelihood ratio (LR+):10.2, LR-:0.2). Based on two SR, the Ottawa Knee Rule is a valid screening tool for knee fractures (LR-:0.05). Based on one SR, the EULAR criteria had a post-test probability of 99% for the diagnosis of knee osteoarthritis. Based on two SR, a complete physical examination performed by a trained health provider was found to be diagnostically valid for ACL, PCL and meniscal injuries as well as for cartilage lesions. When individually performed, common physical tests are rarely able to rule in or rule out a specific knee disorder, except the Lachman for ACL injuries. There is low-quality evidence concerning the validity of combining history elements and physical tests. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Perspectives on Validation of High-Throughput Assays Supporting 21st Century Toxicity Testing

    EPA Science Inventory

    In vitro high-throughput screening (HTS) assays are seeing increasing use in toxicity testing. HTS assays can simultaneously test many chemicals but have seen limited use in the regulatory arena, in part because of the need to undergo rigorous, time-consuming formal validation. ...

  4. Building Physics Test Cases | Buildings | NREL

    Science.gov Websites

    building physics test cases in BESTEST-EX. In these cases, the model inputs that describe the house are programs. This diagram provides an overview of the BESTEST-EX physics case process. On the left side of the diagram is a box labeled "BESTEST-EX Document" with a list that contains two bulleted items. The

  5. Establishing the Validity and Reliability of Course Evaluation Questionnaires

    ERIC Educational Resources Information Center

    Kember, David; Leung, Doris Y. P.

    2008-01-01

    This article uses the case of designing a new course questionnaire to discuss the issues of validity, reliability and diagnostic power in good questionnaire design. Validity is often not well addressed in course questionnaire design as there are no straightforward tests that can be applied to an individual instrument. The authors propose the…

  6. Criterion-Related Validity of Sit-and-Reach Tests for Estimating Hamstring and Lumbar Extensibility: a Meta-Analysis

    PubMed Central

    Mayorga-Vega, Daniel; Merino-Marban, Rafael; Viciana, Jesús

    2014-01-01

    The main purpose of the present meta-analysis was to examine the scientific literature on the criterion-related validity of sit-and-reach tests for estimating hamstring and lumbar extensibility. For this purpose relevant studies were searched from seven electronic databases dated up through December 2012. Primary outcomes of criterion-related validity were Pearson´s zero-order correlation coefficients (r) between sit-and-reach tests and hamstrings and/or lumbar extensibility criterion measures. Then, from the included studies, the Hunter- Schmidt´s psychometric meta-analysis approach was conducted to estimate population criterion- related validity of sit-and-reach tests. Firstly, the corrected correlation mean (rp), unaffected by statistical artefacts (i.e., sampling error and measurement error), was calculated separately for each sit-and-reach test. Subsequently, the three potential moderator variables (sex of participants, age of participants, and level of hamstring extensibility) were examined by a partially hierarchical analysis. Of the 34 studies included in the present meta-analysis, 99 correlations values across eight sit-and-reach tests and 51 across seven sit-and-reach tests were retrieved for hamstring and lumbar extensibility, respectively. The overall results showed that all sit-and-reach tests had a moderate mean criterion-related validity for estimating hamstring extensibility (rp = 0.46-0.67), but they had a low mean for estimating lumbar extensibility (rp = 0. 16-0.35). Generally, females, adults and participants with high levels of hamstring extensibility tended to have greater mean values of criterion-related validity for estimating hamstring extensibility. When the use of angular tests is limited such as in a school setting or in large scale studies, scientists and practitioners could use the sit-and-reach tests as a useful alternative for hamstring extensibility estimation, but not for estimating lumbar extensibility. Key Points Overall sit

  7. Video game addiction test: validity and psychometric characteristics.

    PubMed

    van Rooij, Antonius J; Schoenmakers, Tim M; van den Eijnden, Regina J J M; Vermulst, Ad A; van de Mheen, Dike

    2012-09-01

    The study explores the reliability, validity, and measurement invariance of the Video game Addiction Test (VAT). Game-addiction problems are often linked to Internet enabled online games; the VAT has the unique benefit that it is theoretically and empirically linked to Internet addiction. The study used data (n=2,894) from a large-sample paper-and-pencil questionnaire study, conducted in 2009 on secondary schools in Netherlands. Thus, the main source of data was a large sample of schoolchildren (aged 13-16 years). Measurements included the proposed VAT, the Compulsive Internet Use Scale, weekly hours spent on various game types, and several psychosocial variables. The VAT demonstrated excellent reliability, excellent construct validity, a one-factor model fit, and a high degree of measurement invariance across gender, ethnicity, and learning year, indicating that the scale outcomes can be compared across different subgroups with little bias. In summary, the VAT can be helpful in the further study of video game addiction, and it contributes to the debate on possible inclusion of behavioral addictions in the upcoming DSM-V.

  8. A Model-Based Method for Content Validation of Automatically Generated Test Items

    ERIC Educational Resources Information Center

    Zhang, Xinxin; Gierl, Mark

    2016-01-01

    The purpose of this study is to describe a methodology to recover the item model used to generate multiple-choice test items with a novel graph theory approach. Beginning with the generated test items and working backward to recover the original item model provides a model-based method for validating the content used to automatically generate test…

  9. Validity and Reproducibility of an Incremental Sit-To-Stand Exercise Test for Evaluating Anaerobic Threshold in Young, Healthy Individuals.

    PubMed

    Nakamura, Keisuke; Ohira, Masayoshi; Yokokawa, Yoshiharu; Nagasawa, Yuya

    2015-12-01

    a small space and a chair.The ISTS with arm support is valid and reproducible, and is a safe test for evaluating AT in healthy young adults.For evaluating the AT, the ISTS may serve as a valid alternative to conventional CPX, using either a cycle ergometer or treadmill, in cases where the latter methods are difficult to implement.

  10. Validation of a Syndromic Case Definition for Detecting Emergency Department Visits Potentially Related to Marijuana.

    PubMed

    DeYoung, Kathryn; Chen, Yushiuan; Beum, Robert; Askenazi, Michele; Zimmerman, Cali; Davidson, Arthur J

    Reliable methods are needed to monitor the public health impact of changing laws and perceptions about marijuana. Structured and free-text emergency department (ED) visit data offer an opportunity to monitor the impact of these changes in near-real time. Our objectives were to (1) generate and validate a syndromic case definition for ED visits potentially related to marijuana and (2) describe a method for doing so that was less resource intensive than traditional methods. We developed a syndromic case definition for ED visits potentially related to marijuana, applied it to BioSense 2.0 data from 15 hospitals in the Denver, Colorado, metropolitan area for the period September through October 2015, and manually reviewed each case to determine true positives and false positives. We used the number of visits identified by and the positive predictive value (PPV) for each search term and field to refine the definition for the second round of validation on data from February through March 2016. Of 126 646 ED visits during the first period, terms in 524 ED visit records matched ≥1 search term in the initial case definition (PPV, 92.7%). Of 140 932 ED visits during the second period, terms in 698 ED visit records matched ≥1 search term in the revised case definition (PPV, 95.7%). After another revision, the final case definition contained 6 keywords for marijuana or derivatives and 5 diagnosis codes for cannabis use, abuse, dependence, poisoning, and lung disease. Our syndromic case definition and validation method for ED visits potentially related to marijuana could be used by other public health jurisdictions to monitor local trends and for other emerging concerns.

  11. Development, test-retest reliability, and construct validity of the resistance training skills battery.

    PubMed

    Lubans, David R; Smith, Jordan J; Harries, Simon K; Barnett, Lisa M; Faigenbaum, Avery D

    2014-05-01

    The aim of this study was to describe the development and assess test-retest reliability and construct validity of the Resistance Training Skills Battery (RTSB) for adolescents. The RTSB provides an assessment of resistance training skill competency and includes 6 exercises (i.e., body weight squat, push-up, lunge, suspended row, standing overhead press, and front support with chest touches). Scoring for each skill is based on the number of performance criteria successfully demonstrated. An overall resistance training skill quotient (RTSQ) is created by adding participants' scores for the 6 skills. Participants (44 boys and 19 girls, mean age = 14.5 ± 1.2 years) completed the RTSB on 2 occasions separated by 7 days. Participants also completed the following fitness tests, which were used to create a muscular fitness score (MFS): handgrip strength, timed push-up, and standing long jump tests. Intraclass correlation (ICC), paired samples t-tests, and typical error were used to assess test-retest reliability. To assess construct validity, gender and RTSQ were entered into a regression model predicting MFS. The rank order repeatability of the RTSQ was high (ICC = 0.88). The model explained 39% of the variance in MFS (p ≤ 0.001) and RTSQ (r = 0.40, p ≤ 0.001) was a significant predictor. This study has demonstrated the construct validity and test-retest reliability of the RTSB in a sample of adolescents. The RTSB can reliably rank participants in regards to their resistance training competency and has the necessary sensitivity to detect small changes in resistance training skill proficiency.

  12. Development and Validation of a Computerized-Adaptive Test for PTSD (P-CAT).

    PubMed

    Eisen, Susan V; Schultz, Mark R; Ni, Pengsheng; Haley, Stephen M; Smith, Eric G; Spiro, Avron; Osei-Bonsu, Princess E; Nordberg, Sam; Jette, Alan M

    2016-10-01

    The primary purpose was to develop, field test, and validate a computerized-adaptive test (CAT) for posttraumatic stress disorder (PTSD) to enhance PTSD assessment and decrease the burden of symptom monitoring. Data sources included self-report and interviewer-administered diagnostic interviews. The sample included 1,288 veterans. In phase 1, 89 items from a previously developed PTSD item pool were administered to a national sample of 1,085 veterans. A multidimensional graded-response item response theory model was used to calibrate items for incorporation into a CAT for PTSD (P-CAT). In phase 2, in a separate sample of 203 veterans, the P-CAT was validated against three other self-report measures (PTSD Checklist, Civilian Version; Mississippi Scale for Combat-Related PTSD; and Primary Care PTSD Screen) and the PTSD module of the Structured Clinical Interview for DSM-IV. A bifactor model with one general PTSD factor and four subfactors consistent with DSM-5 (reexperiencing, avoidance, negative mood-cognitions, and arousal), yielded good fit. The P-CAT discriminated veterans with PTSD from those with other mental health conditions and those with no mental health conditions (Cohen's d effect sizes >.90). The P-CAT also discriminated those with and without a PTSD diagnosis and those who screened positive versus negative for PTSD. Concurrent validity was supported by high correlations (r=.85-.89) with the validation measures. The P-CAT appears to be a promising tool for efficient and accurate assessment of PTSD symptomatology. Further testing is needed to evaluate its responsiveness to change. With increasing availability of computers and other technologies, CAT may be a viable and efficient assessment method.

  13. POLYGON - A New Fundamental Movement Skills Test for 8 Year Old Children: Construction and Validation

    PubMed Central

    Zuvela, Frane; Bozanic, Ana; Miletic, Durdica

    2011-01-01

    Inadequately adopted fundamental movement skills (FMS) in early childhood may have a negative impact on the motor performance in later life (Gallahue and Ozmun, 2005). The need for an efficient FMS testing in Physical Education was recognized. The aim of this paper was to construct and validate a new FMS test for 8 year old children. Ninety-five 8 year old children were used for the testing. A total of 24 new FMS tasks were constructed and only the best representatives of movement areas entered into the final test product - FMS-POLYGON. The ICC showed high values for all 24 tasks (0.83-0.97) and the factorial analysis revealed the best representatives of each movement area that entered the FMS-POLYGON: tossing and catching the volleyball against a wall, running across obstacles, carrying the medicine balls, and straight running. The ICC for the FMS-POLYGON showed a very high result (0.98) and, therefore, confirmed the test’s intra-rater reliability. Concurrent validity was tested with the use of the “Test of Gross Motor Development” (TGMD-2). Correlation analysis between the newly constructed FMS-POLYGON and the TGMD-2 revealed the coefficient of -0.82 which indicates a high correlation. In conclusion, the new test for FMS assessment proved to be a reliable and valid instrument for 8 year old children. Application of this test in schools is justified and could play an important factor in physical education and sport practice. Key points All 21 newly constructed tasks demonstrated high intra-rater reliability (0.83-0.97) in FMS assessment. High reliability was also noted in the FMS-POLYGON test (0.98). A high correlation was found between the FMS-POLYGON and TGMD-2 which is a confirmation of the new test’s concurrent validity. The research resolved the problem of long and detailed FMS assessment by adding a new dimension using quick and effective norm-referenced approach but also covering all the most important movement areas. New and validated test can be

  14. Validation of a novel saliva-based ELISA test for diagnosing tapeworm burden in horses.

    PubMed

    Lightbody, Kirsty L; Davis, Paul J; Austin, Corrine J

    2016-06-01

    Tapeworm infections pose a significant threat to equine health as they are associated with clinical cases of colic. Diagnosis of tapeworm burden using fecal egg counts (FECs) is unreliable, and, although a commercial serologic ELISA for anti-tapeworm antibodies is available, it requires a veterinarian to collect the blood sample. A reliable diagnostic test using an owner-accessible sample such as saliva could provide a cost-effective alternative for tapeworm testing in horses, and allow targeted deworming strategies. The purpose of the study was to statistically validate a saliva tapeworm ELISA test and compare to a tapeworm-specific IgG(T) serologic ELISA. Serum samples (139) and matched saliva samples (104) were collected from horses at a UK abattoir. The ileocecal junction and cecum were visually examined for tapeworms and any present were counted. Samples were analyzed using a serologic ELISA and the saliva tapeworm test. The test results were compared to tapeworm numbers and the various data sets were statistically analyzed. Saliva scores had strong positive correlations with both infection intensity (0.74) and serologic results (Spearman's rank coefficients; 0.74 and 0.86, respectively). The saliva tapeworm test was capable of identifying the presence of one or more tapeworms with 83% sensitivity and 85% specificity. Importantly, no high-burden (more than 20 tapeworms) horses were misdiagnosed. The saliva tapeworm test has statistical accuracy for detecting tapeworm burdens in horses with 83% sensitivity and 85% specificity, similar to those of the serologic ELISA (85% and 78%, respectively). © 2016 American Society for Veterinary Clinical Pathology.

  15. The validity and reliability of the ADL-Glittre test for children.

    PubMed

    Martins, Renata; Assumpção, Maíra S de; Bobbio, Tatiana G; Mayer, Anamaria F; Schivinski, Camila

    2018-04-16

    The ADL-Glittre was created to assess more comprehensively the essential activities of daily living in adults with chronic obstructive pulmonary disease. The aim of this study was to validate the ADL-Glittre test adapted for children (TGlittre-P) and verify its reliability. This is a cross-sectional study with 87 healthy children aged 6 to 14 years (mean 10.36 ± 2.32 years). Biometric and spirometry data were collected from all participants. On the same day, part of the sample (36 children included in the validation process) performed two 6MWT and two TGlittre-P (30-minute interval between them). The other part of the sample just performed two TGlittre-P for the reliability process. Pearson and Spearman correlation tests were used to verify the correlation between the time spent on the TGlittre-P and the distance walked in the 6MWT. The intraclass correlation coefficient (ICC) was also used to assess the reproducibility of the TGlittre-P. The TGlittre-P showed a moderate negative correlation with the 6MWT (r = -0.490; p = 0.002; 95%CI -0.712 to -0.233). However, the behavior of the physiological variables that were monitored during the tests was similar and showed to be reproducible (ICC = 0.843; p = 0.000; 95%CI 0.695 to 0.911). The TGlittre-P proved to be a valid and reliable assessment of the functional capacity of healthy children aged 6 to 14 years.

  16. Reliability and validity of functional performance tests in dancers with hip dysfunction.

    PubMed

    Kivlan, Benjamin R; Carcia, Christopher R; Clemente, F Richard; Phelps, Amy L; Martin, Robroy L

    2013-08-01

    Quasi-experimental, repeated measures. Functional performance tests that identify hip joint impairments and assess the effect of intervention have not been adequately described for dancers. The purpose of this study was to examine the reliability and validity of hop and balance tests among a group of dancers with musculoskeletal pain in the hip region. NINETEEN FEMALE DANCERS (AGE: 18.90±1.11 years; height: 164.85±6.95 cm; weight: 60.37±8.29 kg) with unilateral hip pain were assessed utilizing the cross-over reach, medial triple hop, lateral triple hop, and cross-over hop tests on two occasions, 2 days apart. Test-retest reliability and comparisons between the involved and uninvolved side for each respective test were determined. Intra-class correlation coefficients for the functional performance tests ranged from 0.89-0.96. The cross-over reach test had a SEM of 2.79 cm and a MDC of 7.73 cm. The medial and lateral triple hop tests had SEM values of 7.51 cm and 8.17 cm, and MDC values of 20.81 cm and 22.62 cm, respectively. The SEM was 0.15 seconds and the MDC was 0.42 seconds for the cross-over hop test. Performance on the medial triple hop test was significantly less on the involved side (370.21±38.26 cm) compared to the uninvolved side (388.05±41.49 cm); t(18) = -4.33, p<0.01. The side-to-side comparisons of the cross-over reach test (involved mean=61.68±10.9 cm; uninvolved mean=61.69±8.63 cm); t(18) = -0.004, p=0.99, lateral triple hop test (involved mean=306.92±35.79 cm; uninvolved mean=310.68±24.49 cm); t(18) = -0.55, p=0.59, and cross-over hop test (involved mean=2.49±0.34 seconds; uninvolved mean= 2.61±0.42 seconds; t(18) = -1.84, p=0.08) were not statistically different between sides. The functional performance tests used in this study can be reliably performed on dancers with unilateral hip pain. The medial triple hop test was the only functional performance test with evidence of validity in side-to-side comparisons. These results suggest that

  17. RELIABILITY AND VALIDITY OF FUNCTIONAL PERFORMANCE TESTS IN DANCERS WITH HIP DYSFUNCTION

    PubMed Central

    Carcia, Christopher R.; Clemente, F. Richard; Phelps, Amy L.; Martin, RobRoy L.

    2013-01-01

    Study Design: Quasi-experimental, repeated measures. Purpose/Background: Functional performance tests that identify hip joint impairments and assess the effect of intervention have not been adequately described for dancers. The purpose of this study was to examine the reliability and validity of hop and balance tests among a group of dancers with musculoskeletal pain in the hip region. Methods: Nineteen female dancers (age: 18.90±1.11 years; height: 164.85±6.95 cm; weight: 60.37±8.29 kg) with unilateral hip pain were assessed utilizing the cross-over reach, medial triple hop, lateral triple hop, and cross-over hop tests on two occasions, 2 days apart. Test-retest reliability and comparisons between the involved and uninvolved side for each respective test were determined. Results: Intra-class correlation coefficients for the functional performance tests ranged from 0.89-0.96. The cross-over reach test had a SEM of 2.79 cm and a MDC of 7.73 cm. The medial and lateral triple hop tests had SEM values of 7.51 cm and 8.17 cm, and MDC values of 20.81 cm and 22.62 cm, respectively. The SEM was 0.15 seconds and the MDC was 0.42 seconds for the cross-over hop test. Performance on the medial triple hop test was significantly less on the involved side (370.21±38.26 cm) compared to the uninvolved side (388.05±41.49 cm); t(18) = −4.33, p<0.01. The side-to-side comparisons of the cross-over reach test (involved mean=61.68±10.9 cm; uninvolved mean=61.69±8.63 cm); t(18) = −0.004, p=0.99, lateral triple hop test (involved mean=306.92±35.79 cm; uninvolved mean=310.68±24.49 cm); t(18) = −0.55, p=0.59, and cross-over hop test (involved mean=2.49±0.34 seconds; uninvolved mean= 2.61±0.42 seconds; t(18) = −1.84, p=0.08) were not statistically different between sides. Conclusion: The functional performance tests used in this study can be reliably performed on dancers with unilateral hip pain. The medial triple hop test was the only functional performance test with

  18. Official Position of the American Academy of Clinical Neuropsychology Social Security Administration Policy on Validity Testing: Guidance and Recommendations for Change.

    PubMed

    Chafetz, M D; Williams, M A; Ben-Porath, Y S; Bianchini, K J; Boone, K B; Kirkwood, M W; Larrabee, G J; Ord, J S

    2015-01-01

    The milestone publication by Slick, Sherman, and Iverson (1999) of criteria for determining malingered neurocognitive dysfunction led to extensive research on validity testing. Position statements by the National Academy of Neuropsychology and the American Academy of Clinical Neuropsychology (AACN) recommended routine validity testing in neuropsychological evaluations. Despite this widespread scientific and professional support, the Social Security Administration (SSA) continued to discourage validity testing, a stance that led to a congressional initiative for SSA to reevaluate their position. In response, SSA commissioned the Institute of Medicine (IOM) to evaluate the science concerning the validation of psychological testing. The IOM concluded that validity assessment was necessary in psychological and neuropsychological examinations (IOM, 2015 ). The AACN sought to provide independent expert guidance and recommendations concerning the use of validity testing in disability determinations. A panel of contributors to the science of validity testing and its application to the disability process was charged with describing why the disability process for SSA needs improvement, and indicating the necessity for validity testing in disability exams. This work showed how the determination of malingering is a probability proposition, described how different types of validity tests are appropriate, provided evidence concerning non-credible findings in children and low-functioning individuals, and discussed the appropriate evaluation of pain disorders typically seen outside of mental consultations. A scientific plan for validity assessment that additionally protects test security is needed in disability determinations and in research on classification accuracy of disability decisions.

  19. Validation of a computer case definition for sudden cardiac death in opioid users.

    PubMed

    Kawai, Vivian K; Murray, Katherine T; Stein, C Michael; Cooper, William O; Graham, David J; Hall, Kathi; Ray, Wayne A

    2012-08-31

    To facilitate the use of automated databases for studies of sudden cardiac death, we previously developed a computerized case definition that had a positive predictive value between 86% and 88%. However, the definition has not been specifically validated for prescription opioid users, for whom out-of-hospital overdose deaths may be difficult to distinguish from sudden cardiac death. We assembled a cohort of persons 30-74 years of age prescribed propoxyphene or hydrocodone who had no life-threatening non-cardiovascular illness, diagnosed drug abuse, residence in a nursing home in the past year, or hospital stay within the past 30 days. Medical records were sought for a sample of 140 cohort deaths within 30 days of a prescription fill meeting the computer case definition. Of the 140 sampled deaths, 81 were adjudicated; 73 (90%) were sudden cardiac deaths. Two deaths had possible opioid overdose; after removing these two the positive predictive value was 88%. These findings are consistent with our previous validation studies and suggest the computer case definition of sudden cardiac death is a useful tool for pharmacoepidemiologic studies of opioid analgesics.

  20. Wheelchair Shuttle Test for Assessing Aerobic Fitness in Youth With Spina Bifida: Validity and Reliability

    PubMed Central

    de Groot, Janke F.; Backx, Frank J.G.; Benner, Joyce; Kruitwagen, Cas L.J.J.; Takken, Tim

    2017-01-01

    Abstract Background Testing aerobic fitness in youth is important because of expected relationships with health. Objective The purpose of the study was to estimate the validity and reliability of the Shuttle Ride Test in youth who have spina bifida and use a wheelchair for mobility and sport. Design Ths study is a validity and reliability study. Methods The Shuttle Ride Test, Graded Wheelchair Propulsion Test, and skill-related fitness tests were administered to 33 participants for the validity study (age = 14.5 ± 3.1 y) and to 28 participants for the reliability study (age = 14.7 ± 3.3 y). Results No significant differences were found between the Graded Wheelchair Propulsion Test and the Shuttle Ride Test for most cardiorespiratory responses. Correlations between the Graded Wheelchair Propulsion Test and the Shuttle Ride Test were moderate to high (r = .55–.97). The variance in peak oxygen uptake (VO2peak) could be predicted for 77% of the participants by height, number of shuttles completed, and weight, with large prediction intervals. High correlations were found between number of shuttles completed and skill-related fitness tests (CI = .73 to −.92). Intraclass correlation coefficients were high (.77–.98), with a smallest detectable change of 1.5 for number of shuttles completed and with coefficients of variation of 6.2% and 6.4% for absolute VO2peak and relative VO2peak, respectively. Conclusions When measuring VO2peak directly by using a mobile gas analysis system, the Shuttle Ride Test is highly valid for testing VO2peak in youth who have spina bifida and use a wheelchair for mobility and sport. The outcome measure of number of shuttles represents aerobic fitness and is also highly correlated with both anaerobic performance and agility. It is not possible to predict VO2peak accurately by using the number of shuttles completed. Moreover, the Shuttle Ride Test is highly reliable in youth with spina bifida, with a good smallest detectable change for the

  1. Validation of a Dumbbell Body Sway Test in Olympic Air Pistol Shooting

    PubMed Central

    Mon, Daniel; Zakynthinaki, Maria S.; Cordente, Carlos A.; Monroy Antón, Antonio; López Jiménez, David

    2014-01-01

    We present and validate a test able to provide reliable body sway measurements in air pistol shooting, without the use of a gun. 46 senior male pistol shooters who participated in Spanish air pistol championships participated in the study. Body sway data of two static bipodal balance tests have been compared: during the first test, shooting was simulated by use of a dumbbell, while during the second test the shooters own pistol was used. Both tests were performed the day previous to the competition, during the official training time and at the training stands to simulate competition conditions. The participantś performance was determined as the total score of 60 shots at competition. Apart from the commonly used variables that refer to movements of the shooters centre of pressure (COP), such as COP displacements on the X and Y axes, maximum and average COP velocities and total COP area, the present analysis also included variables that provide information regarding the axes of the COP ellipse (length and angle in respect to X). A strong statistically significant correlation between the two tests was found (with an interclass correlation varying between 0.59 and 0.92). A statistically significant inverse linear correlation was also found between performance and COP movements. The study concludes that dumbbell tests are perfectly valid for measuring body sway by simulating pistol shooting. PMID:24756067

  2. Evaluation of PDA Technical Report No 33. Statistical Testing Recommendations for a Rapid Microbiological Method Case Study.

    PubMed

    Murphy, Thomas; Schwedock, Julie; Nguyen, Kham; Mills, Anna; Jones, David

    2015-01-01

    New recommendations for the validation of rapid microbiological methods have been included in the revised Technical Report 33 release from the PDA. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This case study applies those statistical methods to accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological methods system being evaluated for water bioburden testing. Results presented demonstrate that the statistical methods described in the PDA Technical Report 33 chapter can all be successfully applied to the rapid microbiological method data sets and gave the same interpretation for equivalence to the standard method. The rapid microbiological method was in general able to pass the requirements of PDA Technical Report 33, though the study shows that there can be occasional outlying results and that caution should be used when applying statistical methods to low average colony-forming unit values. Prior to use in a quality-controlled environment, any new method or technology has to be shown to work as designed by the manufacturer for the purpose required. For new rapid microbiological methods that detect and enumerate contaminating microorganisms, additional recommendations have been provided in the revised PDA Technical Report No. 33. The changes include a more comprehensive review of the statistical methods to be used to analyze data obtained during validation. This paper applies those statistical methods to analyze accuracy, precision, ruggedness, and equivalence data obtained using a rapid microbiological method system being validated for water bioburden testing. The case study demonstrates that the statistical methods described in the PDA Technical Report No. 33 chapter can be successfully applied to rapid microbiological method data sets and give the same comparability results for similarity or difference as the standard method. © PDA, Inc

  3. TOPEX Microwave Radiometer - Thermal design verification test and analytical model validation

    NASA Technical Reports Server (NTRS)

    Lin, Edward I.

    1992-01-01

    The testing of the TOPEX Microwave Radiometer (TMR) is described in terms of hardware development based on the modeling and thermal vacuum testing conducted. The TMR and the vacuum-test facility are described, and the thermal verification test includes a hot steady-state segment, a cold steady-state segment, and a cold survival mode segment totalling 65 hours. A graphic description is given of the test history which is related temperature tracking, and two multinode TMR test-chamber models are compared to the test results. Large discrepancies between the test data and the model predictions are attributed to contact conductance, effective emittance from the multilayer insulation, and heat leaks related to deviations from the flight configuration. The TMR thermal testing/modeling effort is shown to provide technical corrections for the procedure outlined, and the need for validating predictive models is underscored.

  4. A Structured Clinical Interview for Kleptomania (SCI-K): preliminary validity and reliability testing.

    PubMed

    Grant, Jon E; Kim, Suck Won; McCabe, James S

    2006-06-01

    Kleptomania presents difficulties in diagnosis for clinicians. This study aimed to develop and test a DSM-IV-based diagnostic instrument for kleptomania. To assess for current kleptomania the Structured Clinical Interview for Kleptomania (SCI-K) was administered to 112 consecutive subjects requesting psychiatric outpatient treatment for a variety of disorders. Reliability and validity were determined. Classification accuracy was examined using the longitudinal course of illness. The SCI-K demonstrated excellent test-retest (Phi coefficient = 0.956 (95% CI = 0.937, 0.970)) and inter-rater reliability (phi coefficient = 0.718 (95% CI = 0.506, 0.848)) in the diagnosis of kleptomania. Concurrent validity was observed with a self-report measure using DSM-IV kleptomania criteria (phi coefficient = 0.769 (95% CI = 0.653, 0.850)). Discriminant validity was observed with a measure of depression (point biserial coefficient = -0.020 (95% CI = -0.205, 0.166)). The SCI-K demonstrated both high sensitivity and specificity based on longitudinal assessment. The SCI-K demonstrated excellent reliability and validity in diagnosing kleptomania in subjects presenting with various psychiatric problems. These findings require replication in larger groups, including non-psychiatric populations, to examine their generalizability. Copyright (c) 2006 John Wiley & Sons, Ltd.

  5. Analytical Validation of the ReEBOV Antigen Rapid Test for Point-of-Care Diagnosis of Ebola Virus Infection.

    PubMed

    Cross, Robert W; Boisen, Matthew L; Millett, Molly M; Nelson, Diana S; Oottamasathien, Darin; Hartnett, Jessica N; Jones, Abigal B; Goba, Augustine; Momoh, Mambu; Fullah, Mohamed; Bornholdt, Zachary A; Fusco, Marnie L; Abelson, Dafna M; Oda, Shunichiro; Brown, Bethany L; Pham, Ha; Rowland, Megan M; Agans, Krystle N; Geisbert, Joan B; Heinrich, Megan L; Kulakosky, Peter C; Shaffer, Jeffrey G; Schieffelin, John S; Kargbo, Brima; Gbetuwa, Momoh; Gevao, Sahr M; Wilson, Russell B; Saphire, Erica Ollmann; Pitts, Kelly R; Khan, Sheik Humarr; Grant, Donald S; Geisbert, Thomas W; Branco, Luis M; Garry, Robert F

    2016-10-15

    Ebola virus disease (EVD) is a severe viral illness caused by Ebola virus (EBOV). The 2013-2016 EVD outbreak in West Africa is the largest recorded, with >11 000 deaths. Development of the ReEBOV Antigen Rapid Test (ReEBOV RDT) was expedited to provide a point-of-care test for suspected EVD cases. Recombinant EBOV viral protein 40 antigen was used to derive polyclonal antibodies for RDT and enzyme-linked immunosorbent assay development. ReEBOV RDT limits of detection (LOD), specificity, and interference were analytically validated on the basis of Food and Drug Administration (FDA) guidance. The ReEBOV RDT specificity estimate was 95% for donor serum panels and 97% for donor whole-blood specimens. The RDT demonstrated sensitivity to 3 species of Ebolavirus (Zaire ebolavirus, Sudan ebolavirus, and Bundibugyo ebolavirus) associated with human disease, with no cross-reactivity by pathogens associated with non-EBOV febrile illness, including malaria parasites. Interference testing exhibited no reactivity by medications in common use. The LOD for antigen was 4.7 ng/test in serum and 9.4 ng/test in whole blood. Quantitative reverse transcription-polymerase chain reaction testing of nonhuman primate samples determined the range to be equivalent to 3.0 × 10 5 -9.0 × 10 8 genomes/mL. The analytical validation presented here contributed to the ReEBOV RDT being the first antigen-based assay to receive FDA and World Health Organization emergency use authorization for this EVD outbreak, in February 2015. © The Author 2016. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail journals.permissions@oup.com.

  6. Do placebo based validation standards mimic real batch products behaviour? Case studies.

    PubMed

    Bouabidi, A; Talbi, M; Bouklouze, A; El Karbane, M; Bourichi, H; El Guezzar, M; Ziemons, E; Hubert, Ph; Rozet, E

    2011-06-01

    Analytical methods validation is a mandatory step to evaluate the ability of developed methods to provide accurate results for their routine application. Validation usually involves validation standards or quality control samples that are prepared in placebo or reconstituted matrix made of a mixture of all the ingredients composing the drug product except the active substance or the analyte under investigation. However, one of the main concerns that can be made with this approach is that it may lack an important source of variability that come from the manufacturing process. The question that remains at the end of the validation step is about the transferability of the quantitative performance from validation standards to real authentic drug product samples. In this work, this topic is investigated through three case studies. Three analytical methods were validated using the commonly spiked placebo validation standards at several concentration levels as well as using samples coming from authentic batch samples (tablets and syrups). The results showed that, depending on the type of response function used as calibration curve, there were various degrees of differences in the results accuracy obtained with the two types of samples. Nonetheless the use of spiked placebo validation standards was showed to mimic relatively well the quantitative behaviour of the analytical methods with authentic batch samples. Adding these authentic batch samples into the validation design may help the analyst to select and confirm the most fit for purpose calibration curve and thus increase the accuracy and reliability of the results generated by the method in routine application. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. Arc Jet Testing of Thermal Protection Materials: 3 Case Studies

    NASA Technical Reports Server (NTRS)

    Johnson, Sylvia; Conley, Joe

    2015-01-01

    Arc jet testing is used to simulate entry to test thermal protection materials. This paper discusses the usefulness of arc jet testing for 3 cases. Case 1 is MSL and PICA, Case 2 is Advanced TUFROC, and Case 3 is conformable ablators.

  8. [Validity of AUDIT test for detection of disorders related with alcohol consumption in women].

    PubMed

    Pérula-de Torres, Luis Angel; Fernández-García, José Angel; Arias-Vega, Raquel; Muriel-Palomino, María; Márquez-Rebollo, Encarnación; Ruiz-Moral, Roger

    2005-11-26

    Early detection of patients with alcohol problems is important in clinical practice. The AUDIT (Alcohol Use Disorders Identification Test) questionnaire is a valid tool for this aim, especially in the male population. The objective of this study was to validate how useful is this questionnaire in females patients and to assess their test cut-off point for the diagnosis of alcohol problems in women. 414 woman were recruited in 2 health center and specialized center for addiction treatment. The AUDIT test and a semistructured interview (SCAN as gold standard) were performed to all patients. Internal consistency and criteria validity was assessed. Cronbach alpha was 0.93 (95% confidence interval [CI], 0.921-0.941). When the DSM-IV was taken as reference the most useful cut-off point was 6 points, with 89.6% (95% CI, 76.11-96.02) sensitivity and 95.07% (95% CI, 92.18-96.97) specificity. When CIE-10 was taken as reference the sensitivity was 89.58% (95% CI, 76.56-96.10) and the specificity was 95.33% (95% CI, 92.48-97.17). AUDIT is a questionnaire with good psychometrics properties and is valid for detecting dependence and risk alcohol consumption in women.

  9. Real-Time Sensor Validation, Signal Reconstruction, and Feature Detection for an RLV Propulsion Testbed

    NASA Technical Reports Server (NTRS)

    Jankovsky, Amy L.; Fulton, Christopher E.; Binder, Michael P.; Maul, William A., III; Meyer, Claudia M.

    1998-01-01

    A real-time system for validating sensor health has been developed in support of the reusable launch vehicle program. This system was designed for use in a propulsion testbed as part of an overall effort to improve the safety, diagnostic capability, and cost of operation of the testbed. The sensor validation system was designed and developed at the NASA Lewis Research Center and integrated into a propulsion checkout and control system as part of an industry-NASA partnership, led by Rockwell International for the Marshall Space Flight Center. The system includes modules for sensor validation, signal reconstruction, and feature detection and was designed to maximize portability to other applications. Review of test data from initial integration testing verified real-time operation and showed the system to perform correctly on both hard and soft sensor failure test cases. This paper discusses the design of the sensor validation and supporting modules developed at LeRC and reviews results obtained from initial test cases.

  10. Validating a dance-specific screening test for balance: preliminary results from multisite testing.

    PubMed

    Batson, Glenna

    2010-09-01

    Few dance-specific screening tools adequately capture balance. The aim of this study was to administer and modify the Star Excursion Balance Test (oSEBT) to examine its utility as a balance screen for dancers. The oSEBT involves standing on one leg while lightly targeting with the opposite foot to the farthest distance along eight spokes of a star-shaped grid. This task simulates dance in the spatial pattern and movement quality of the gesturing limb. The oSEBT was validated for distance on athletes with history of ankle sprain. Thirty-three dancers (age 20.1 +/- 1.4 yrs) participated from two contemporary dance conservatories (UK and US), with or without a history of lower extremity injury. Dancers were verbally instructed (without physical demonstration) to execute the oSEBT and four modifications (mSEBT): timed (speed), timed with cognitive interference (answering questions aloud), and sensory disadvantaging (foam mat). Stepping strategies were tracked and performance strategies video-recorded. Unlike the oSEBT results, distances reached were not significant statistically (p = 0.05) or descriptively (i.e., shorter) for either group. Performance styles varied widely, despite sample homogeneity and instructions to control for strategy. Descriptive analysis of mSEBT showed an increased number of near-falls and decreased timing on the injured limb. Dancers appeared to employ variable strategies to keep balance during this test. Quantitative analysis is warranted to define balance strategies for further validation of SEBT modifications to determine its utility as a balance screening tool.

  11. Assessment of a condition-specific quality-of-life measure for patients with developmentally absent teeth: validity and reliability testing.

    PubMed

    Akram, A J; Ireland, A J; Postlethwaite, K C; Sandy, J R; Jerreat, A S

    2013-11-01

    This article describes the process of validity and reliability testing of a condition-specific quality-of-life measure for patients with hypodontia presenting for orthodontic treatment. The development of the instrument is described in a previous article. Royal Devon and Exeter NHS Foundation Trust & Musgrove Park Hospital, Taunton. The child perception questionnaire was used as a standard against which to test criterion validity. The Bland and Altman method was used to check agreement between the two questionnaires. Construct validity was tested using principal component analysis on the four sections of the questionnaire. Test-retest reliability was tested using intraclass correlation coefficient and Bland and Altman method. Cronbach's alpha was used to test internal consistency reliability. Overall the questionnaire showed good reliability, criterion and construct validity. This together with previous evidence of good face and content validity suggests that the instrument may prove useful in clinical practice and further research. This study has demonstrated that the newly developed condition-specific quality-of-life questionnaire is both valid and reliable for use in young patients with hypodontia. © 2013 John Wiley & Sons A/S. Published by Blackwell Publishing Ltd.

  12. Portuguese-language version of the COPD Assessment Test: validation for use in Brazil*

    PubMed Central

    da Silva, Guilherme Pinheiro Ferreira; Morano, Maria Tereza Aguiar Pessoa; Viana, Cyntia Maria Sampaio; Magalhães, Clarissa Bentes de Araujo; Pereira, Eanes Delgado Barros

    2013-01-01

    OBJECTIVE: To validate a Portuguese-language version of the COPD assessment test (CAT) for use in Brazil and to assess the reproducibility of this version. METHODS: This was multicenter study involving patients with stable COPD at two teaching hospitals in the city of Fortaleza, Brazil. Two independent observers (twice in one day) administered the Portuguese-language version of the CAT to 50 patients with COPD. One of those observers again administered the scale to the same patients one week later. At baseline, the patients were submitted to pulmonary function testing and the six-minute walk test (6MWT), as well as completing the previously validated Portuguese-language versions of the Saint George's Respiratory Questionnaire (SGRQ), modified Medical Research Council (MMRC) dyspnea scale, and hospital anxiety and depression scale (HADS). RESULTS: Inter-rater and intra-rater reliability was excellent (intraclass correlation coefficient [ICC] = 0.96; 95% CI: 0.93-0.97; p < 0.001; and ICC = 0.98; 95% CI: 0.96-0.98; p < 0.001, respectively). Bland Altman plots showed good test-retest reliability. The CAT total score correlated significantly with spirometry results, 6MWT distance, SGRQ scores, MMRC dyspnea scale scores, and HADS-depression scores. CONCLUSIONS: The Portuguese-language version of the CAT is a valid, reproducible, and reliable instrument for evaluating patients with COPD in Brazil. PMID:24068260

  13. Analytic Validation of Immunohistochemistry Assays: New Benchmark Data From a Survey of 1085 Laboratories.

    PubMed

    Stuart, Lauren N; Volmar, Keith E; Nowak, Jan A; Fatheree, Lisa A; Souers, Rhona J; Fitzgibbons, Patrick L; Goldsmith, Jeffrey D; Astles, J Rex; Nakhleh, Raouf E

    2017-09-01

    - A cooperative agreement between the College of American Pathologists (CAP) and the United States Centers for Disease Control and Prevention was undertaken to measure laboratories' awareness and implementation of an evidence-based laboratory practice guideline (LPG) on immunohistochemical (IHC) validation practices published in 2014. - To establish new benchmark data on IHC laboratory practices. - A 2015 survey on IHC assay validation practices was sent to laboratories subscribed to specific CAP proficiency testing programs and to additional nonsubscribing laboratories that perform IHC testing. Specific questions were designed to capture laboratory practices not addressed in a 2010 survey. - The analysis was based on responses from 1085 laboratories that perform IHC staining. Ninety-six percent (809 of 844) always documented validation of IHC assays. Sixty percent (648 of 1078) had separate procedures for predictive and nonpredictive markers, 42.7% (220 of 515) had procedures for laboratory-developed tests, 50% (349 of 697) had procedures for testing cytologic specimens, and 46.2% (363 of 785) had procedures for testing decalcified specimens. Minimum case numbers were specified by 85.9% (720 of 838) of laboratories for nonpredictive markers and 76% (584 of 768) for predictive markers. Median concordance requirements were 95% for both types. For initial validation, 75.4% (538 of 714) of laboratories adopted the 20-case minimum for nonpredictive markers and 45.9% (266 of 579) adopted the 40-case minimum for predictive markers as outlined in the 2014 LPG. The most common method for validation was correlation with morphology and expected results. Laboratories also reported which assay changes necessitated revalidation and their minimum case requirements. - Benchmark data on current IHC validation practices and procedures may help laboratories understand the issues and influence further refinement of LPG recommendations.

  14. Initial validation of a web-based self-administered neuropsychological test battery for older adults and seniors.

    PubMed

    Hansen, Tor Ivar; Haferstrom, Elise Christina D; Brunner, Jan F; Lehn, Hanne; Håberg, Asta Kristine

    2015-01-01

    Computerized neuropsychological tests are effective in assessing different cognitive domains, but are often limited by the need of proprietary hardware and technical staff. Web-based tests can be more accessible and flexible. We aimed to investigate validity, effects of computer familiarity, education, and age, and the feasibility of a new web-based self-administered neuropsychological test battery (Memoro) in older adults and seniors. A total of 62 (37 female) participants (mean age 60.7 years) completed the Memoro web-based neuropsychological test battery and a traditional battery composed of similar tests intended to measure the same cognitive constructs. Participants were assessed on computer familiarity and how they experienced the two batteries. To properly test the factor structure of Memoro, an additional factor analysis in 218 individuals from the HUNT population was performed. Comparing Memoro to traditional tests, we observed good concurrent validity (r = .49-.63). The performance on the traditional and Memoro test battery was consistent, but differences in raw scores were observed with higher scores on verbal memory and lower in spatial memory in Memoro. Factor analysis indicated two factors: verbal and spatial memory. There were no correlations between test performance and computer familiarity after adjustment for age or age and education. Subjects reported that they preferred web-based testing as it allowed them to set their own pace, and they did not feel scrutinized by an administrator. Memoro showed good concurrent validity compared to neuropsychological tests measuring similar cognitive constructs. Based on the current results, Memoro appears to be a tool that can be used to assess cognitive function in older and senior adults. Further work is necessary to ascertain its validity and reliability.

  15. Further validation of the gratitude, resentment, and appreciation test (GRAT).

    PubMed

    Diessner, Rhett; Lewis, Gay

    2007-08-01

    The authors conducted this study to further validate the revised short form of the Gratitude, Resentment, and Appreciation Test by investigating the relationship between GRAT-measured gratitude and two other constructs: (a) spiritual transcendence and (b) materialism. As predicted, both the GRAT and its subscales correlated positively with a measure of spiritual transcendence and negatively with a measure of materialism.

  16. Testing Math or Testing Language? The Construct Validity of the KeyMath-Revised for Children With Intellectual Disability and Language Difficulties.

    PubMed

    Rhodes, Katherine T; Branum-Martin, Lee; Morris, Robin D; Romski, MaryAnn; Sevcik, Rose A

    2015-11-01

    Although it is often assumed that mathematics ability alone predicts mathematics test performance, linguistic demands may also predict achievement. This study examined the role of language in mathematics assessment performance for children with intellectual disability (ID) at less severe levels, on the KeyMath-Revised Inventory (KM-R) with a sample of 264 children, in grades 2-5. Using confirmatory factor analysis, the hypothesis that the KM-R would demonstrate discriminant validity with measures of language abilities in a two-factor model was compared to two plausible alternative models. Results indicated that KM-R did not have discriminant validity with measures of children's language abilities and was a multidimensional test of both mathematics and language abilities for this population of test users. Implications are considered for test development, interpretation, and intervention.

  17. Five-Kilometers Time Trial: Preliminary Validation of a Short Test for Cycling Performance Evaluation.

    PubMed

    Dantas, Jose Luiz; Pereira, Gleber; Nakamura, Fabio Yuzo

    2015-09-01

    The five-kilometer time trial (TT5km) has been used to assess aerobic endurance performance without further investigation of its validity. This study aimed to perform a preliminary validation of the TT5km to rank well-trained cyclists based on aerobic endurance fitness and assess changes of the aerobic endurance performance. After the incremental test, 20 cyclists (age = 31.3 ± 7.9 years; body mass index = 22.7 ± 1.5 kg/m(2); maximal aerobic power = 360.5 ± 49.5 W) performed the TT5km twice, collecting performance (time to complete, absolute and relative power output, average speed) and physiological responses (heart rate and electromyography activity). The validation criteria were pacing strategy, absolute and relative reliability, validity, and sensitivity. Sensitivity index was obtained from the ratio between the smallest worthwhile change and typical error. The TT5km showed high absolute (coefficient of variation < 3%) and relative (intraclass coefficient correlation > 0.95) reliability of performance variables, whereas it presented low reliability of physiological responses. The TT5km performance variables were highly correlated with the aerobic endurance indices obtained from incremental test (r > 0.70). These variables showed adequate sensitivity index (> 1). TT5km is a valid test to rank the aerobic endurance fitness of well-trained cyclists and to differentiate changes on aerobic endurance performance. Coaches can detect performance changes through either absolute (± 17.7 W) or relative power output (± 0.3 W.kg(-1)), the time to complete the test (± 13.4 s) and the average speed (± 1.0 km.h(-1)). Furthermore, TT5km performance can also be used to rank the athletes according to their aerobic endurance fitness.

  18. Development, test-retest reliability and validity of the Pharmacy Value-Added Services Questionnaire (PVASQ)

    PubMed Central

    Tan, Christine L.; Hassali, Mohamed A.; Saleem, Fahad; Shafie, Asrul A.; Aljadhey, Hisham; Gan, Vincent B.

    2015-01-01

    Objective: (i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of questionnaire instrument. Methods: Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administrating the questionnaire instrument twice at an interval of one week apart. Internal consistency was measured by Cronbach’s alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. Results: The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach’ s alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. Conclusions: This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients’ intention to adopt pharmacy value-added services to collect partial medicine

  19. Development, test-retest reliability and validity of the Pharmacy Value-Added Services Questionnaire (PVASQ).

    PubMed

    Tan, Christine L; Hassali, Mohamed A; Saleem, Fahad; Shafie, Asrul A; Aljadhey, Hisham; Gan, Vincent B

    2015-01-01

    (i) To develop the Pharmacy Value-Added Services Questionnaire (PVASQ) using emerging themes generated from interviews. (ii) To establish reliability and validity of questionnaire instrument. Using an extended Theory of Planned Behavior as the theoretical model, face-to-face interviews generated salient beliefs of pharmacy value-added services. The PVASQ was constructed initially in English incorporating important themes and later translated into the Malay language with forward and backward translation. Intention (INT) to adopt pharmacy value-added services is predicted by attitudes (ATT), subjective norms (SN), perceived behavioral control (PBC), knowledge and expectations. Using a 7-point Likert-type scale and a dichotomous scale, test-retest reliability (N=25) was assessed by administrating the questionnaire instrument twice at an interval of one week apart. Internal consistency was measured by Cronbach's alpha and construct validity between two administrations was assessed using the kappa statistic and the intraclass correlation coefficient (ICC). Confirmatory Factor Analysis, CFA (N=410) was conducted to assess construct validity of the PVASQ. The kappa coefficients indicate a moderate to almost perfect strength of agreement between test and retest. The ICC for all scales tested for intra-rater (test-retest) reliability was good. The overall Cronbach' s alpha (N=25) is 0.912 and 0.908 for the two time points. The result of CFA (N=410) showed most items loaded strongly and correctly into corresponding factors. Only one item was eliminated. This study is the first to develop and establish the reliability and validity of the Pharmacy Value-Added Services Questionnaire instrument using the Theory of Planned Behavior as the theoretical model. The translated Malay language version of PVASQ is reliable and valid to predict Malaysian patients' intention to adopt pharmacy value-added services to collect partial medicine supply.

  20. Validity of the Timed Up and Go Test as a Measure of Functional Mobility in Persons With Multiple Sclerosis.

    PubMed

    Sebastião, Emerson; Sandroff, Brian M; Learmonth, Yvonne C; Motl, Robert W

    2016-07-01

    To examine the validity of the timed Up and Go (TUG) test as a measure of functional mobility in persons with multiple sclerosis (MS) by using a comprehensive framework based on construct validity (ie, convergent and divergent validity). Cross-sectional study. Hospital setting. Community-residing persons with MS (N=47). Not applicable. Main outcome measures included the TUG test, timed 25-foot walk test, 6-minute walk test, Multiple Sclerosis Walking Scale-12, Late-Life Function and Disability Instrument, posturography evaluation, Activities-specific Balance Confidence scale, Symbol Digits Modalities Test, Expanded Disability Status Scale, and the number of steps taken per day. The TUG test was strongly associated with other valid outcome measures of ambulatory mobility (Spearman rank correlation, rs=.71-.90) and disability status (rs=.80), moderately to strongly associated with balance confidence (rs=.66), and weakly associated with postural control (ie, balance) (rs=.31). The TUG test was moderately associated with cognitive processing speed (rs=.59), but not associated with other nonambulatory measures (ie, Late-Life Function and Disability Instrument-upper extremity function). Our findings support the validity of the TUG test as a measure of functional mobility. This warrants its inclusion in patients' assessment alongside other valid measures of functional mobility in both clinical and research practice in persons with MS. Copyright © 2016 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  1. Building validation tools for knowledge-based systems

    NASA Technical Reports Server (NTRS)

    Stachowitz, R. A.; Chang, C. L.; Stock, T. S.; Combs, J. B.

    1987-01-01

    The Expert Systems Validation Associate (EVA), a validation system under development at the Lockheed Artificial Intelligence Center for more than a year, provides a wide range of validation tools to check the correctness, consistency and completeness of a knowledge-based system. A declarative meta-language (higher-order language), is used to create a generic version of EVA to validate applications written in arbitrary expert system shells. The architecture and functionality of EVA are presented. The functionality includes Structure Check, Logic Check, Extended Structure Check (using semantic information), Extended Logic Check, Semantic Check, Omission Check, Rule Refinement, Control Check, Test Case Generation, Error Localization, and Behavior Verification.

  2. Retesting The Validity Of A Specific Field Test For Judo Training

    PubMed Central

    Santos, Luis; González, Vicente; Iscar, Marta; Brime, Juan I.; Fernández-Río, Javier; Rodríguez, Blanca; Montoliu, Mª Ángeles

    2011-01-01

    The main goal of this research project was to retest the validity of a specifically designed judo field test (Santos Test) in a different group of judokas. Eight (n=8) national-level male judokas underwent laboratory and field testing. The mean data (mean +/− SD) obtained in the laboratory tests was: HRmax: 200 ± 4.0 beats × min−1, VO2 max: 52.8 ± 7.9 ± ml × kg−1 × min−1, lactate max: 12 ± 2.5 mmol × l−1, HR at the anaerobic threshold: 174.2 ± 9.4 beats × min−1, percentage of maximum heart rate at which the anaerobic threshold appears: 87 ± 3.6 %, lactate threshold: 4.0 ± 0.2 mmol × l−1, and RPE: 17.2 ± 1.0. The mean data obtained in the field test (Santos) was: HRmax: 201.3 ± 4.1 beats × min−1, VO2 max: 55.6 ± 5.8 ml × kg−1 × min−1, lactate max: 15.6 ± 2.8 mmol × l−1, HR at the anaerobic threshold: 173.2 ± 4.3 beats × min−1, percentage of maximum heart rate at which the anaerobic threshold appears: 86 ± 2.5 %, lactate threshold: 4.0 ± 0.2 mmol × l−1, and RPE: 16.7 ± 1.0. There were no significant differences between the data obtained on both tests in any of the parameters, except for maximum lactate concentration. Therefore, the Santos test can be considered a valid tool specific for judo training. PMID:23486994

  3. Reliability and validity of the test of incremental respiratory endurance measures of inspiratory muscle performance in COPD.

    PubMed

    Formiga, Magno F; Roach, Kathryn E; Vital, Isabel; Urdaneta, Gisel; Balestrini, Kira; Calderon-Candelario, Rafael A; Campos, Michael A; Cahalin, Lawrence P

    2018-01-01

    The Test of Incremental Respiratory Endurance (TIRE) provides a comprehensive assessment of inspiratory muscle performance by measuring maximal inspiratory pressure (MIP) over time. The integration of MIP over inspiratory duration (ID) provides the sustained maximal inspiratory pressure (SMIP). Evidence on the reliability and validity of these measurements in COPD is not currently available. Therefore, we assessed the reliability, responsiveness and construct validity of the TIRE measures of inspiratory muscle performance in subjects with COPD. Test-retest reliability, known-groups and convergent validity assessments were implemented simultaneously in 81 male subjects with mild to very severe COPD. TIRE measures were obtained using the portable PrO2 device, following standard guidelines. All TIRE measures were found to be highly reliable, with SMIP demonstrating the strongest test-retest reliability with a nearly perfect intraclass correlation coefficient (ICC) of 0.99, while MIP and ID clustered closely together behind SMIP with ICC values of about 0.97. Our findings also demonstrated known-groups validity of all TIRE measures, with SMIP and ID yielding larger effect sizes when compared to MIP in distinguishing between subjects of different COPD status. Finally, our analyses confirmed convergent validity for both SMIP and ID, but not MIP. The TIRE measures of MIP, SMIP and ID have excellent test-retest reliability and demonstrated known-groups validity in subjects with COPD. SMIP and ID also demonstrated evidence of moderate convergent validity and appear to be more stable measures in this patient population than the traditional MIP.

  4. Design and Validation of a Straight-Copy Typewriting Prognostic Test Using Kinesthetic Sensitivity.

    ERIC Educational Resources Information Center

    Olson, Norma Jean

    1979-01-01

    Describes the development and application of a kinesthetic sensitivity test to determine whether it is a valid and reliable measure of straight-copy typing speed and accuracy. The author states that this kinesthetic sensitivity instrument may be used as a prognostic aptitude test and recommends administration methods. (MF)

  5. Automation Hooks Architecture for Flexible Test Orchestration - Concept Development and Validation

    NASA Technical Reports Server (NTRS)

    Lansdowne, C. A.; Maclean, John R.; Winton, Chris; McCartney, Pat

    2011-01-01

    The Automation Hooks Architecture Trade Study for Flexible Test Orchestration sought a standardized data-driven alternative to conventional automated test programming interfaces. The study recommended composing the interface using multicast DNS (mDNS/SD) service discovery, Representational State Transfer (Restful) Web Services, and Automatic Test Markup Language (ATML). We describe additional efforts to rapidly mature the Automation Hooks Architecture candidate interface definition by validating it in a broad spectrum of applications. These activities have allowed us to further refine our concepts and provide observations directed toward objectives of economy, scalability, versatility, performance, severability, maintainability, scriptability and others.

  6. Using a Bedside Video-assisted Test Tube Test to Assess Stoma Viability: A Report of 4 Cases.

    PubMed

    Ahmad, Sarwat; Turner, Keli; Shah, Paulesh; Diaz, Jose

    2016-07-01

    Mucosal discoloration of an intestinal stoma may indicate self-limited venous congestion or necrosis necessitating operative revision. A common bedside technique to assess stoma viability is the "test tube test". A clear tube is inserted into the stoma and a hand-held light is used to assess the color of the stoma. A technique (video-assisted test tube test [VATTT]) developed by the authors utilizes a standard video bronchoscope inserted into a clear plastic blood collection tube to visually inspect and assess the mucosa. This technique was evaluated in 4 patients (age range 49-72 years, all critically ill) with a discolored stoma after emergency surgery. In each case, physical exam revealed ischemic mucosa at the surface either immediately after surgery or after worsening hypotension weeks later. Serial test tube test assessments were ambiguous when trying to assess deeper mucosa. The VATTT assessment showed viable pink mucosa beneath the surface and until the fascia was revealed in 3 patients. One (1) patient had mucosal ischemia down to the fascia, which prompted operative revision of the stoma. The new stoma was assessed with a VATTT and was viable for the entire length of the stoma. VATTT provided an enhanced, magnified, and clearer way to visually assess stoma viability in the postoperative period that can be performed at the bedside with no adverse events. It may prevent unnecessary relaparotomy or enable earlier diagnosis of deep ostomy necrosis. Validity and reliability studies are warranted.

  7. V-SUIT Model Validation Using PLSS 1.0 Test Results

    NASA Technical Reports Server (NTRS)

    Olthoff, Claas

    2015-01-01

    The dynamic portable life support system (PLSS) simulation software Virtual Space Suit (V-SUIT) has been under development at the Technische Universitat Munchen since 2011 as a spin-off from the Virtual Habitat (V-HAB) project. The MATLAB(trademark)-based V-SUIT simulates space suit portable life support systems and their interaction with a detailed and also dynamic human model, as well as the dynamic external environment of a space suit moving on a planetary surface. To demonstrate the feasibility of a large, system level simulation like V-SUIT, a model of NASA's PLSS 1.0 prototype was created. This prototype was run through an extensive series of tests in 2011. Since the test setup was heavily instrumented, it produced a wealth of data making it ideal for model validation. The implemented model includes all components of the PLSS in both the ventilation and thermal loops. The major components are modeled in greater detail, while smaller and ancillary components are low fidelity black box models. The major components include the Rapid Cycle Amine (RCA) CO2 removal system, the Primary and Secondary Oxygen Assembly (POS/SOA), the Pressure Garment System Volume Simulator (PGSVS), the Human Metabolic Simulator (HMS), the heat exchanger between the ventilation and thermal loops, the Space Suit Water Membrane Evaporator (SWME) and finally the Liquid Cooling Garment Simulator (LCGS). Using the created model, dynamic simulations were performed using same test points also used during PLSS 1.0 testing. The results of the simulation were then compared to the test data with special focus on absolute values during the steady state phases and dynamic behavior during the transition between test points. Quantified simulation results are presented that demonstrate which areas of the V-SUIT model are in need of further refinement and those that are sufficiently close to the test results. Finally, lessons learned from the modelling and validation process are given in combination

  8. Case finding of lifestyle and mental health disorders in primary care: validation of the ‘CHAT’ tool

    PubMed Central

    Goodyear-Smith, Felicity; Coupe, Nicole M; Arroll, Bruce; Elley, C Raina; Sullivan, Sean; McGill, Anne-Thea

    2008-01-01

    Background Primary care is accessible and ideally placed for case finding of patients with lifestyle and mental health risk factors and subsequent intervention. The short self-administered Case-finding and Help Assessment Tool (CHAT) was developed for lifestyle and mental health assessment of adult patients in primary health care. This tool checks for tobacco use, alcohol and other drug misuse, problem gambling, depression, anxiety and stress, abuse, anger problems, inactivity, and eating disorders. It is well accepted by patients, GPs and nurses. Aim To assess criterion-based validity of CHAT against a composite gold standard. Design of study Conducted according to the Standards for Reporting of Diagnostic Accuracy statement for diagnostic tests. Setting Primary care practices in Auckland, New Zealand. Method One thousand consecutive adult patients completed CHAT and a composite gold standard. Sensitivities, specificities, positive and negative predictive values, and likelihood ratios were calculated. Results Response rates for each item ranged from 79.6 to 99.8%. CHAT was sensitive and specific for almost all issues screened, except exercise and eating disorders. Sensitivity ranged from 96% (95% confidence interval [CI] = 87 to 99%) for major depression to 26% (95% CI = 22 to 30%) for exercise. Specificity ranged from 97% (95% CI = 96 to 98%) for problem gambling and problem drug use to 40% (95% CI = 36 to 45%) for exercise. All had high likelihood ratios (3–30), except exercise and eating disorders. Conclusion CHAT is a valid and acceptable case-finding tool for most common lifestyle and mental health conditions. PMID:18186993

  9. Validation of EncephalApp, Smartphone-based Stroop Test, for the Diagnosis of Covert Hepatic Encephalopathy

    PubMed Central

    Bajaj, Jasmohan S; Heuman, Douglas M; Sterling, Richard K; Sanyal, Arun J; Siddiqui, Muhammad; Matherly, Scott; Luketic, Velimir; Stravitz, R Todd; Fuchs, Michael; Thacker, Leroy R; Gilles, HoChong; White, Melanie B; Unser, Ariel; Hovermale, James; Gavis, Edith; Noble, Nicole A; Wade, James B

    2014-01-01

    Background & Aims Detection of covert hepatic encephalopathy (CHE) is difficult but point of care testing could increase rates of diagnosis. We aimed to validate the ability of the smartphone app EncephalApp, a streamlined version of Stroop App, to detect CHE. We evaluated face validity, test–retest reliability, and external validity. Methods Patients with cirrhosis (n=167; 38% with overt HE [OHE]; mean age, 55 years; mean model for end-stage liver disease score, 12) and controls (n=114) were each given a paper and pencil cognitive battery (standard) along with EncephalApp. EncephalApp has Off and On states; results measured were: OffTime, OnTime, OffTime+OnTime, and number of runs required to complete 5 off and on runs. Thirty-six patients with cirrhosis underwent driving simulation tests, and EncephalApp results were correlated with results. Test–retest reliability was analyzed in a subgroup of patients. The test was performed before and after transjugular intra-hepatic portosystemic shunt placement, before and after correction for hyponatremia, to determine external validity. Results All patients with cirrhosis performed worse on paper and pencil and EncephalApp tests than controls. Patients with cirrhosis and OHE performed worse than those without OHE. Age-dependent EncephalApp cut-offs (younger or older than 45 years) were set. An OffTime+OnTime value of >190 seconds identified all patients with CHE with an area under the receiver operator characteristic (AUROC) value of 0.91; the AUROC value was 0.88 for diagnosis of CHE in those without OHE. EncephalApp times correlated with crashes and illegal turns in driving simulation tests. Test–retest reliability was high (intra-class coefficient, 0.83) among 30 patients retested 1–3 months apart. OffTime+OnTime increased significantly (206 vs 255, P=.007) among 10 patients retested 33±7 days after transjugular intra-hepatic portosystemic shunt placement. OffTime+OnTime decreased significantly (242 vs 225, P

  10. Development and validation of surgical training tool: cystectomy assessment and surgical evaluation (CASE) for robot-assisted radical cystectomy for men.

    PubMed

    Hussein, Ahmed A; Sexton, Kevin J; May, Paul R; Meng, Maxwell V; Hosseini, Abolfazl; Eun, Daniel D; Daneshmand, Siamak; Bochner, Bernard H; Peabody, James O; Abaza, Ronney; Skinner, Eila C; Hautmann, Richard E; Guru, Khurshid A

    2018-04-13

    We aimed to develop a structured scoring tool: cystectomy assessment and surgical evaluation (CASE) that objectively measures and quantifies performance during robot-assisted radical cystectomy (RARC) for men. A multinational 10-surgeon expert panel collaborated towards development and validation of CASE. The critical steps of RARC in men were deconstructed into nine key domains, each assessed by five anchors. Content validation was done utilizing the Delphi methodology. Each anchor was assessed in terms of context, score concordance, and clarity. The content validity index (CVI) was calculated for each aspect. A CVI ≥ 0.75 represented consensus, and this statement was removed from the next round. This process was repeated until consensus was achieved for all statements. CASE was used to assess de-identified videos of RARC to determine reliability and construct validity. Linearly weighted percent agreement was used to assess inter-rater reliability (IRR). A logit model for odds ratio (OR) was used to assess construct validation. The expert panel reached consensus on CASE after four rounds. The final eight domains of the CASE included: pelvic lymph node dissection, development of the peri-ureteral space, lateral pelvic space, anterior rectal space, control of the vascular pedicle, anterior vesical space, control of the dorsal venous complex, and apical dissection. IRR > 0.6 was achieved for all eight domains. Experts outperformed trainees across all domains. We developed and validated a reliable structured, procedure-specific tool for objective evaluation of surgical performance during RARC. CASE may help differentiate novice from expert performances.

  11. Evaluating instruments for quality: testing convergent validity of the consumer emergency care satisfaction scale.

    PubMed

    Davis, Barbara A; Kiesel, Cynthia K; McFarland, Julie; Collard, Adressa; Coston, Kyle; Keeton, Ada

    2005-01-01

    Having reliable and valid instruments is a necessity for nurses and others measuring concepts such as patient satisfaction. The purpose of this article is to describe the use of convergence to test the construct validity of the Davis Consumer Emergency Care Satisfaction Scale (CECSS). Results indicate convergence of the CECSS with the Risser Patient Satisfaction Scale and 2 single-item visual analogue scales, therefore supporting construct validity. Persons measuring patient satisfaction with nurse behaviors in the emergency department can confidently use the CECSS.

  12. Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations.

    PubMed

    Brubaker, Chad; Jana, Suman; Ray, Baishakhi; Khurshid, Sarfraz; Shmatikov, Vitaly

    2014-01-01

    Modern network security rests on the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. Distributed systems, mobile and desktop applications, embedded devices, and all of secure Web rely on SSL/TLS for protection against network attacks. This protection critically depends on whether SSL/TLS clients correctly validate X.509 certificates presented by servers during the SSL/TLS handshake protocol. We design, implement, and apply the first methodology for large-scale testing of certificate validation logic in SSL/TLS implementations. Our first ingredient is "frankencerts," synthetic certificates that are randomly mutated from parts of real certificates and thus include unusual combinations of extensions and constraints. Our second ingredient is differential testing: if one SSL/TLS implementation accepts a certificate while another rejects the same certificate, we use the discrepancy as an oracle for finding flaws in individual implementations. Differential testing with frankencerts uncovered 208 discrepancies between popular SSL/TLS implementations such as OpenSSL, NSS, CyaSSL, GnuTLS, PolarSSL, MatrixSSL, etc. Many of them are caused by serious security vulnerabilities. For example, any server with a valid X.509 version 1 certificate can act as a rogue certificate authority and issue fake certificates for any domain, enabling man-in-the-middle attacks against MatrixSSL and GnuTLS. Several implementations also accept certificate authorities created by unauthorized issuers, as well as certificates not intended for server authentication. We also found serious vulnerabilities in how users are warned about certificate validation errors. When presented with an expired, self-signed certificate, NSS, Safari, and Chrome (on Linux) report that the certificate has expired-a low-risk, often ignored error-but not that the connection is insecure against a man-in-the-middle attack. These results demonstrate that automated adversarial testing with frankencerts

  13. Using Frankencerts for Automated Adversarial Testing of Certificate Validation in SSL/TLS Implementations

    PubMed Central

    Brubaker, Chad; Jana, Suman; Ray, Baishakhi; Khurshid, Sarfraz; Shmatikov, Vitaly

    2014-01-01

    Modern network security rests on the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols. Distributed systems, mobile and desktop applications, embedded devices, and all of secure Web rely on SSL/TLS for protection against network attacks. This protection critically depends on whether SSL/TLS clients correctly validate X.509 certificates presented by servers during the SSL/TLS handshake protocol. We design, implement, and apply the first methodology for large-scale testing of certificate validation logic in SSL/TLS implementations. Our first ingredient is “frankencerts,” synthetic certificates that are randomly mutated from parts of real certificates and thus include unusual combinations of extensions and constraints. Our second ingredient is differential testing: if one SSL/TLS implementation accepts a certificate while another rejects the same certificate, we use the discrepancy as an oracle for finding flaws in individual implementations. Differential testing with frankencerts uncovered 208 discrepancies between popular SSL/TLS implementations such as OpenSSL, NSS, CyaSSL, GnuTLS, PolarSSL, MatrixSSL, etc. Many of them are caused by serious security vulnerabilities. For example, any server with a valid X.509 version 1 certificate can act as a rogue certificate authority and issue fake certificates for any domain, enabling man-in-the-middle attacks against MatrixSSL and GnuTLS. Several implementations also accept certificate authorities created by unauthorized issuers, as well as certificates not intended for server authentication. We also found serious vulnerabilities in how users are warned about certificate validation errors. When presented with an expired, self-signed certificate, NSS, Safari, and Chrome (on Linux) report that the certificate has expired—a low-risk, often ignored error—but not that the connection is insecure against a man-in-the-middle attack. These results demonstrate that automated adversarial testing with

  14. Criterion validation of two submaximal aerobic fitness tests, the self-monitoring Fox-walk test and the Åstrand cycle test in people with rheumatoid arthritis.

    PubMed

    Nordgren, Birgitta; Fridén, Cecilia; Jansson, Eva; Österlund, Ted; Grooten, Wilhelmus Johannes; Opava, Christina H; Rickenlund, Anette

    2014-09-17

    Aerobic capacity tests are important to evaluate exercise programs and to encourage individuals to have a physically active lifestyle. Submaximal tests, if proven valid and reliable could be used for estimation of maximal oxygen uptake (VO2max). The purpose of the study was to examine the criterion-validity of the submaximal self-monitoring Fox-walk test and the submaximal Åstrand cycle test against a maximal cycle test in people with rheumatoid arthritis (RA). A secondary aim was to study the influence of different formulas for age predicted maximal heart rate when estimating VO2max by the Åstrand test. Twenty seven subjects (81% female), mean (SD) age 62 (8.1) years, diagnosed with RA since 17.9 (11.7) years, participated in the study. They performed the Fox-walk test (775 meters), the Åstrand test and the maximal cycle test (measured VO2max test). Pearson's correlation coefficients were calculated to determine the direction and strength of the association between the tests, and paired t-tests were used to test potential differences between the tests. Bland and Altman methods were used to assess whether there was any systematic disagreement between the submaximal tests and the maximal test. The correlation between the estimated and measured VO2max values were strong and ranged between r = 0.52 and r = 0.82 including the use of different formulas for age predicted maximal heart rate, when estimating VO2max by the Åstrand test. VO2max was overestimated by 30% by the Fox-walk test and underestimated by 10% by the Åstrand test corrected for age. When the different formulas for age predicted maximal heart rate were used, the results showed that two formulas better predicted maximal heart rate and consequently a more precise estimation of VO2max. Despite the fact that the Fox-walk test overestimated VO2max substantially, the test is a promising method for self-monitoring VO2max and further development of the test is encouraged. The Åstrand test should be

  15. Utility of NIST Whole-Genome Reference Materials for the Technical Validation of a Multigene Next-Generation Sequencing Test.

    PubMed

    Shum, Bennett O V; Henner, Ilya; Belluoccio, Daniele; Hinchcliffe, Marcus J

    2017-07-01

    The sensitivity and specificity of next-generation sequencing laboratory developed tests (LDTs) are typically determined by an analyte-specific approach. Analyte-specific validations use disease-specific controls to assess an LDT's ability to detect known pathogenic variants. Alternatively, a methods-based approach can be used for LDT technical validations. Methods-focused validations do not use disease-specific controls but use benchmark reference DNA that contains known variants (benign, variants of unknown significance, and pathogenic) to assess variant calling accuracy of a next-generation sequencing workflow. Recently, four whole-genome reference materials (RMs) from the National Institute of Standards and Technology (NIST) were released to standardize methods-based validations of next-generation sequencing panels across laboratories. We provide a practical method for using NIST RMs to validate multigene panels. We analyzed the utility of RMs in validating a novel newborn screening test that targets 70 genes, called NEO1. Despite the NIST RM variant truth set originating from multiple sequencing platforms, replicates, and library types, we discovered a 5.2% false-negative variant detection rate in the RM truth set genes that were assessed in our validation. We developed a strategy using complementary non-RM controls to demonstrate 99.6% sensitivity of the NEO1 test in detecting variants. Our findings have implications for laboratories or proficiency testing organizations using whole-genome NIST RMs for testing. Copyright © 2017 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.

  16. How'd they do it? Malingering strategies on symptom validity tests.

    PubMed

    Tan, Jing Ee; Slick, Daniel J; Strauss, Esther; Hultsch, David F

    2002-12-01

    Twenty-five undergraduate students were instructed to feign believable impairment following a brain injury from a car accident and 27 students were told to perform like they had recovered from such an injury. Three forced-choice tests, the Test of Memory Malingering (TOMM), Victoria Symptom Validity Test (VSVT), and Word Memory Test (WMT) were given. Test-taking strategies were evaluated by means of a questionnaire given at the end of the test session. The results revealed that all the tasks differentiated between groups. Using conventional cut-scores, the WMT proved most efficient while the VSVT captured the most participants in the definitive below-chance category. Individuals instructed to feign injury were more likely to prepare prior to the experiment, with feigning of memory loss as the most frequently reported strategy. Regardless, preparation effort did not translate into believable performance on the tests.

  17. Validity and Reliability Study of the Korean Tinetti Mobility Test for Parkinson's Disease.

    PubMed

    Park, Jinse; Koh, Seong-Beom; Kim, Hee Jin; Oh, Eungseok; Kim, Joong-Seok; Yun, Ji Young; Kwon, Do-Young; Kim, Younsoo; Kim, Ji Seon; Kwon, Kyum-Yil; Park, Jeong-Ho; Youn, Jinyoung; Jang, Wooyoung

    2018-01-01

    Postural instability and gait disturbance are the cardinal symptoms associated with falling among patients with Parkinson's disease (PD). The Tinetti mobility test (TMT) is a well-established measurement tool used to predict falls among elderly people. However, the TMT has not been established or widely used among PD patients in Korea. The purpose of this study was to evaluate the reliability and validity of the Korean version of the TMT for PD patients. Twenty-four patients diagnosed with PD were enrolled in this study. For the interrater reliability test, thirteen clinicians scored the TMT after watching a video clip. We also used the test-retest method to determine intrarater reliability. For concurrent validation, the unified Parkinson's disease rating scale, Hoehn and Yahr staging, Berg Balance Scale, Timed-Up and Go test, 10-m walk test, and gait analysis by three-dimensional motion capture were also used. We analyzed receiver operating characteristic curve to predict falling. The interrater reliability and intrarater reliability of the Korean Tinetti balance scale were 0.97 and 0.98, respectively. The interrater reliability and intra-rater reliability of the Korean Tinetti gait scale were 0.94 and 0.96, respectively. The Korean TMT scores were significantly correlated with the other clinical scales and three-dimensional motion capture. The cutoff values for predicting falling were 14 points (balance subscale) and 10 points (gait subscale). We found that the Korean version of the TMT showed excellent validity and reliability for gait and balance and had high sensitivity and specificity for predicting falls among patients with PD.

  18. Evolving the Principles and Practice of Validation for New Alternative Approaches to Toxicity Testing.

    PubMed

    Whelan, Maurice; Eskes, Chantra

    Validation is essential for the translation of newly developed alternative approaches to animal testing into tools and solutions suitable for regulatory applications. Formal approaches to validation have emerged over the past 20 years or so and although they have helped greatly to progress the field, it is essential that the principles and practice underpinning validation continue to evolve to keep pace with scientific progress. The modular approach to validation should be exploited to encourage more innovation and flexibility in study design and to increase efficiency in filling data gaps. With the focus now on integrated approaches to testing and assessment that are based on toxicological knowledge captured as adverse outcome pathways, and which incorporate the latest in vitro and computational methods, validation needs to adapt to ensure it adds value rather than hinders progress. Validation needs to be pursued both at the method level, to characterise the performance of in vitro methods in relation their ability to detect any association of a chemical with a particular pathway or key toxicological event, and at the methodological level, to assess how integrated approaches can predict toxicological endpoints relevant for regulatory decision making. To facilitate this, more emphasis needs to be given to the development of performance standards that can be applied to classes of methods and integrated approaches that provide similar information. Moreover, the challenge of selecting the right reference chemicals to support validation needs to be addressed more systematically, consistently and in a manner that better reflects the state of the science. Above all however, validation requires true partnership between the development and user communities of alternative methods and the appropriate investment of resources.

  19. 1:50 Scale Testing of Three Floating Wind Turbines at MARIN and Numerical Model Validation Against Test Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dagher, Habib; Viselli, Anthony; Goupee, Andrew

    The primary goal of the basin model test program discussed herein is to properly scale and accurately capture physical data of the rigid body motions, accelerations and loads for different floating wind turbine platform technologies. The intended use for this data is for performing comparisons with predictions from various aero-hydro-servo-elastic floating wind turbine simulators for calibration and validation. Of particular interest is validating the floating offshore wind turbine simulation capabilities of NREL’s FAST open-source simulation tool. Once the validation process is complete, coupled simulators such as FAST can be used with a much greater degree of confidence in design processesmore » for commercial development of floating offshore wind turbines. The test program subsequently described in this report was performed at MARIN (Maritime Research Institute Netherlands) in Wageningen, the Netherlands. The models considered consisted of the horizontal axis, NREL 5 MW Reference Wind Turbine (Jonkman et al., 2009) with a flexible tower affixed atop three distinct platforms: a tension leg platform (TLP), a spar-buoy modeled after the OC3 Hywind (Jonkman, 2010) and a semi-submersible. The three generic platform designs were intended to cover the spectrum of currently investigated concepts, each based on proven floating offshore structure technology. The models were tested under Froude scale wind and wave loads. The high-quality wind environments, unique to these tests, were realized in the offshore basin via a novel wind machine which exhibits negligible swirl and low turbulence intensity in the flow field. Recorded data from the floating wind turbine models included rotor torque and position, tower top and base forces and moments, mooring line tensions, six-axis platform motions and accelerations at key locations on the nacelle, tower, and platform. A large number of tests were performed ranging from simple free-decay tests to complex operating conditions

  20. An Examination of Reliability and Validity Claims of a Foreign Language Proficiency Test

    ERIC Educational Resources Information Center

    Mircea-Pines, Walter J.

    2009-01-01

    This dissertation study examined the reliability and validity claims of a modified version of the Spanish Modern Language Association Foreign Language Proficiency Test for Teachers and Advanced Students administered at George Mason University (GMU). The study used the 1999 computerized GMU version that was administered to 277 test-takers via…